bead.config

Configuration system: didactic models for TOML/YAML pipeline orchestration.

All configuration modules are documented here. See the Configuration Guide for usage examples.

config

Configuration system for the bead pipeline.

Provides configuration models, default settings, and named profiles for development, testing, and production environments.

DEFAULT_CONFIG = BeadConfig(profile='default', paths=(PathsConfig()), resources=(ResourceConfig()), templates=(TemplateConfig()), items=(ItemConfig()), lists=(ListConfig()), deployment=(DeploymentConfig()), active_learning=(ActiveLearningConfig()), logging=(LoggingConfig())) module-attribute

The default configuration instance.

DEV_CONFIG = BeadConfig(profile='dev', paths=(PathsConfig(data_dir=(Path('data')), output_dir=(Path('output')), cache_dir=(Path('.cache')), temp_dir=(Path(gettempdir()) / 'bead_dev'), create_dirs=True)), resources=(ResourceConfig(cache_external=False)), templates=(TemplateConfig(filling_strategy='exhaustive', batch_size=100, stream_mode=False)), items=(ItemConfig(model=(ModelConfig(provider='huggingface', model_name='gpt2', batch_size=4, device='cpu')), parallel_processing=False)), lists=(ListConfig(num_lists=1)), deployment=(DeploymentConfig()), active_learning=(ActiveLearningConfig(forced_choice_model=(ForcedChoiceModelConfig(num_epochs=1, batch_size=8, learning_rate=2e-05)), trainer=(TrainerConfig(epochs=1)))), logging=(LoggingConfig(level='DEBUG', console=True))) module-attribute

Development configuration profile.

PROD_CONFIG = BeadConfig(profile='prod', paths=(PathsConfig(data_dir=(Path('/var/bead/data').absolute()), output_dir=(Path('/var/bead/output').absolute()), cache_dir=(Path('/var/bead/cache').absolute()), temp_dir=(Path('/var/bead/temp').absolute()), create_dirs=True)), resources=(ResourceConfig(cache_external=True)), templates=(TemplateConfig(filling_strategy='exhaustive', batch_size=10000, stream_mode=True)), items=(ItemConfig(model=(ModelConfig(provider='huggingface', model_name='gpt2', batch_size=32, device='cuda')), parallel_processing=True, num_workers=8)), lists=(ListConfig(num_lists=1)), deployment=(DeploymentConfig(apply_material_design=True, include_demographics=True, include_attention_checks=True)), active_learning=(ActiveLearningConfig(forced_choice_model=(ForcedChoiceModelConfig(num_epochs=10, batch_size=32, learning_rate=2e-05)), trainer=(TrainerConfig(epochs=10, use_wandb=True)))), logging=(LoggingConfig(level='WARNING', console=False, file=(Path('/var/log/bead/app.log'))))) module-attribute

Production configuration profile.

PROFILES: dict[str, BeadConfig] = {'default': BeadConfig(), 'dev': DEV_CONFIG, 'prod': PROD_CONFIG, 'test': TEST_CONFIG} module-attribute

Registry of all available configuration profiles.

RealizationKind = Literal['template', 'contextual', 'lm'] module-attribute

Discriminator for which realization strategy a family uses.

TEST_CONFIG = BeadConfig(profile='test', paths=(PathsConfig(data_dir=(Path(gettempdir()) / 'bead_test' / 'data'), output_dir=(Path(gettempdir()) / 'bead_test' / 'output'), cache_dir=(Path(gettempdir()) / 'bead_test' / 'cache'), temp_dir=(Path(gettempdir()) / 'bead_test' / 'temp'), create_dirs=True)), resources=(ResourceConfig(cache_external=False)), templates=(TemplateConfig(filling_strategy='exhaustive', batch_size=10, max_combinations=100, random_seed=42)), items=(ItemConfig(model=(ModelConfig(provider='huggingface', model_name='gpt2', batch_size=1, device='cpu')), parallel_processing=False, num_workers=1)), lists=(ListConfig(num_lists=1, random_seed=42)), deployment=(DeploymentConfig(apply_material_design=False, include_demographics=False, include_attention_checks=False)), active_learning=(ActiveLearningConfig(forced_choice_model=(ForcedChoiceModelConfig(num_epochs=1, batch_size=2, learning_rate=2e-05)), trainer=(TrainerConfig(epochs=1, use_wandb=False)))), logging=(LoggingConfig(level='CRITICAL', console=False))) module-attribute

Test configuration profile.

ActiveLearningConfig

Bases: Model

Configuration for the active-learning subsystem.

Attributes:

Name Type Description
forced_choice_model ForcedChoiceModelConfig

Forced-choice model configuration.

trainer TrainerConfig

Trainer configuration.

loop ActiveLearningLoopConfig

Active-learning loop configuration.

uncertainty_sampler UncertaintySamplerConfig

Uncertainty sampler configuration.

AnchorSpec

Bases: BeadBaseModel

Declarative form of SemanticAnchor for config files.

Pole labels are flattened to two string fields rather than a nested SemanticPoles; build() constructs the embedded model.

Attributes:

Name Type Description
name str

Short identifier.

target_property str

The property being measured.

canonical_prompt str

Reference phrasing.

options tuple[str, ...]

Ordered response options.

is_ordered bool

Whether the response space is ordinal. Defaults to True.

semantic_pole_low str | None

Low-pole label, when ordered. Defaults to None.

semantic_pole_high str | None

High-pole label, when ordered. Defaults to None.

required_span_labels frozenset[str]

Span labels every realization must reference. Defaults to the empty set.

required_keywords frozenset[str]

Keywords every realization must contain. Defaults to the empty set.

embedding_center tuple[float, ...] | None

Pre-computed canonical-prompt embedding. Defaults to None.

max_drift float

Maximum cosine distance for embedding drift. Defaults to 0.3.

description str

Human-readable description.

build() -> SemanticAnchor

Build a SemanticAnchor from this spec.

Returns:

Type Description
SemanticAnchor

Live anchor.

Raises:

Type Description
ValueError

If exactly one of semantic_pole_low and semantic_pole_high is supplied.
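The pairing rule behind this ValueError can be sketched as follows. This is a minimal illustration, not the library's implementation: check_pole_labels is a hypothetical helper, and the error message is invented for the example.

```python
from __future__ import annotations


def check_pole_labels(low: str | None, high: str | None) -> tuple[str, str] | None:
    """Return the (low, high) pair, or None when no poles are declared.

    Hypothetical helper: it mirrors the documented rule that supplying
    exactly one of the two pole labels is an error.
    """
    # True when exactly one of the two labels is missing.
    if (low is None) != (high is None):
        raise ValueError(
            "semantic_pole_low and semantic_pole_high must be supplied together"
        )
    if low is None or high is None:
        return None
    return (low, high)
```

Supplying both labels or neither passes the check; supplying only one raises.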

BeadConfig

Bases: Model

Main configuration for the bead package.

Attributes:

Name Type Description
profile str

Configuration profile name.

paths PathsConfig

Paths configuration.

resources ResourceConfig

Resources configuration.

templates TemplateConfig

Templates configuration.

items ItemConfig

Items configuration.

lists ListConfig

Lists configuration.

deployment DeploymentConfig

Deployment configuration.

active_learning ActiveLearningConfig

Active learning configuration.

logging LoggingConfig

Logging configuration.

protocol ProtocolConfig

Annotation-protocol configuration.

to_dict() -> dict[str, Any]

Render the configuration as a plain dict.

to_yaml() -> str

Render the configuration as a YAML string.

validate_paths() -> list[str]

Return any path-related validation errors.

An empty list means every required path either exists or is a relative path (in which case the caller is responsible for creating it).
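A rough sketch of that rule, assuming only absolute paths are checked for existence. The function name path_errors, the error wording, and the choice of which paths count as required are all assumptions made for illustration.

```python
from __future__ import annotations

from pathlib import Path


def path_errors(paths: dict[str, Path]) -> list[str]:
    """Collect one message per absolute path that does not exist.

    Hypothetical stand-in for the documented behaviour of validate_paths().
    """
    errors: list[str] = []
    for name, path in paths.items():
        # Relative paths are skipped: creating them is left to the caller.
        if path.is_absolute() and not path.exists():
            errors.append(f"{name} does not exist: {path}")
    return errors
```

Under this reading, a configuration made only of relative paths always yields an empty list.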

DeploymentConfig

Bases: Model

Configuration for experiment deployment.

Attributes:

Name Type Description
platform str

Deployment platform.

jspsych_version str

jsPsych version.

apply_material_design bool

Use Material Design styling.

include_demographics bool

Include a demographics survey.

include_attention_checks bool

Include attention checks.

jatos_export bool

Export to JATOS.

distribution_strategy ListDistributionStrategy

List distribution strategy for batch experiments.

DriftConfig

Bases: BeadBaseModel

Configuration for the drift guard applied to a protocol.

Every realized prompt runs through one shared bead.protocol.DriftGuard configured by this section.

Attributes:

Name Type Description
min_length int

Minimum non-whitespace length for the structural validator. Defaults to 15.

require_question_mark bool

Whether a trailing ? is required. Defaults to True.

keyword_case_sensitive bool

Whether structural keyword checks are case-sensitive. Defaults to False.

embedding_max_distance float | None

Cosine-distance ceiling for the embedding validator. None defers to each anchor's max_drift. Defaults to None.

enable_embedding bool

Whether to add an EmbeddingDriftValidator. Requires an embedding adapter at build time. Defaults to False.

enable_perplexity bool

Whether to add a PerplexityDriftValidator. Requires a perplexity adapter at build time. Defaults to False.

max_perplexity float

Perplexity ceiling for the perplexity validator. Defaults to 100.0.

build(*, embedding_adapter: EmbeddingAdapter | None = None, perplexity_adapter: PerplexityAdapter | None = None) -> DriftGuard

Build a DriftGuard with the structural checks plus any enabled optional checks.

Parameters:

Name Type Description Default
embedding_adapter EmbeddingAdapter | None

Required when enable_embedding is True. Defaults to None.

None
perplexity_adapter PerplexityAdapter | None

Required when enable_perplexity is True. Defaults to None.

None

Returns:

Type Description
DriftGuard

Live composite drift validator.

Raises:

Type Description
ValueError

If a validator is enabled but its adapter was not supplied.
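The relationship between the enable_* flags and the adapters passed at build time might look like the sketch below. DriftSettings and enabled_validators are hypothetical stand-ins, and adapters are typed as plain objects because their real interfaces are not described here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DriftSettings:
    """Hypothetical stand-in for the two enable_* flags of DriftConfig."""

    enable_embedding: bool = False
    enable_perplexity: bool = False


def enabled_validators(
    settings: DriftSettings,
    *,
    embedding_adapter: object | None = None,
    perplexity_adapter: object | None = None,
) -> list[str]:
    """Name the checks a guard would run, refusing to build without adapters."""
    # The structural check needs no adapter and is always present.
    names = ["structural"]
    if settings.enable_embedding:
        if embedding_adapter is None:
            raise ValueError("enable_embedding is True but no embedding_adapter was supplied")
        names.append("embedding")
    if settings.enable_perplexity:
        if perplexity_adapter is None:
            raise ValueError("enable_perplexity is True but no perplexity_adapter was supplied")
        names.append("perplexity")
    return names
```

With default settings the result is a structural-only guard, matching the documented default.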

FamilySpec

Bases: BeadBaseModel

Declarative form of QuestionFamily for config files.

Attributes:

Name Type Description
anchor AnchorSpec

The anchor declaration. Built into a SemanticAnchor at build time.

realization_kind RealizationKind

Which realization strategy to use.

template str | None

Used when realization_kind="template". None defers to the anchor's canonical prompt.

variants tuple[TemplateVariantSpec, ...]

Used when realization_kind="contextual". Empty tuple is invalid for that kind.

fallback str | None

Fallback template used when no variant matches. None defers to the anchor's canonical prompt.

condition_name str

Registered predicate name controlling family applicability. Defaults to "always".

depends_on tuple[str, ...]

Names of anchors whose responses must precede this family in the protocol. Defaults to the empty tuple.

fallback_on_drift bool

Whether to fall back to the canonical prompt on drift failure. Defaults to True.

build(*, drift_guard: DriftGuard, lm_client: LMClient | None, lm_model_name: str, cache: ModelOutputCache | None, lm_temperature: float, lm_max_tokens: int) -> QuestionFamily

Build a QuestionFamily from this spec.

Parameters:

Name Type Description Default
drift_guard DriftGuard

Shared drift guard for the protocol.

required
lm_client LMClient | None

LM backend; required when realization_kind == "lm".

required
lm_model_name str

Cache-key prefix for LM realizations.

required
cache ModelOutputCache | None

Output cache for LM realizations.

required
lm_temperature float

Sampling temperature for LM realizations.

required
lm_max_tokens int

Maximum response length for LM realizations.

required

Returns:

Type Description
QuestionFamily

Live family.
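For a family with realization_kind="contextual", the attribute descriptions above imply a selection order: variants are tried from highest priority down, then the fallback template, then the anchor's canonical prompt. The sketch below illustrates that order under stated assumptions: Variant, pick_template, and the mapping-based context are hypothetical, and tie-breaking by declaration order is a guess.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Mapping, Sequence

# Hypothetical context type: the real context object is not described here.
Context = Mapping[str, object]


@dataclass(frozen=True)
class Variant:
    """Hypothetical stand-in for a built template variant."""

    template: str
    condition: Callable[[Context], bool]
    priority: int = 0


def pick_template(
    variants: Sequence[Variant],
    context: Context,
    *,
    fallback: str | None,
    canonical_prompt: str,
) -> str:
    """Return the first matching variant's template, else a fallback."""
    # sorted() is stable, so equal priorities keep their declared order.
    for variant in sorted(variants, key=lambda v: v.priority, reverse=True):
        if variant.condition(context):
            return variant.template
    # No variant matched: use the fallback, or the canonical prompt if unset.
    return fallback if fallback is not None else canonical_prompt
```

Drift checking and the fallback_on_drift behaviour are left out of this sketch.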

ItemConfig

Bases: Model

Configuration for item generation.

Attributes:

Name Type Description
model ModelConfig

Model configuration.

apply_constraints bool

Whether to apply model-based constraints.

track_metadata bool

Whether to track item metadata.

parallel_processing bool

Whether to use parallel processing.

num_workers int

Number of workers for parallel processing (must be > 0).

ListConfig

Bases: Model

Configuration for list partitioning.

Attributes:

Name Type Description
partitioning_strategy str

Strategy name.

num_lists int

Number of lists to create.

items_per_list int | None

Items per list.

balance_by tuple[str, ...]

Fields to balance on.

ensure_uniqueness bool

Whether items must be unique across lists.

random_seed int | None

Random seed for reproducibility.

batch_constraints tuple[BatchConstraintConfig, ...] | None

Batch-level constraints to apply across all lists.

LoggingConfig

Bases: Model

Configuration for logging.

Attributes:

Name Type Description
level Literal['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL']

Log level.

format str

Log format string.

file Path | None

Log file path.

console bool

Whether to log to console.

ModelConfig

Bases: Model

Configuration for language models.

Attributes:

Name Type Description
provider Literal['huggingface', 'openai', 'anthropic']

Model provider.

model_name str

Model identifier.

batch_size int

Inference batch size (must be > 0).

device Literal['cpu', 'cuda', 'mps']

Device to use for computation.

max_length int

Maximum sequence length (must be > 0).

temperature float

Sampling temperature (must be >= 0).

cache_outputs bool

Whether to cache model outputs.

PathsConfig

Bases: Model

Configuration for filesystem paths.

Attributes:

Name Type Description
data_dir Path

Base directory for data files.

output_dir Path

Base directory for outputs.

cache_dir Path

Cache directory.

temp_dir Path | None

Temporary directory; None defers to the system default.

create_dirs bool

Whether to create directories if they don't exist.

ProtocolConfig

Bases: BeadBaseModel

Top-level annotation-protocol stage configuration.

Plugs into bead.config.config.BeadConfig as the protocol field. Declares the families, drift settings, and LM defaults for an annotation protocol that can be loaded from YAML or TOML and materialized via build().

Attributes:

Name Type Description
name str

Descriptive protocol name. Defaults to empty.

families tuple[FamilySpec, ...]

Declarative family specs in protocol order. Defaults to the empty tuple.

drift DriftConfig

Drift-guard configuration shared by all families. Defaults to a structural-only guard with the standard defaults.

lm_model_name str

Cache-key prefix for LM realizations. Used when any family has realization_kind="lm". Defaults to empty (forces the caller to set it explicitly when LM realizations are used).

lm_temperature float

Default sampling temperature for LM realizations. Defaults to 0.3.

lm_max_tokens int

Default maximum response length for LM realizations. Defaults to 200.

build(*, lm_client: LMClient | None = None, cache: ModelOutputCache | None = None, embedding_adapter: EmbeddingAdapter | None = None, perplexity_adapter: PerplexityAdapter | None = None) -> AnnotationProtocol

Materialize the configured protocol.

Parameters:

Name Type Description Default
lm_client LMClient | None

LM backend, required if any family declares realization_kind="lm". Defaults to None.

None
cache ModelOutputCache | None

Output cache for LM realizations. Defaults to None.

None
embedding_adapter EmbeddingAdapter | None

Required when drift.enable_embedding=True. Defaults to None.

None
perplexity_adapter PerplexityAdapter | None

Required when drift.enable_perplexity=True. Defaults to None.

None

Returns:

Type Description
AnnotationProtocol

Live protocol with every family materialized in declared order.

Raises:

Type Description
ValueError

If a required runtime dependency is not supplied or a family declares an unknown realization kind.
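To make the declarative shape concrete, here is a sketch of a protocol section as it might look once a YAML or TOML file has been parsed into a plain dict, with a check for the runtime dependencies that build() documents as required. The field names come from the attribute tables above; the values, the helper name missing_runtime_dependencies, and returning a list instead of raising are illustrative assumptions.

```python
from __future__ import annotations

from typing import Any

# Illustrative values only; the keys mirror the documented fields.
protocol: dict[str, Any] = {
    "name": "plausibility",
    "lm_model_name": "",
    "lm_temperature": 0.3,
    "lm_max_tokens": 200,
    "drift": {"enable_embedding": False, "enable_perplexity": False},
    "families": [
        {
            "realization_kind": "template",
            "anchor": {
                "name": "plausibility",
                "target_property": "plausibility",
                "canonical_prompt": "How plausible is this sentence?",
                "options": ["implausible", "neutral", "plausible"],
            },
        },
    ],
}


def missing_runtime_dependencies(
    protocol: dict[str, Any],
    *,
    lm_client: object | None = None,
    embedding_adapter: object | None = None,
    perplexity_adapter: object | None = None,
) -> list[str]:
    """List the build() arguments this protocol needs but did not receive."""
    missing: list[str] = []
    kinds = {family["realization_kind"] for family in protocol.get("families", [])}
    if "lm" in kinds and lm_client is None:
        missing.append("lm_client")
    drift = protocol.get("drift", {})
    if drift.get("enable_embedding", False) and embedding_adapter is None:
        missing.append("embedding_adapter")
    if drift.get("enable_perplexity", False) and perplexity_adapter is None:
        missing.append("perplexity_adapter")
    return missing
```

A template-only protocol with the default drift settings needs no runtime arguments at all, which is why every build() parameter defaults to None.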

ResourceConfig

Bases: Model

Configuration for external resources.

Attributes:

Name Type Description
lexicon_path Path | None

Path to lexicon file.

templates_path Path | None

Path to templates file.

constraints_path Path | None

Path to constraints file.

external_adapters tuple[str, ...]

External adapters to enable.

cache_external bool

Whether to cache external resource lookups.

SlotStrategyConfig

Bases: Model

Configuration for a single slot's filling strategy.

Attributes:

Name Type Description
strategy Literal['exhaustive', 'random', 'stratified', 'mlm']

Filling strategy.

sample_size int | None

Sample size for random or stratified strategies.

stratify_by str | None

Feature name to stratify by (stratified strategy only).

beam_size int | None

Beam size for MLM strategy.

TemplateConfig

Bases: Model

Configuration for template filling.

Attributes:

Name Type Description
filling_strategy Literal['exhaustive', 'random', 'stratified', 'mlm', 'mixed']

Strategy for filling templates.

batch_size int

Batch size for filling operations (must be > 0).

max_combinations int | None

Maximum combinations to generate.

random_seed int | None

Random seed for reproducibility.

stream_mode bool

Use streaming for large templates.

use_csp_solver bool

Use CSP solver for multi-slot constraints.

mlm_model_name str | None

HuggingFace model name for MLM filling.

mlm_beam_size int

Beam search width for MLM (> 0).

mlm_fill_direction Literal[...]

Direction for filling slots.

mlm_custom_order tuple[int, ...] | None

Custom slot fill order for MLM.

mlm_top_k int

Top-k candidates per slot in MLM (> 0).

mlm_device str

Device for MLM inference.

mlm_cache_enabled bool

Enable MLM prediction caching.

mlm_cache_dir Path | None

Directory for MLM prediction cache.

slot_strategies dict[str, SlotStrategyConfig] | None

Per-slot strategy configuration for mixed filling.

TemplateVariantSpec

Bases: BeadBaseModel

Declarative form of TemplateVariant for config files.

Attributes:

Name Type Description
template str

Question template, possibly containing [[label]] references.

condition_name str

Name of a registered context predicate. Looked up via bead.protocol.context.get_context_predicate() at build time. Defaults to "always".

priority int

Higher-priority variants are tried first. Defaults to 0.

description str

Human-readable description. Defaults to empty.

build() -> TemplateVariant

Build a TemplateVariant from this spec.

Returns:

Type Description
TemplateVariant

Live variant with the named predicate resolved.

Raises:

Type Description
KeyError

If condition_name is not registered.

get_default_config() -> BeadConfig

Return a fresh default BeadConfig.

didactic Models are frozen; this function builds a new instance on each call, and callers can use with_(...) to derive overrides.

get_profile(name: str) -> BeadConfig

Return the configuration profile registered under name.

didactic Models are frozen, so the returned instance is shared with the registry; callers can call with_(...) on it to derive overrides without affecting other consumers.

Raises:

Type Description
ValueError

If name is not a registered profile.

list_profiles() -> list[str]

Return the registered profile names, sorted.
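The lookup and listing behaviour described for the profile registry can be sketched with a plain dict. The names lookup_profile and profile_names are hypothetical, strings stand in for BeadConfig instances, and the error message is invented.

```python
from __future__ import annotations

# Hypothetical registry: strings stand in for the shared BeadConfig instances.
REGISTRY: dict[str, str] = {
    "default": "<default config>",
    "dev": "<dev config>",
    "prod": "<prod config>",
    "test": "<test config>",
}


def lookup_profile(name: str) -> str:
    """Return the registered profile, or raise ValueError for unknown names."""
    if name not in REGISTRY:
        available = ", ".join(sorted(REGISTRY))
        raise ValueError(f"Unknown profile {name!r}; available: {available}")
    # The same object is returned on every call; that is safe only if it is immutable.
    return REGISTRY[name]


def profile_names() -> list[str]:
    """Return the registered names in sorted order."""
    return sorted(REGISTRY)
```

Sharing one instance per profile works here because the stored values cannot be mutated, which is the same argument the documentation makes for frozen models.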

load_config(config_path: Path | str | None = None, profile: str = 'default', **overrides: Any) -> BeadConfig

Load configuration from a YAML file with optional overrides.

Precedence (lowest to highest):

1. Profile defaults
2. YAML file values
3. Keyword overrides

Parameters:

Name Type Description Default
config_path Path | str | None

Path to YAML config file. If None, uses profile defaults.

None
profile str

Profile to use as base (default, dev, prod, test).

'default'
**overrides Any

Direct overrides for config values.

{}

Returns:

Type Description
BeadConfig

Loaded and merged configuration.

Raises:

Type Description
FileNotFoundError

If config_path is specified but doesn't exist.

YAMLError

If YAML file is malformed.

ValidationError

If configuration is invalid.

Examples:

>>> config = load_config(profile="dev")
>>> config.profile
'dev'
>>> config = load_config(config_path="config.yaml", logging__level="DEBUG")
>>> config.logging.level
'DEBUG'
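The three-level precedence and the double-underscore keyword form used in the second example can be sketched with plain dicts. This is an illustration under stated assumptions, not the actual loader: expand_overrides and layer are hypothetical helpers, and the sample values are invented.

```python
from __future__ import annotations

from typing import Any


def expand_overrides(overrides: dict[str, Any]) -> dict[str, Any]:
    """Turn {'logging__level': 'X'} into {'logging': {'level': 'X'}}."""
    nested: dict[str, Any] = {}
    for key, value in overrides.items():
        *parents, leaf = key.split("__")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


def layer(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return base with override applied on top, recursing into nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = layer(merged[key], value)
        else:
            merged[key] = value
    return merged


# Lowest to highest precedence: profile defaults, file values, keyword overrides.
profile_defaults = {"profile": "dev", "logging": {"level": "DEBUG", "console": True}}
yaml_values = {"logging": {"level": "INFO"}}
keyword_overrides = expand_overrides({"logging__level": "WARNING"})

resolved = layer(layer(profile_defaults, yaml_values), keyword_overrides)
```

Here resolved keeps console from the profile while level ends up as the keyword override, since each later layer only replaces the keys it names.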

load_from_env(prefix: str = 'BEAD_') -> dict[str, Any]

Load configuration values from environment variables.

Converts environment variables with the given prefix to a nested configuration dictionary.

Parameters:

Name Type Description Default
prefix str

Environment variable prefix to filter on.

'BEAD_'

Returns:

Type Description
dict[str, Any]

Nested configuration dictionary from environment.

Examples:

>>> # With env var: BEAD_LOGGING__LEVEL=DEBUG
>>> load_from_env()
{'logging': {'level': 'DEBUG'}}
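The conversion shown in the example (prefix stripped, key lowercased, double underscore as the nesting separator) might be implemented along these lines. nested_from_env is a hypothetical name, it takes the environment as an argument so it can be tested, and leaving values as strings is an assumption since any type coercion is not documented here.

```python
from __future__ import annotations

from typing import Any, Mapping


def nested_from_env(environ: Mapping[str, str], prefix: str = "BEAD_") -> dict[str, Any]:
    """Build a nested dict from the variables that start with prefix."""
    config: dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        # BEAD_LOGGING__LEVEL -> ["logging", "level"]
        parts = name[len(prefix):].lower().split("__")
        node = config
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        # Values are kept as raw strings in this sketch.
        node[parts[-1]] = value
    return config
```

Passing os.environ as the first argument would read the real process environment.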

load_yaml_file(path: Path | str) -> dict[str, Any]

Load a YAML file and return its contents as a dictionary.

Parameters:

Name Type Description Default
path Path | str

Path to YAML file.

required

Returns:

Type Description
dict[str, Any]

Parsed YAML content.

Raises:

Type Description
FileNotFoundError

If file doesn't exist.

YAMLError

If YAML is malformed.

merge_configs(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]

Deep merge two configuration dictionaries.

Recursively merges override into base, with override values taking precedence.

Parameters:

Name Type Description Default
base dict[str, Any]

Base configuration dictionary.

required
override dict[str, Any]

Override configuration dictionary.

required

Returns:

Type Description
dict[str, Any]

Merged configuration dictionary.

Examples:

>>> base = {"a": 1, "b": {"c": 2}}
>>> override = {"b": {"d": 3}}
>>> merge_configs(base, override)
{'a': 1, 'b': {'c': 2, 'd': 3}}

save_yaml(config: BeadConfig, path: Path | str, include_defaults: bool = False, create_dirs: bool = True) -> None

Save configuration to YAML file.

Parameters:

Name Type Description Default
config BeadConfig

Configuration to save.

required
path Path | str

Path where YAML file should be saved.

required
include_defaults bool

If True, include all fields even if they have default values.

False
create_dirs bool

If True, create parent directories if they don't exist.

True

Raises:

Type Description
IOError

If file cannot be written.

FileNotFoundError

If create_dirs is False and parent directory doesn't exist.

Examples:

>>> from pathlib import Path
>>> from bead.config import get_default_config
>>> config = get_default_config()
>>> save_yaml(config, Path("config.yaml"))

to_yaml(config: BeadConfig, include_defaults: bool = False) -> str

Serialize configuration to YAML string.

Parameters:

Name Type Description Default
config BeadConfig

Configuration to serialize.

required
include_defaults bool

If True, include all fields even if they have default values. If False, only include non-default values.

False

Returns:

Type Description
str

YAML representation of configuration.

Examples:

>>> from bead.config import get_default_config
>>> config = get_default_config()
>>> yaml_str = to_yaml(config)
>>> 'profile: default' in yaml_str
True
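One way to picture include_defaults=False is as a recursive comparison against the default values. The sketch below is an assumption about the approach, not the library's code: non_default_values is a hypothetical helper, and the always_keep parameter exists only because the example above shows profile still appearing in the output for a default configuration.

```python
from __future__ import annotations

from typing import Any


def non_default_values(
    values: dict[str, Any],
    defaults: dict[str, Any],
    always_keep: tuple[str, ...] = ("profile",),
) -> dict[str, Any]:
    """Drop every entry equal to its default, keeping always_keep keys."""
    trimmed: dict[str, Any] = {}
    for key, value in values.items():
        default = defaults.get(key)
        if key in always_keep:
            trimmed[key] = value
        elif isinstance(value, dict) and isinstance(default, dict):
            # Recurse into sections and omit those left empty.
            nested = non_default_values(value, default, always_keep=())
            if nested:
                trimmed[key] = nested
        elif key not in defaults or value != default:
            trimmed[key] = value
    return trimmed
```

The trimmed dict is what would then be handed to a YAML serializer.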

validate_config(config: BeadConfig) -> list[str]

Perform pre-flight validation on configuration.

Checks:

- All paths exist (if absolute paths are specified)
- Resource paths exist (if specified)
- Model configurations are compatible
- Training configurations are valid
- No conflicting settings

Parameters:

Name Type Description Default
config BeadConfig

Configuration to validate.

required

Returns:

Type Description
list[str]

List of validation errors. Empty if valid.

Examples:

>>> from bead.config import get_default_config
>>> config = get_default_config()
>>> errors = validate_config(config)
>>> len(errors)
0

Annotation Protocol Configuration

Declarative TOML/YAML configuration for the annotation-protocol layer: anchor specs, family specs, drift settings, and protocol composition.

protocol

Configuration for the annotation-protocol layer.

Declares ProtocolConfig (the top-level stage config that plugs into bead.config.config.BeadConfig) along with the declarative specs (AnchorSpec, TemplateVariantSpec, FamilySpec, DriftConfig) that materialize into runtime bead.protocol.SemanticAnchor, bead.protocol.QuestionFamily, and bead.protocol.AnnotationProtocol objects.

Configuration is declarative: anchors, drift thresholds, realization strategies, and protocol composition are written in YAML or TOML, and ProtocolConfig.build() produces the live objects. Runtime-only parameters (LM clients, embedding adapters, output caches) are passed to build() rather than stored in the config.

Predicates are referenced by registered name because callables cannot be serialized. Register predicates in the bead.protocol.context registry at import time, then refer to them by name from a FamilySpec or TemplateVariantSpec.
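The name-based indirection can be illustrated with a small registry. This sketch does not use the real bead.protocol.context API: register_predicate, resolve_predicate, the mapping-based context, and the has_agent predicate are all hypothetical.

```python
from __future__ import annotations

from typing import Callable, Mapping

# Hypothetical types: the real context object is not described here.
Context = Mapping[str, object]
Predicate = Callable[[Context], bool]

_PREDICATES: dict[str, Predicate] = {}


def register_predicate(name: str) -> Callable[[Predicate], Predicate]:
    """Decorator that stores a predicate under a serializable name."""

    def decorator(func: Predicate) -> Predicate:
        _PREDICATES[name] = func
        return func

    return decorator


def resolve_predicate(name: str) -> Predicate:
    """Look a predicate up by name; unknown names raise KeyError."""
    return _PREDICATES[name]


@register_predicate("always")
def always(context: Context) -> bool:
    return True


@register_predicate("has_agent")
def has_agent(context: Context) -> bool:
    return "agent" in context
```

A config file then only needs to carry the string, for example condition_name: has_agent, and the callable is resolved when the spec is built.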

RealizationKind = Literal['template', 'contextual', 'lm'] module-attribute

Discriminator for which realization strategy a family uses.

TemplateVariantSpec

Bases: BeadBaseModel

Declarative form of TemplateVariant for config files.

Attributes:

Name Type Description
template str

Question template, possibly containing [[label]] references.

condition_name str

Name of a registered context predicate. Looked up via bead.protocol.context.get_context_predicate() at build time. Defaults to "always".

priority int

Higher-priority variants are tried first. Defaults to 0.

description str

Human-readable description. Defaults to empty.

build() -> TemplateVariant

Build a TemplateVariant from this spec.

Returns:

Type Description
TemplateVariant

Live variant with the named predicate resolved.

Raises:

Type Description
KeyError

If condition_name is not registered.

AnchorSpec

Bases: BeadBaseModel

Declarative form of SemanticAnchor for config files.

Pole labels are flattened to two string fields rather than a nested SemanticPoles; build() constructs the embedded model.

Attributes:

Name Type Description
name str

Short identifier.

target_property str

The property being measured.

canonical_prompt str

Reference phrasing.

options tuple[str, ...]

Ordered response options.

is_ordered bool

Whether the response space is ordinal. Defaults to True.

semantic_pole_low str | None

Low-pole label, when ordered. Defaults to None.

semantic_pole_high str | None

High-pole label, when ordered. Defaults to None.

required_span_labels frozenset[str]

Span labels every realization must reference. Defaults to the empty set.

required_keywords frozenset[str]

Keywords every realization must contain. Defaults to the empty set.

embedding_center tuple[float, ...] | None

Pre-computed canonical-prompt embedding. Defaults to None.

max_drift float

Maximum cosine distance for embedding drift. Defaults to 0.3.

description str

Human-readable description.

build() -> SemanticAnchor

Build a SemanticAnchor from this spec.

Returns:

Type Description
SemanticAnchor

Live anchor.

Raises:

Type Description
ValueError

If exactly one of semantic_pole_low and semantic_pole_high is supplied.

DriftConfig

Bases: BeadBaseModel

Configuration for the drift guard applied to a protocol.

Every realized prompt runs through one shared bead.protocol.DriftGuard configured by this section.

Attributes:

Name Type Description
min_length int

Minimum non-whitespace length for the structural validator. Defaults to 15.

require_question_mark bool

Whether a trailing ? is required. Defaults to True.

keyword_case_sensitive bool

Whether structural keyword checks are case-sensitive. Defaults to False.

embedding_max_distance float | None

Cosine-distance ceiling for the embedding validator. None defers to each anchor's max_drift. Defaults to None.

enable_embedding bool

Whether to add an EmbeddingDriftValidator. Requires an embedding adapter at build time. Defaults to False.

enable_perplexity bool

Whether to add a PerplexityDriftValidator. Requires a perplexity adapter at build time. Defaults to False.

max_perplexity float

Perplexity ceiling for the perplexity validator. Defaults to 100.0.

build(*, embedding_adapter: EmbeddingAdapter | None = None, perplexity_adapter: PerplexityAdapter | None = None) -> DriftGuard

Build a DriftGuard with the structural checks plus any enabled optional checks.

Parameters:

Name Type Description Default
embedding_adapter EmbeddingAdapter | None

Required when enable_embedding is True. Defaults to None.

None
perplexity_adapter PerplexityAdapter | None

Required when enable_perplexity is True. Defaults to None.

None

Returns:

Type Description
DriftGuard

Live composite drift validator.

Raises:

Type Description
ValueError

If a validator is enabled but its adapter was not supplied.

FamilySpec

Bases: BeadBaseModel

Declarative form of QuestionFamily for config files.

Attributes:

Name Type Description
anchor AnchorSpec

The anchor declaration. Built into a SemanticAnchor at build time.

realization_kind RealizationKind

Which realization strategy to use.

template str | None

Used when realization_kind="template". None defers to the anchor's canonical prompt.

variants tuple[TemplateVariantSpec, ...]

Used when realization_kind="contextual". Empty tuple is invalid for that kind.

fallback str | None

Fallback template used when no variant matches. None defers to the anchor's canonical prompt.

condition_name str

Registered predicate name controlling family applicability. Defaults to "always".

depends_on tuple[str, ...]

Names of anchors whose responses must precede this family in the protocol. Defaults to the empty tuple.

fallback_on_drift bool

Whether to fall back to the canonical prompt on drift failure. Defaults to True.

build(*, drift_guard: DriftGuard, lm_client: LMClient | None, lm_model_name: str, cache: ModelOutputCache | None, lm_temperature: float, lm_max_tokens: int) -> QuestionFamily

Build a QuestionFamily from this spec.

Parameters:

Name Type Description Default
drift_guard DriftGuard

Shared drift guard for the protocol.

required
lm_client LMClient | None

LM backend; required when realization_kind == "lm".

required
lm_model_name str

Cache-key prefix for LM realizations.

required
cache ModelOutputCache | None

Output cache for LM realizations.

required
lm_temperature float

Sampling temperature for LM realizations.

required
lm_max_tokens int

Maximum response length for LM realizations.

required

Returns:

Type Description
QuestionFamily

Live family.

ProtocolConfig

Bases: BeadBaseModel

Top-level annotation-protocol stage configuration.

Plugs into bead.config.config.BeadConfig as the protocol field. Declares the families, drift settings, and LM defaults for an annotation protocol that can be loaded from YAML or TOML and materialized via build().

Attributes:

Name Type Description
name str

Descriptive protocol name. Defaults to empty.

families tuple[FamilySpec, ...]

Declarative family specs in protocol order. Defaults to the empty tuple.

drift DriftConfig

Drift-guard configuration shared by all families. Defaults to a structural-only guard with the standard defaults.

lm_model_name str

Cache-key prefix for LM realizations. Used when any family has realization_kind="lm". Defaults to empty (forces the caller to set it explicitly when LM realizations are used).

lm_temperature float

Default sampling temperature for LM realizations. Defaults to 0.3.

lm_max_tokens int

Default maximum response length for LM realizations. Defaults to 200.

build(*, lm_client: LMClient | None = None, cache: ModelOutputCache | None = None, embedding_adapter: EmbeddingAdapter | None = None, perplexity_adapter: PerplexityAdapter | None = None) -> AnnotationProtocol

Materialize the configured protocol.

Parameters:

Name Type Description Default
lm_client LMClient | None

LM backend, required if any family declares realization_kind="lm". Defaults to None.

None
cache ModelOutputCache | None

Output cache for LM realizations. Defaults to None.

None
embedding_adapter EmbeddingAdapter | None

Required when drift.enable_embedding=True. Defaults to None.

None
perplexity_adapter PerplexityAdapter | None

Required when drift.enable_perplexity=True. Defaults to None.

None

Returns:

Type Description
AnnotationProtocol

Live protocol with every family materialized in declared order.

Raises:

Type Description
ValueError

If a required runtime dependency is not supplied or a family declares an unknown realization kind.