bead.protocol¶

Annotation-protocol primitives: anchors as types, contexts as dependent indices, realization strategies as computational content, and drift validators as type-checkers. On top of these, question families and protocols compose into a sequenced, conditional annotation pipeline.

Anchors and Response Spaces¶

`anchor` ¶

Semantic anchors: the type-level specification of a question.

A :class:SemanticAnchor defines what a question measures, independently of how it is phrased. It is the invariant that any realization must preserve: the type in the dependent type Question(ctx).

The anchor includes:

a canonical prompt: the reference phrasing
a response space: the set of valid responses and their ordering
structural constraints: keywords, span references, and embedding bounds that any realization must satisfy

`ScaleType` ¶

Bases: StrEnum

Classification of response-scale structure.

Attributes:

Name	Type	Description
`BINARY`	`str`	Two unordered options with fixed, content-bearing labels (for example `("no", "yes")`). Modeled via Bernoulli likelihoods. Wire value: `"binary"`.
`ORDINAL`	`str`	Ordered options forming an ordinal scale. Wire value: `"ordinal"`.
`NOMINAL`	`str`	Unordered multi-option scale. Wire value: `"nominal"`.
`FORCED_CHOICE`	`str`	N-alternative forced choice with positional response labels (for example `("first", "second")`). The per-item content of the alternatives varies between items and lives on the :class:`~bead.items.item.Item` itself; the encoding's labels identify which alternative was chosen. Wire value: `"forced_choice"`.

`SemanticPoles` ¶

Bases: BeadBaseModel

Pole labels for an ordered response scale.

Ordered scales are characterized by their two participant-facing endpoint labels (for example low="definitely no" and high="definitely yes"). Unordered scales have no poles and use None in place of an instance of this model.

Attributes:

Name	Type	Description
`low`	`str`	Label of the low end of the scale.
`high`	`str`	Label of the high end of the scale.

Examples:

>>> poles = SemanticPoles(low="definitely no", high="definitely yes")
>>> poles.as_tuple()
('definitely no', 'definitely yes')

`as_tuple() -> tuple[str, str]` ¶

Return (low, high) as a 2-tuple.

Returns:

Type	Description
`tuple[str, str]`	The pole labels as a Python tuple.

`ResponseSpace` ¶

Bases: BeadBaseModel

The space of valid responses for a question.

Attributes:

Name	Type	Description
`options`	`tuple[str, ...]`	Ordered response options.
`is_ordered`	`bool`	Whether the options form an ordinal scale. Defaults to `True`.
`semantic_poles`	`SemanticPoles \| None`	Pole labels for ordered scales (for example `low="never"`, `high="always"`). `None` for unordered (categorical) response spaces. Defaults to `None`.
`scale_type`	`ScaleType \| None`	Explicit scale-type classification. `None` (default) leaves :func:`~bead.protocol.encode_response_space` to infer the kind from `options` and `is_ordered`. Set explicitly to :attr:`ScaleType.FORCED_CHOICE` when the labels are positional and the per-item alternatives vary across items (the 2-option-unordered shape that the inference rule would otherwise classify as `BINARY`).

Examples:

>>> rs = ResponseSpace(
...     options=("definitely no", "probably no", "unsure",
...              "probably yes", "definitely yes"),
...     is_ordered=True,
...     semantic_poles=SemanticPoles(
...         low="definitely no", high="definitely yes",
...     ),
... )
>>> len(rs)
5
>>> "probably yes" in rs
True

`__len__() -> int` ¶

Return the number of response options.

`__contains__(item: str) -> bool` ¶

Return whether item is one of the response options.

Parameters:

Name	Type	Description	Default
`item`	`str`	Candidate response label.	required

Returns:

Type	Description
`bool`	`True` when `item` is a registered option.

`SemanticAnchor` ¶

Bases: BeadBaseModel

Type-level specification of what a question measures.

Any realization of a question must preserve the anchor's semantic content. The anchor is the type; a realized prompt string is the value.

Attributes:

Name	Type	Description
`name`	`str`	Short identifier (for example `"completion"`).
`target_property`	`str`	The property being measured (for example `"telicity"`).
`canonical_prompt`	`str`	Reference phrasing of the question. Serves as both documentation and the default template.
`response_space`	`ResponseSpace`	Valid responses.
`required_span_labels`	`frozenset[str]`	Span labels that must appear in any realization (for example `frozenset({"situation"})`). Defaults to the empty set.
`required_keywords`	`frozenset[str]`	Keywords that must appear in any realization to preserve semantic content. Used by :class:`StructuralDriftValidator`. Defaults to the empty set.
`embedding_center`	`tuple[float, ...] \| None`	Pre-computed embedding of the canonical prompt for drift validation via cosine distance. `None` means the embedding is computed on demand by the validator. Defaults to `None`.
`max_drift`	`float`	Maximum allowed cosine distance. Defaults to `0.3`.
`description`	`str`	Human-readable description. Defaults to the empty string.

Examples:

>>> rs = ResponseSpace(
...     options=("no", "yes"), is_ordered=False
... )
>>> anchor = SemanticAnchor(
...     name="dynamicity",
...     target_property="dynamic",
...     canonical_prompt="Is anything changing during [[situation]]?",
...     response_space=rs,
...     required_span_labels=frozenset({"situation"}),
...     required_keywords=frozenset({"changing"}),
... )
>>> anchor.name
'dynamicity'

`from_response_options(*, name: str, target_property: str, canonical_prompt: str, options: tuple[str, ...], is_ordered: bool = True, semantic_poles: SemanticPoles | None = None, required_span_labels: frozenset[str] = frozenset(), required_keywords: frozenset[str] = frozenset(), embedding_center: tuple[float, ...] | None = None, max_drift: float = 0.3, description: str = '') -> Self` `classmethod` ¶

Build an anchor from a flat list of response options.

Convenience constructor for the common case in which a :class:ResponseSpace is built inline from its options.

Parameters:

Name	Type	Description	Default
`name`	`str`	Short identifier.	required
`target_property`	`str`	The property being measured.	required
`canonical_prompt`	`str`	Reference phrasing.	required
`options`	`tuple[str, ...]`	Ordered response options.	required
`is_ordered`	`bool`	Whether the options form an ordinal scale. Defaults to `True`.	`True`
`semantic_poles`	`SemanticPoles \| None`	Pole labels for ordered scales. Defaults to `None`.	`None`
`required_span_labels`	`frozenset[str]`	Span labels required in every realization. Defaults to the empty set.	`frozenset()`
`required_keywords`	`frozenset[str]`	Keywords required in every realization. Defaults to the empty set.	`frozenset()`
`embedding_center`	`tuple[float, ...] \| None`	Pre-computed canonical-prompt embedding. Defaults to `None`.	`None`
`max_drift`	`float`	Maximum allowed cosine distance. Defaults to `0.3`.	`0.3`
`description`	`str`	Human-readable description. Defaults to the empty string.	`''`

Returns:

Type	Description
`SemanticAnchor`	A new anchor with an inline-constructed response space.

Examples:

>>> anchor = SemanticAnchor.from_response_options(
...     name="completion",
...     target_property="telicity",
...     canonical_prompt="Does [[situation]] reach an endpoint?",
...     options=("definitely no", "probably no", "unsure",
...              "probably yes", "definitely yes"),
...     is_ordered=True,
...     semantic_poles=SemanticPoles(
...         low="definitely no", high="definitely yes"
...     ),
...     required_span_labels=frozenset({"situation"}),
... )
>>> anchor.response_space.is_ordered
True

Encoding¶

`encoding` ¶

Response-space encodings for probabilistic modeling.

Bridges the annotation-side :class:~bead.protocol.anchor.ResponseSpace representation and a model-ready description of a response scale, providing the index-to-label mapping and scale-type metadata that downstream modeling code needs.

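This module infers three response scale types from a response space (a fourth, forced choice, is never inferred and must be declared explicitly through scale_type):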

Binary: two unordered options (for example ("change", "no_change")). Naturally modeled via Bernoulli likelihoods.
Ordinal: ordered options on a Likert-like scale (for example a five-point Likert scale from "definitely no" to "definitely yes"). Naturally modeled via cumulative-link (ordered logistic) likelihoods.
Nominal: unordered multi-option (for example a categorical choice among unordered alternatives). Naturally modeled via softmax categorical likelihoods.

The encoding itself is likelihood-agnostic. It does not select a likelihood family; downstream modeling code (for example :mod:bead.active_learning.models) chooses the appropriate model class based on the scale type.
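As a rough illustration of that division of labor, downstream code might dispatch on the scale type's wire value. The sketch below does not use bead itself: `ScaleKind`, `LIKELIHOOD_BY_SCALE`, and `choose_likelihood` are hypothetical stand-ins, and the family names are illustrative labels taken from the list above rather than identifiers from :mod:bead.active_learning.models.

```python
from enum import Enum


class ScaleKind(str, Enum):
    """Hypothetical stand-in mirroring the documented wire values."""

    BINARY = "binary"
    ORDINAL = "ordinal"
    NOMINAL = "nominal"


# Assumed mapping from scale kind to a likelihood-family label.
# The labels are illustrative only, not names from the library.
LIKELIHOOD_BY_SCALE = {
    ScaleKind.BINARY: "bernoulli",
    ScaleKind.ORDINAL: "cumulative_link",
    ScaleKind.NOMINAL: "softmax_categorical",
}


def choose_likelihood(scale: ScaleKind) -> str:
    """Return the likelihood-family label for a scale kind."""
    return LIKELIHOOD_BY_SCALE[scale]


# A wire value round-trips through the enum before dispatch.
family = choose_likelihood(ScaleKind("ordinal"))
```

Keeping this mapping outside the encoding is what lets the same encoding feed different model classes.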

`ResponseEncoding` ¶

Bases: BeadBaseModel

Encoding of a response space for probabilistic modeling.

Bridges the annotation-side :class:ResponseSpace and a modeling-side representation, providing the index-to-label mapping and scale-type metadata needed by both systems.

Attributes:

Name	Type	Description
`name`	`str`	Identifier for this encoding (typically the anchor name, for example `"completion"`).
`n_levels`	`int`	Number of response categories. Must equal `len(labels)`.
`scale_type`	`ScaleType`	Whether the scale is binary, ordinal, nominal, or forced-choice.
`labels`	`tuple[str, ...]`	Human-readable labels for each index, in order.
`semantic_poles`	`SemanticPoles \| None`	The two participant-facing endpoints of the scale, if ordered (for example `SemanticPoles(low="definitely no", high="definitely yes")`). Defaults to `None`.

Examples:

>>> enc = ResponseEncoding(
...     name="completion",
...     n_levels=5,
...     scale_type=ScaleType.ORDINAL,
...     labels=("definitely no", "probably no", "unsure",
...             "probably yes", "definitely yes"),
...     semantic_poles=SemanticPoles(
...         low="definitely no", high="definitely yes"
...     ),
... )
>>> enc.label_to_index("probably yes")
3
>>> enc.index_to_label(0)
'definitely no'
>>> enc.is_ordinal
True

`is_ordinal: bool` `property` ¶

Whether the response scale is ordered.

`is_binary: bool` `property` ¶

Whether the response scale is binary.

`is_nominal: bool` `property` ¶

Whether the response scale is unordered multi-option.

`is_forced_choice: bool` `property` ¶

Whether the response scale uses positional forced-choice labels.

`label_to_index(label: str) -> int` ¶

Convert a response label to its integer index.

Parameters:

Name	Type	Description	Default
`label`	`str`	The response label string.	required

Returns:

Type	Description
`int`	The 0-based index of the label.

Raises:

Type	Description
`ValueError`	If the label is not in the encoding.

`index_to_label(index: int) -> str` ¶

Convert an integer index to its response label.

Parameters:

Name	Type	Description	Default
`index`	`int`	The 0-based index.	required

Returns:

Type	Description
`str`	The response label at that index.

Raises:

Type	Description
`IndexError`	If the index is out of range for this encoding.
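The two lookups and their error contracts can be sketched without bead. `LabelIndex` below is a hypothetical stand-in, not the library class; rejecting negative indices is an assumption, since the reference above only says that out-of-range indices raise `IndexError`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelIndex:
    """Hypothetical stand-in for the label <-> index part of an encoding."""

    labels: tuple

    def label_to_index(self, label: str) -> int:
        # tuple.index already raises ValueError for a missing label;
        # re-raise with a clearer message.
        try:
            return self.labels.index(label)
        except ValueError:
            raise ValueError(f"unknown label: {label!r}") from None

    def index_to_label(self, index: int) -> str:
        # Assumption: negative indices are rejected. Plain tuple indexing
        # would wrap around instead of raising.
        if not 0 <= index < len(self.labels):
            raise IndexError(f"index {index} out of range")
        return self.labels[index]


enc = LabelIndex(("definitely no", "probably no", "unsure",
                  "probably yes", "definitely yes"))
```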

`encode_response_space(name: str, response_space: ResponseSpace, *, scale_type: ScaleType | None = None) -> ResponseEncoding` ¶

Build a :class:ResponseEncoding from a :class:ResponseSpace.

This is the primary bridge from the protocol layer to the modeling layer. The resulting encoding shares its labels with the response space and inherits the space's ordering as a :class:ScaleType, unless scale_type is set to override the inferred kind.

Parameters:

Name	Type	Description	Default
`name`	`str`	Name for the encoding (typically the anchor name, for example `"completion"`).	required
`response_space`	`ResponseSpace`	The response space to encode.	required
`scale_type`	`ScaleType \| None`	Override the kind inferred from the response space. Required when declaring a forced-choice encoding, since forced-choice and binary share the "two unordered options" shape but are modeled differently.	`None`

Returns:

Type	Description
`ResponseEncoding`	The modeling-side encoding.

Examples:

>>> rs = ResponseSpace(
...     options=("no", "yes"), is_ordered=False
... )
>>> enc = encode_response_space("dynamicity", rs)
>>> enc.scale_type
<ScaleType.BINARY: 'binary'>
>>> enc.is_binary
True

>>> rs = ResponseSpace(
...     options=("first", "second"), is_ordered=False
... )
>>> enc = encode_response_space(
...     "acceptability", rs, scale_type=ScaleType.FORCED_CHOICE
... )
>>> enc.is_forced_choice
True
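The inference rule these examples rely on might look roughly like the following. This is a sketch under stated assumptions, not bead's implementation: `ScaleKind` and `infer_scale_kind` are hypothetical names, and treating any ordered space as ordinal regardless of its size is an assumption.

```python
from enum import Enum
from typing import Optional, Sequence


class ScaleKind(str, Enum):
    """Hypothetical stand-in mirroring the documented wire values."""

    BINARY = "binary"
    ORDINAL = "ordinal"
    NOMINAL = "nominal"
    FORCED_CHOICE = "forced_choice"


def infer_scale_kind(
    options: Sequence[str],
    is_ordered: bool,
    override: Optional[ScaleKind] = None,
) -> ScaleKind:
    """Pick a scale kind from the shape of a response space."""
    # An explicit override always wins; this is the only way to obtain
    # FORCED_CHOICE, which shares its shape with BINARY.
    if override is not None:
        return override
    # Assumption: ordered spaces are ordinal whatever their size.
    if is_ordered:
        return ScaleKind.ORDINAL
    # Two unordered options are binary; more are nominal.
    if len(options) == 2:
        return ScaleKind.BINARY
    return ScaleKind.NOMINAL
```

Because forced-choice and binary spaces are indistinguishable by shape, no rule of this form can infer `FORCED_CHOICE`; it has to arrive through the override.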

Contexts¶

`context` ¶

Annotation contexts: dependent indices for question realization.

A :class:ProtocolContext gathers everything known about the current annotation target into a single immutable object. It is the index in the dependent type Question(ctx): different contexts license different questions and different phrasings.

The context layer is deliberately domain-neutral. It carries sentence-level, target-level, and dependent-level information common to most token- or span-targeted annotation protocols, plus a JSON-shaped metadata map (inherited from :class:~bead.data.base.BeadBaseModel) for domain-specific data that does not fit the standard fields. Domain-specific predicates over the context live in the predicate registry documented at the bottom of this module: callers register named predicates at import time and refer to them by name from :class:~bead.protocol.realization.ContextualTemplateRealization.

`ContextPredicate = Callable[[ProtocolContext], bool]` `module-attribute` ¶

Type alias for predicates over :class:ProtocolContext.

`ContextItem` ¶

Bases: BeadBaseModel

Generic per-token-or-span dependent context.

Captures the structural properties of a single dependent (an argument, an adjunct, a related span, ...) of the annotation target. Domain-specific scalar attributes live in :attr:attributes.

Attributes:

Name	Type	Description
`node_id`	`str`	Identifier from the upstream parse or annotation source. Defaults to the empty string.
`head_lemma`	`str`	Lemma of the dependent head. Defaults to the empty string.
`head_form`	`str`	Surface form of the dependent head. Defaults to the empty string.
`head_upos`	`str`	Universal POS tag of the dependent head. Defaults to the empty string.
`head_position`	`int`	1-based token position of the dependent head. Defaults to `0`.
`span_text`	`str`	Full surface span text of the dependent. Defaults to the empty string.
`span_positions`	`tuple[int, ...]`	1-based token positions in the dependent span. Defaults to the empty tuple.
`is_plural`	`bool`	Whether the dependent head is morphologically plural. Defaults to `False`.
`attributes`	`dict[str, float]`	Domain-specific scalar attributes, keyed by attribute name (for example `{"definiteness": 0.7}`). Defaults to the empty dict.

`attribute(name: str) -> float | None` ¶

Return the value of a named attribute, or None if absent.

Parameters:

Name	Type	Description	Default
`name`	`str`	Attribute name to look up.	required

Returns:

Type	Description
`float \| None`	The attribute value, or `None` when the attribute is not present on this context item.

Examples:

>>> item = ContextItem(
...     attributes={"change_of_state": 4.2,
...                 "instigation": 3.1},
... )
>>> item.attribute("change_of_state")
4.2
>>> item.attribute("absent") is None
True

`ProtocolContext` ¶

Bases: BeadBaseModel

Everything known about the current annotation target.

This is the value that parameterizes the dependent question type. Question families inspect the context to decide which question variant to realize and how to phrase it.

The :meth:with_response method threads an annotator response into the context, supporting dependent products in which later questions condition on earlier answers.

Attributes:

Name	Type	Description
`sentence`	`str`	Full sentence text. Defaults to the empty string.
`tokens`	`tuple[str, ...]`	Sentence tokens, in order. Defaults to the empty tuple.
`tokens_lemma`	`tuple[str, ...]`	Token lemmas, in order. Defaults to the empty tuple.
`tokens_upos`	`tuple[str, ...]`	Universal POS tags, in order. Defaults to the empty tuple.
`target_lemma`	`str`	Lemma of the annotation target's head. Defaults to the empty string.
`target_form`	`str`	Surface form of the target head. Defaults to the empty string.
`target_upos`	`str`	UPOS tag of the target head. Defaults to the empty string.
`target_position`	`int`	1-based token position of the target head. Defaults to `0`.
`target_span_text`	`str`	Full surface span text of the target. Defaults to the empty string.
`target_span_positions`	`tuple[int, ...]`	1-based token positions of the target span. Defaults to the empty tuple.
`dependents`	`tuple[ContextItem, ...]`	Structural dependents of the target. Defaults to the empty tuple.
`previous_responses`	`dict[str, str]`	Annotator responses to earlier questions, keyed by anchor name. Defaults to the empty dict.
`target_id`	`str`	Identifier for the annotation target, for traceability. Defaults to the empty string.
`source_id`	`str`	Identifier for the source document or graph. Defaults to the empty string.

`with_response(question_name: str, response: str) -> Self` ¶

Return a new context with one additional response recorded.

Supports the dependent-product structure: the type of a later question can depend on the value (response) of an earlier question.

Parameters:

Name	Type	Description	Default
`question_name`	`str`	Name of the anchor whose response is being recorded.	required
`response`	`str`	The annotator's response label.	required

Returns:

Type	Description
`ProtocolContext`	A new context whose :attr:`previous_responses` includes `{question_name: response}`.

Examples:

>>> ctx = ProtocolContext(sentence="Mary built a sandcastle.")
>>> ctx2 = ctx.with_response("dynamicity", "yes")
>>> ctx2.previous_responses
{'dynamicity': 'yes'}
>>> ctx.previous_responses
{}
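The copy-on-write behavior shown above can be reproduced with a frozen dataclass. `Ctx` below is a hypothetical two-field stand-in for the real context, and the gating expression at the end is only one way a later question might condition on an earlier answer.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass(frozen=True)
class Ctx:
    """Hypothetical, reduced stand-in for ProtocolContext."""

    sentence: str = ""
    previous_responses: dict = field(default_factory=dict)

    def with_response(self, question_name: str, response: str) -> "Ctx":
        # Build a new mapping rather than mutating the old one, so the
        # original context stays unchanged.
        merged = {**self.previous_responses, question_name: response}
        return replace(self, previous_responses=merged)

    def get_response(self, question_name: str) -> Optional[str]:
        return self.previous_responses.get(question_name)


ctx = Ctx(sentence="Mary built a sandcastle.")
ctx2 = ctx.with_response("dynamicity", "yes")

# A later question can be gated on the earlier answer.
ask_completion = ctx2.get_response("dynamicity") == "yes"
```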

`get_response(question_name: str) -> str | None` ¶

Return the recorded response for a question, or None.

Parameters:

Name	Type	Description	Default
`question_name`	`str`	The anchor name to look up.	required

Returns:

Type	Description
`str \| None`	The recorded response label, or `None` if no response has been threaded for this question.

`register_context_predicate(name: str, predicate: ContextPredicate) -> None` ¶

Register a named predicate over :class:ProtocolContext.

Callers register their domain-specific predicates at import time. The registered predicates are then available by name to :class:~bead.protocol.realization.ContextualTemplateRealization and other realization strategies that select among variants.

Parameters:

Name	Type	Description	Default
`name`	`str`	Unique predicate name. Re-registering an existing name overwrites the previous predicate.	required
`predicate`	`ContextPredicate`	Callable that returns `True` when the context matches.	required

Examples:

>>> def has_plural_dependent(ctx: ProtocolContext) -> bool:
...     return any(d.is_plural for d in ctx.dependents)
>>> register_context_predicate(
...     "has_plural_dependent", has_plural_dependent
... )
>>> get_context_predicate("has_plural_dependent") is has_plural_dependent
True

`get_context_predicate(name: str) -> ContextPredicate` ¶

Look up a registered predicate by name.

Parameters:

Name	Type	Description	Default
`name`	`str`	The predicate name to look up.	required

Returns:

Type	Description
`ContextPredicate`	The registered predicate.

Raises:

Type	Description
`KeyError`	If no predicate with that name is registered.

`list_context_predicates() -> tuple[str, ...]` ¶

Return the names of all registered context predicates, sorted.

Returns:

Type	Description
`tuple[str, ...]`	All registered predicate names in sorted order.

`always(_ctx: ProtocolContext) -> bool` ¶

Predicate that matches every context.

Used as the catch-all condition for fallback template variants and the default applicability predicate for question families.

Parameters:

Name	Type	Description	Default
`_ctx`	`ProtocolContext`	Ignored.	required

Returns:

Type	Description
`bool`	Always `True`.
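Taken together, the registry functions behave like a small name-to-callable map. The sketch below is a guess at the general shape, not bead's code: `_REGISTRY`, `register_predicate`, `get_predicate`, and `list_predicates` are hypothetical names, and pre-registering `always` is an assumption made for illustration.

```python
from typing import Callable, Dict, Tuple

Predicate = Callable[[object], bool]

# Module-level registry; a plain dict gives overwrite-on-reregister
# and KeyError-on-missing without extra code.
_REGISTRY: Dict[str, Predicate] = {}


def register_predicate(name: str, predicate: Predicate) -> None:
    """Store a predicate under a name, replacing any earlier entry."""
    _REGISTRY[name] = predicate


def get_predicate(name: str) -> Predicate:
    """Look up a predicate; raises KeyError for unknown names."""
    return _REGISTRY[name]


def list_predicates() -> Tuple[str, ...]:
    """Return all registered names in sorted order."""
    return tuple(sorted(_REGISTRY))


def always(_ctx: object) -> bool:
    """Catch-all predicate that matches every context."""
    return True


# Assumption for this sketch: the catch-all is registered up front.
register_predicate("always", always)
```

Referring to predicates by name rather than by object keeps template configurations serializable even though the predicates themselves are plain callables.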

Realization Strategies¶

`realization` ¶

Realization strategies: how dependent functions are computed.

A :class:RealizationStrategy maps a :class:~bead.protocol.anchor.SemanticAnchor and a :class:~bead.protocol.context.ProtocolContext to a concrete prompt string. It is the computational content of the dependent function Pi(ctx). Question(ctx).

Three strategies are provided:

:class:TemplateRealization: a fixed template (the simplest strategy and a safe fallback).
:class:ContextualTemplateRealization: rule-based selection from ranked template variants.
:class:LMRealization: prompts a language model to paraphrase the canonical question for the specific context. Should always be paired with a :class:~bead.protocol.drift.DriftGuard to validate that the paraphrase preserves semantic content.

These classes carry callable fields (predicates, LM clients) so they are plain frozen Python classes rather than :class:~bead.data.base.BeadBaseModel subclasses; didactic Models do not accept :class:~collections.abc.Callable field types.

`RealizationStrategy` ¶

Bases: Protocol

Protocol for question realization.

A realization strategy is the computational content of the dependent function Pi(ctx). Question(ctx): it produces a prompt string for a given anchor-and-context pair.

Examples:

A minimal conforming implementation:

>>> class EchoCanonical:
...     def realize(
...         self, anchor, context
...     ):
...         return anchor.canonical_prompt
>>> isinstance(EchoCanonical(), RealizationStrategy)
True

`realize(anchor: SemanticAnchor, context: ProtocolContext) -> str` ¶

Produce a prompt string for the given anchor and context.

Parameters:

Name	Type	Description	Default
`anchor`	`SemanticAnchor`	The semantic invariant to preserve.	required
`context`	`ProtocolContext`	The context to condition on.	required

Returns:

Type	Description
`str`	A prompt string, possibly containing `[[label]]` or `[[label\|transform]]` references.

`TemplateVariant` `dataclass` ¶

A context-conditioned question template.

Parameters:

Name	Type	Description	Default
`template`	`str`	Question template, possibly containing `[[label]]` or `[[label\|transform]]` references.	required
`condition`	`ContextPredicate`	Returns `True` when this variant is appropriate for the context. Variants are evaluated in priority order; the first match wins. Defaults to :func:`always`.	`always`
`priority`	`int`	Higher-priority variants are tried first. Use this to order more-specific variants before less-specific ones. Defaults to `0`.	`0`
`description`	`str`	Human-readable description for experimenters. Defaults to the empty string.	`''`

Attributes:

Name	Type	Description
`template`	`str`	The template string.
`condition`	`ContextPredicate`	Variant-applicability predicate.
`priority`	`int`	Selection priority.
`description`	`str`	Human-readable description.

`TemplateRealization` `dataclass` ¶

Fixed-template realization.

Always returns the same template string regardless of context. The simplest strategy and a safe fallback when context-dependent phrasing is not needed.

Parameters:

Name	Type	Description	Default
`template`	`str \| None`	Template string. When `None`, the anchor's canonical prompt is used at realization time. Defaults to `None`.	`None`

Attributes:

Name	Type	Description
`template`	`str \| None`	The configured template, or `None` to defer to the anchor.

`realize(anchor: SemanticAnchor, context: ProtocolContext) -> str` ¶

Return the configured template or the canonical prompt.

Parameters:

Name	Type	Description	Default
`anchor`	`SemanticAnchor`	The semantic invariant. Its `canonical_prompt` is used when this strategy was constructed without an explicit template.	required
`context`	`ProtocolContext`	The annotation context (unused by this strategy but required by the :class:`RealizationStrategy` protocol).	required

Returns:

Type	Description
`str`	The realized prompt string.

`ContextualTemplateRealization` `dataclass` ¶

Rule-based selection from ranked template variants.

Evaluates variant conditions in descending priority order and returns the template of the first matching variant. Falls back to a configurable fallback template (or the anchor's canonical prompt if none is configured) when no variant matches.

This is the recommended strategy for production use: it gives experimenters fine-grained control over how questions adapt to context while guaranteeing the output is one of a pre-approved set of templates.

Parameters:

Name	Type	Description	Default
`variants`	`tuple[TemplateVariant, ...]`	Candidate templates. They are evaluated in descending priority order; ties are broken by registration order.	required
`fallback`	`str \| None`	Template used when no variant matches. When `None`, the anchor's canonical prompt is used. Defaults to `None`.	`None`

Attributes:

Name	Type	Description
`variants`	`tuple[TemplateVariant, ...]`	The configured variants, sorted by descending priority.
`fallback`	`str \| None`	Fallback template, or `None` to defer to the anchor.

`__post_init__() -> None` ¶

Sort variants by descending priority, stable on ties.

`realize(anchor: SemanticAnchor, context: ProtocolContext) -> str` ¶

Return the first matching variant's template, or the fallback.

Parameters:

Name	Type	Description	Default
`anchor`	`SemanticAnchor`	The semantic invariant.	required
`context`	`ProtocolContext`	The annotation context tested against each variant's condition.	required

Returns:

Type	Description
`str`	The template of the highest-priority matching variant, the configured fallback if none match, or `anchor.canonical_prompt` when no fallback is configured.
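The selection order described above (descending priority, ties kept in registration order, then fallback, then canonical prompt) can be sketched in a few lines. `Variant` and `select_template` are hypothetical stand-ins, and the context is reduced to a plain dict for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


def always(_ctx: dict) -> bool:
    return True


@dataclass(frozen=True)
class Variant:
    """Hypothetical stand-in for a template variant."""

    template: str
    condition: Callable[[dict], bool] = always
    priority: int = 0


def select_template(
    variants: Sequence[Variant],
    context: dict,
    canonical_prompt: str,
    fallback: Optional[str] = None,
) -> str:
    # sorted() is stable, so variants with equal priority keep the
    # order in which they were supplied.
    ranked = sorted(variants, key=lambda v: -v.priority)
    for variant in ranked:
        if variant.condition(context):
            return variant.template
    # No variant matched: use the fallback, else the canonical prompt.
    return fallback if fallback is not None else canonical_prompt


variants = (
    Variant("Does [[situation]] reach an endpoint?"),
    Variant(
        "Do [[situation]] each reach an endpoint?",
        condition=lambda ctx: ctx.get("plural", False),
        priority=10,
    ),
)
```

Placing the catch-all variant at the lowest priority means it acts as a default, so the fallback and canonical prompt are reached only when no variant list entry uses `always`.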

`LMClient` ¶

Bases: Protocol

Protocol for language-model completion.

Any object with a complete method matching this signature can serve as an LM backend for :class:LMRealization. The keyword parameters temperature and max_tokens are required, since :class:LMRealization always supplies them.

Examples:

A minimal stub for testing:

>>> class StubClient:
...     def complete(
...         self, prompt: str, *,
...         temperature: float, max_tokens: int,
...     ) -> str:
...         return "Did the event reach an endpoint?"
>>> isinstance(StubClient(), LMClient)
True

`complete(prompt: str, *, temperature: float, max_tokens: int) -> str` ¶

Generate a completion for the given prompt.

Parameters:

Name	Type	Description	Default
`prompt`	`str`	Full prompt including any system context.	required
`temperature`	`float`	Sampling temperature.	required
`max_tokens`	`int`	Maximum response length in tokens.	required

Returns:

Type	Description
`str`	Generated text.

`LMRealization` ¶

LM-based question paraphrasing.

Prompts a language model to rephrase the canonical question for the specific annotation context. The LM receives the sentence, target information, and canonical question as context, and produces a paraphrase that should be more natural for the specific sentence.

This strategy should always be paired with a :class:~bead.protocol.drift.DriftGuard to validate that the paraphrase preserves semantic content.

When cache is supplied (a :class:~bead.items.cache.ModelOutputCache), realized prompts are stored under the (model_name, "lm_completion", prompt=full_prompt) key. Repeated calls with the same anchor-and-context pair avoid redundant LM calls. The cache is the single canonical caching surface across bead; this class does not maintain its own.

Parameters:

Name	Type	Description	Default
`client`	`LMClient`	Language-model backend.	required
`model_name`	`str`	Identifier for the model behind `client`. Used as the cache key prefix.	required
`cache`	`ModelOutputCache \| None`	Output cache shared with the rest of bead. Pass `None` to disable caching. Defaults to `None`.	`None`
`system_prompt`	`str`	System prompt controlling paraphrase behavior. Defaults to :data:`_DEFAULT_SYSTEM_PROMPT`.	`_DEFAULT_SYSTEM_PROMPT`
`temperature`	`float`	Sampling temperature. Lower values are more conservative. Defaults to `0.3`.	`0.3`
`max_tokens`	`int`	Maximum response length in tokens. Defaults to `200`.	`200`

`realize(anchor: SemanticAnchor, context: ProtocolContext) -> str` ¶

Generate a context-adapted question via the LM.

When a cache was supplied at construction time and a cached result exists for the same prompt, the cached value is returned without calling the LM.

Parameters:

Name	Type	Description	Default
`anchor`	`SemanticAnchor`	Semantic specification.	required
`context`	`ProtocolContext`	Current annotation context.	required

Returns:

Type	Description
`str`	LM-generated prompt string. Surrounding quotes and whitespace are stripped, and a trailing `?` is appended when missing.

Raises:

Type	Description
`RuntimeError`	If the LM backend raises, or if the LM returns an empty response.
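The post-processing and cache short-circuit described here might be organized as follows. Everything in this sketch is hypothetical: `clean_paraphrase` and `CachedRealizer` are illustrative names, a plain dict stands in for :class:~bead.items.cache.ModelOutputCache (whose interface is not shown on this page), and the exact order of stripping is an assumption.

```python
from typing import Optional


def clean_paraphrase(raw: str) -> str:
    """Strip whitespace and surrounding quotes; ensure a trailing '?'."""
    text = raw.strip().strip("\"'").strip()
    if not text:
        raise RuntimeError("LM returned an empty response")
    if not text.endswith("?"):
        text += "?"
    return text


class CachedRealizer:
    """Hypothetical sketch of an LM call wrapped in a prompt-keyed cache."""

    def __init__(self, client, cache: Optional[dict] = None,
                 temperature: float = 0.3, max_tokens: int = 200):
        self.client = client
        self.cache = cache  # dict stand-in; None disables caching
        self.temperature = temperature
        self.max_tokens = max_tokens

    def realize(self, full_prompt: str) -> str:
        # Compare against None: an empty dict is falsy but still a cache.
        if self.cache is not None and full_prompt in self.cache:
            return self.cache[full_prompt]
        try:
            raw = self.client.complete(
                full_prompt,
                temperature=self.temperature,
                max_tokens=self.max_tokens,
            )
        except Exception as exc:
            # Surface backend failures under a single exception type.
            raise RuntimeError("LM backend failed") from exc
        result = clean_paraphrase(raw)
        if self.cache is not None:
            self.cache[full_prompt] = result
        return result
```

Only cleaned, non-empty results are written to the cache in this sketch, so a failed call is retried on the next request rather than being memoized.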

Drift Validation¶

`drift` ¶

Drift validation: the type-checker for realized prompts.

A :class:DriftGuard verifies that a realized prompt still inhabits the type defined by its :class:~bead.protocol.anchor.SemanticAnchor. Without drift control, an LM paraphraser, or even a rule-based selector, may produce prompts that subtly change what is being measured.

Three validators are provided:

:class:StructuralDriftValidator checks that required span references and keywords appear in the realization and that the question is well-formed.
:class:EmbeddingDriftValidator checks that the embedding of the realized prompt is within a configured cosine distance of the anchor's canonical-prompt embedding.
:class:PerplexityDriftValidator flags realizations whose language-model perplexity exceeds a configured ceiling.

These compose under a :class:DriftGuard, which runs all configured validators and aggregates their findings: a realization passes only when every validator passes.
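The aggregation rule can be sketched as a fold over per-validator scores. The guard's real interface is not shown on this page, so `Score` and `run_guard` are hypothetical; taking the first non-`None` distance and perplexity is likewise an assumption about how the fields would be merged.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Score:
    """Hypothetical stand-in with the documented DriftScore fields."""

    passed: bool = True
    structural_ok: bool = True
    embedding_distance: Optional[float] = None
    perplexity: Optional[float] = None
    findings: tuple = ()


def _first_set(scores: Sequence[Score], attr: str):
    """Return the first non-None value of attr across scores."""
    for score in scores:
        value = getattr(score, attr)
        if value is not None:
            return value
    return None


def run_guard(validators, realization, anchor, context) -> Score:
    """Run every validator and merge the results into one score."""
    scores = [v.validate(realization, anchor, context) for v in validators]
    return Score(
        # The realization passes only when every validator passes.
        passed=all(s.passed for s in scores),
        structural_ok=all(s.structural_ok for s in scores),
        embedding_distance=_first_set(scores, "embedding_distance"),
        perplexity=_first_set(scores, "perplexity"),
        # Findings are concatenated in validator order.
        findings=tuple(f for s in scores for f in s.findings),
    )
```

Running all validators instead of stopping at the first failure keeps the full list of findings available to the experimenter.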

`EmbeddingAdapter` ¶

Bases: Protocol

Structural type for objects that can embed text.

Satisfied by bead's :class:~bead.items.adapters.ModelAdapter and by any other object exposing a get_embedding method that returns a sequence of floats.

Examples:

>>> class StubEmbedder:
...     def get_embedding(self, text: str) -> Sequence[float]:
...         return (1.0, 0.0)
>>> isinstance(StubEmbedder(), EmbeddingAdapter)
True

`get_embedding(text: str) -> Sequence[float]` ¶

Embed text to a fixed-length sequence of floats.

Parameters:

Name	Type	Description	Default
`text`	`str`	Text to embed.	required

Returns:

Type	Description
`Sequence[float]`	Embedding vector, treated as a flat sequence of floats.
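Embeddings returned this way feed the cosine-distance check, where distance is one minus cosine similarity. The helper names below (`cosine_distance`, `within_drift`) are hypothetical, and treating the `max_drift` bound as inclusive is an assumption.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cos(a, b) for two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def within_drift(realized: Sequence[float], center: Sequence[float],
                 max_drift: float = 0.3) -> bool:
    # Assumption: a distance exactly equal to max_drift still passes.
    return cosine_distance(realized, center) <= max_drift
```

Identical directions give a distance of 0 and orthogonal vectors give 1, so a bound of `0.3` tolerates only modest changes in direction.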

`PerplexityAdapter` ¶

Bases: Protocol

Structural type for objects that can score text perplexity.

Satisfied by bead's :class:~bead.items.adapters.ModelAdapter and by any other object exposing a compute_perplexity method.

`compute_perplexity(text: str) -> float` ¶

Compute the perplexity of text under the backend.

Parameters:

Name	Type	Description	Default
`text`	`str`	Text to score.	required

Returns:

Type	Description
`float`	Perplexity in the open interval `(0, +inf)`.
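How a backend arrives at this number is left to the adapter. One common definition, shown here purely as an illustration, is the exponential of the mean negative log-likelihood per token. `perplexity_from_logprobs` and `exceeds_ceiling` are hypothetical helpers, and the strict comparison against the ceiling is an assumption.

```python
import math
from typing import Sequence


def perplexity_from_logprobs(token_logprobs: Sequence[float]) -> float:
    """Return exp of the mean negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


def exceeds_ceiling(perplexity: float, ceiling: float) -> bool:
    # Assumption: only values strictly above the ceiling are flagged.
    return perplexity > ceiling


# Four tokens, each assigned probability 0.5, give a perplexity near 2.
ppl = perplexity_from_logprobs([math.log(0.5)] * 4)
```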

`DriftScore` ¶

Bases: BeadBaseModel

Result of one or more drift validation checks.

Attributes:

Name	Type	Description
`passed`	`bool`	Whether the realization passes the validators that produced this score. Defaults to `True`.
`structural_ok`	`bool`	Whether structural constraints are satisfied. Defaults to `True`.
`embedding_distance`	`float \| None`	Cosine distance from the canonical-prompt embedding, if an embedding validator ran. Defaults to `None`.
`perplexity`	`float \| None`	Perplexity of the realized prompt under the validating language model, if a perplexity validator ran. Defaults to `None`.
`findings`	`tuple[str, ...]`	Human-readable descriptions of any issues found. Defaults to the empty tuple.

`DriftValidator` ¶

Bases: Protocol

Protocol for a single drift-validation check.

Examples:

A minimal conforming validator:

>>> class AlwaysPasses:
...     def validate(self, realization, anchor, context):
...         return DriftScore(passed=True)
>>> isinstance(AlwaysPasses(), DriftValidator)
True

`validate(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore` ¶

Check the realization against the anchor.

Parameters:

Name	Type	Description	Default
`realization`	`str`	The realized prompt string.	required
`anchor`	`SemanticAnchor`	The semantic specification.	required
`context`	`ProtocolContext`	The annotation context.	required

Returns:

Type	Description
`DriftScore`	Validation result.

`StructuralDriftValidator` `dataclass` ¶

Validate structural properties of a realized prompt.

Checks that:

All required span labels appear as [[label]] references.
Required keywords appear somewhere in the prompt.
The prompt ends with appropriate punctuation.
The prompt is not trivially short.

Parameters:

Name	Type	Description	Default
`min_length`	`int`	Minimum non-whitespace character length for a valid prompt. Defaults to `15`.	`15`
`require_question_mark`	`bool`	Whether the realization must end with `?`. Defaults to `True`.	`True`
`keyword_case_sensitive`	`bool`	Whether keyword checks are case-sensitive. Defaults to `False`.	`False`

Attributes:

Name	Type	Description
`min_length`	`int`	Minimum prompt length.
`require_question_mark`	`bool`	Whether the trailing `?` is required.
`keyword_case_sensitive`	`bool`	Whether keyword matches are case-sensitive.

`validate(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore` ¶

Run the structural checks against a realization.

Parameters:

Name	Type	Description	Default
`realization`	`str`	The realized prompt string.	required
`anchor`	`SemanticAnchor`	The semantic specification supplying required labels and keywords.	required
`context`	`ProtocolContext`	The annotation context (unused by this validator but required by the :class:`DriftValidator` protocol).	required

Returns:

Type	Description
`DriftScore`	Score with `structural_ok` set and any failures listed in `findings`.
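
The four structural checks can be sketched without any bead imports. The class below is a hypothetical stand-in (`StructuralCheckSketch` is not a bead name) that takes the required span labels and keywords directly instead of reading them from an anchor. The exact finding messages and the `[[label]]` matching rule are assumptions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class StructuralCheckSketch:
    """Hypothetical stand-in for the structural checks described above."""

    min_length: int = 15
    require_question_mark: bool = True
    keyword_case_sensitive: bool = False

    def findings(
        self,
        realization: str,
        required_span_labels: tuple[str, ...],
        required_keywords: tuple[str, ...],
    ) -> tuple[str, ...]:
        issues: list[str] = []
        # 1. Every required span label must appear as a [[label]] reference.
        referenced = set(re.findall(r"\[\[([^\[\]]+)\]\]", realization))
        for label in required_span_labels:
            if label not in referenced:
                issues.append(f"missing span reference [[{label}]]")
        # 2. Required keywords must appear somewhere in the prompt.
        haystack = realization if self.keyword_case_sensitive else realization.lower()
        for keyword in required_keywords:
            needle = keyword if self.keyword_case_sensitive else keyword.lower()
            if needle not in haystack:
                issues.append(f"missing keyword {keyword!r}")
        # 3. The prompt must end with a question mark when required.
        if self.require_question_mark and not realization.rstrip().endswith("?"):
            issues.append("does not end with '?'")
        # 4. The prompt must not be trivially short (non-whitespace characters).
        if len("".join(realization.split())) < self.min_length:
            issues.append(f"shorter than {self.min_length} non-whitespace characters")
        return tuple(issues)


checker = StructuralCheckSketch()
good = checker.findings(
    "Did [[event]] happen before [[time]]?", ("event", "time"), ("Happen",)
)
bad = checker.findings("Did it occur", ("event", "time"), ("Happen",))
```

In this sketch a realization is structurally acceptable exactly when the returned tuple is empty, which corresponds to `structural_ok` being `True` with no `findings`.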

`EmbeddingDriftValidator` ¶

Validate that a realization is semantically close to the anchor.

Computes cosine distance between the realization embedding and the anchor's canonical-prompt embedding (either pre-computed in the anchor or computed on demand from the canonical prompt). The realization passes when the distance is at most the configured maximum (or the anchor's :attr:~SemanticAnchor.max_drift, if no explicit maximum is set).

Embeddings are obtained from any object conforming to the :class:EmbeddingAdapter Protocol, which includes the bead :class:~bead.items.adapters.ModelAdapter family.

Parameters:

Name	Type	Description	Default
`adapter`	`EmbeddingAdapter`	Adapter exposing `get_embedding(text)`. The returned sequence is treated as a flat vector and converted to a `tuple[float, ...]`.	required
`max_distance`	`float \| None`	Override for the anchor's `max_drift` value. Defaults to `None` (use the anchor's own value).	`None`

Attributes:

Name	Type	Description
`max_distance`	`float \| None`	Configured override, or `None` to defer to the anchor.

`validate(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore` ¶

Score the realization by cosine distance from the anchor.

Parameters:

Name	Type	Description	Default
`realization`	`str`	The realized prompt string.	required
`anchor`	`SemanticAnchor`	The semantic specification supplying the canonical prompt (and optionally a pre-computed embedding center and a `max_drift` value).	required
`context`	`ProtocolContext`	The annotation context (unused by this validator but required by the :class:`DriftValidator` protocol).	required

Returns:

Type	Description
`DriftScore`	Score with `embedding_distance` set; `passed` is `True` iff the distance is within the configured maximum.
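
The pass rule can be illustrated with plain Python sequences. The helpers below are hypothetical (`cosine_distance` and `within_drift` are not bead functions) and assume the usual definition of cosine distance as one minus cosine similarity. The explicit `max_distance` takes precedence over the anchor's value, as described above.

```python
import math
from collections.abc import Sequence
from typing import Optional


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def within_drift(
    realized: Sequence[float],
    canonical: Sequence[float],
    anchor_max_drift: float,
    max_distance: Optional[float] = None,
) -> tuple[float, bool]:
    """Return (distance, passed), preferring the explicit override if given."""
    limit = anchor_max_drift if max_distance is None else max_distance
    distance = cosine_distance(realized, canonical)
    return distance, distance <= limit


distance, passed = within_drift((1.0, 1.0), (1.0, 0.0), anchor_max_drift=0.5)
_, passed_strict = within_drift(
    (1.0, 1.0), (1.0, 0.0), anchor_max_drift=0.5, max_distance=0.1
)
```

Here the two vectors are 45 degrees apart, so the distance is about 0.29: within the anchor's limit of 0.5 but outside the stricter override of 0.1.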

`PerplexityDriftValidator` ¶

Validate that a realization has acceptable language-model perplexity.

Wraps any object conforming to the :class:PerplexityAdapter Protocol (which includes the bead :class:~bead.items.adapters.ModelAdapter family). The realization passes when its perplexity is at most the configured ceiling. Useful for catching ungrammatical or otherwise unnatural LM-generated paraphrases that might still pass structural and embedding checks.

Parameters:

Name	Type	Description	Default
`adapter`	`PerplexityAdapter`	Adapter exposing `compute_perplexity(text) -> float`.	required
`max_perplexity`	`float`	Maximum allowed perplexity. Realizations with perplexity above this value fail.	required

Attributes:

Name	Type	Description
`max_perplexity`	`float`	The configured perplexity ceiling.

`validate(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore` ¶

Score the realization by language-model perplexity.

Parameters:

Name	Type	Description	Default
`realization`	`str`	The realized prompt string.	required
`anchor`	`SemanticAnchor`	The semantic specification (unused by this validator).	required
`context`	`ProtocolContext`	The annotation context (unused by this validator).	required

Returns:

Type	Description
`DriftScore`	Score with `perplexity` set; `passed` is `True` iff `perplexity <= max_perplexity`.

`DriftGuard` `dataclass` ¶

Composite drift validator.

Runs every configured validator and aggregates their results: the aggregate :class:DriftScore passed field is True only when every validator passes. Findings from all validators are collected in order. embedding_distance and perplexity are populated from the last validator that set them.

Attributes:

Name	Type	Description
`validators`	`list[DriftValidator]`	Mutable list of configured validators. Defaults to the empty list; calls to :meth:`check` on a guard with no validators always pass.

`add(validator: DriftValidator) -> None` ¶

Append a validator to the guard.

Parameters:

Name	Type	Description	Default
`validator`	`DriftValidator`	The validator to add.	required

`check(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore` ¶

Run every validator and return an aggregated score.

Parameters:

Name	Type	Description	Default
`realization`	`str`	The realized prompt string.	required
`anchor`	`SemanticAnchor`	The semantic specification.	required
`context`	`ProtocolContext`	The annotation context.	required

Returns:

Type	Description
`DriftScore`	Aggregate score. `passed` is `True` iff every validator passes; `findings` concatenates all validator-level findings; `embedding_distance` and `perplexity` are taken from the validators that set them.
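
The aggregation rule can be sketched with a simplified score type. `ScoreSketch` and `aggregate` are hypothetical stand-ins, the validators are reduced to plain callables on the realization string, and `structural_ok` is omitted because its aggregation rule is not stated above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ScoreSketch:
    """Simplified stand-in for a drift score."""

    passed: bool = True
    embedding_distance: Optional[float] = None
    perplexity: Optional[float] = None
    findings: tuple[str, ...] = ()


Check = Callable[[str], ScoreSketch]


def aggregate(realization: str, checks: list[Check]) -> ScoreSketch:
    """Run every check and fold the results into one score."""
    passed = True
    distance: Optional[float] = None
    perplexity: Optional[float] = None
    findings: list[str] = []
    for check in checks:
        score = check(realization)
        # The aggregate passes only when every individual check passes.
        passed = passed and score.passed
        # Findings are collected in validator order.
        findings.extend(score.findings)
        # Metric fields are taken from the last check that set them.
        if score.embedding_distance is not None:
            distance = score.embedding_distance
        if score.perplexity is not None:
            perplexity = score.perplexity
    return ScoreSketch(passed, distance, perplexity, tuple(findings))


def embedding_check(text: str) -> ScoreSketch:
    return ScoreSketch(embedding_distance=0.1)


def perplexity_check(text: str) -> ScoreSketch:
    return ScoreSketch(
        passed=False, perplexity=80.0, findings=("perplexity above ceiling",)
    )


result = aggregate("Did it happen?", [embedding_check, perplexity_check])
empty = aggregate("Did it happen?", [])
```

Note that every check runs even after one has failed, so the aggregate carries the full list of findings, not only the first failure.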

`__len__() -> int` ¶

Return the number of configured validators.

Question Families and Protocols¶

`family` ¶

Question families and annotation protocols.

A :class:QuestionFamily is a dependent function Pi(ctx : ProtocolContext). Question(ctx): for each context it produces a valid, drift-checked :class:~bead.protocol.family.QuestionRealization.

An :class:AnnotationProtocol is the iterated dependent product

Sigma(a_1 : Q_1(ctx)). Sigma(a_2 : Q_2(ctx, a_1)). ... Q_n(ctx, ...)

a sequence of question families where later families may condition on the responses to earlier ones. The dependency edges between families are recorded explicitly in :attr:QuestionFamily.depends_on, which :class:~bead.protocol.diagnostics.ConditionalObservationValidator consults to check the integrity of conditional responses.

`ApplicabilityPredicate = Callable[[ProtocolContext], bool]` `module-attribute` ¶

Type alias for predicates determining when a family applies.

`QuestionRealization` ¶

Bases: BeadBaseModel

A realized question paired with its provenance.

This is the dependent pair Sigma(ctx). Question(ctx): a concrete prompt together with the context that produced it and evidence of its validity (a :class:~bead.protocol.drift.DriftScore).

Attributes:

Name	Type	Description
`prompt`	`str`	The realized prompt string. May contain `[[label]]` references for downstream rendering.
`anchor`	`SemanticAnchor`	The semantic specification this question satisfies.
`context`	`ProtocolContext`	The context that parameterized the realization.
`drift_score`	`DriftScore \| None`	Result of drift validation, if a guard was applied. Defaults to `None`.
`strategy_name`	`str`	Name of the realization strategy that produced this question. Defaults to the empty string.

`passed_drift_check: bool` `property` ¶

Whether the realization passed drift validation.

True when no drift score is attached (no validation was run) or when the attached score's passed flag is True.

`QuestionFamily` `dataclass` ¶

Dependent function from contexts to realized questions.

For each :class:ProtocolContext, a family produces a :class:QuestionRealization by:

Checking applicability (is this question relevant for this context?).
Invoking the realization strategy to produce a prompt.
Running drift validation, if a guard is configured.
Falling back to the canonical prompt if the realization drifts (when fallback_on_drift is enabled).

Parameters:

Name	Type	Description	Default
`anchor`	`SemanticAnchor`	The semantic type of questions this family produces.	required
`realization`	`RealizationStrategy \| None`	Strategy producing a prompt for a given context. Defaults to an unparameterized :class:`TemplateRealization`, which echoes the anchor's canonical prompt.	`TemplateRealization()`
`drift_guard`	`DriftGuard \| None`	Optional drift validator. Defaults to `None`.	`None`
`condition`	`ApplicabilityPredicate \| None`	When to ask this question. `None` (the default) marks the family as always applicable; any non-`None` value sets :attr:`is_always_applicable` to `False`.	`_always_applicable`
`depends_on`	`tuple[str, ...]`	Anchor names whose responses must precede this family in a protocol. Read by :class:`~bead.protocol.diagnostics.ConditionalObservationValidator`. Defaults to the empty tuple.	`()`
`fallback_on_drift`	`bool`	If `True` (the default), fall back to the canonical prompt when drift validation fails. If `False`, raise :class:`ValueError`.	`True`

Attributes:

Name	Type	Description
`anchor`	`SemanticAnchor`	Configured anchor.
`realization`	`RealizationStrategy`	Configured realization strategy.
`drift_guard`	`DriftGuard \| None`	Configured drift guard.
`condition`	`ApplicabilityPredicate`	Configured applicability predicate.
`depends_on`	`tuple[str, ...]`	Names of anchors this family depends on.
`fallback_on_drift`	`bool`	Whether to fall back on drift failure.

`name: str` `property` ¶

Short name from the anchor.

`__post_init__() -> None` ¶

Record whether condition is the default predicate.

`is_applicable(context: ProtocolContext) -> bool` ¶

Whether this family should be asked for the given context.

Parameters:

Name	Type	Description	Default
`context`	`ProtocolContext`	Current annotation context.	required

Returns:

Type	Description
`bool`	`True` when the family applies.

`realize(context: ProtocolContext) -> QuestionRealization` ¶

Produce a question for the given context.

Parameters:

Name	Type	Description	Default
`context`	`ProtocolContext`	Current annotation context.	required

Returns:

Type	Description
`QuestionRealization`	The realized question with its drift score and provenance.

Raises:

Type	Description
`ValueError`	If drift validation fails and :attr:`fallback_on_drift` is `False`.
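
The control flow around drift validation and fallback can be sketched as follows. `realize_sketch` is a hypothetical function, not bead's implementation: the strategy and guard are plain callables, the context is a dict, and the applicability check is left out because the behaviour for an inapplicable context is not specified here. What the real method records as the drift score after a fallback is likewise an assumption.

```python
from typing import Callable, Optional


def realize_sketch(
    canonical_prompt: str,
    context: dict,
    strategy: Callable[[str, dict], str],
    guard: Optional[Callable[[str], bool]] = None,
    fallback_on_drift: bool = True,
) -> tuple[str, Optional[bool]]:
    """Return (prompt, drift_passed); drift_passed is None when no guard ran."""
    # Invoke the realization strategy to produce a candidate prompt.
    prompt = strategy(canonical_prompt, context)
    # Without a guard there is nothing to validate.
    if guard is None:
        return prompt, None
    if guard(prompt):
        return prompt, True
    # The candidate drifted: fall back to the canonical prompt or fail loudly.
    if fallback_on_drift:
        return canonical_prompt, False
    raise ValueError(f"realization drifted from anchor: {prompt!r}")


def shouting_strategy(canonical: str, context: dict) -> str:
    # A deliberately poor strategy that drops the question mark.
    return canonical.rstrip("?").upper()


def ends_with_question_mark(prompt: str) -> bool:
    return prompt.endswith("?")


prompt, drift_passed = realize_sketch(
    "Did it happen?", {}, shouting_strategy, ends_with_question_mark
)
```

With `fallback_on_drift=False` the same call raises `ValueError` instead of returning the canonical prompt.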

`AnnotationProtocol` `dataclass` ¶

A sequence of question families forming a complete protocol.

Represents the iterated dependent product

Sigma(a_1 : Q_1(ctx)). Sigma(a_2 : Q_2(ctx, a_1)). ...

When realized, the protocol threads annotator responses through the context so later families can condition on earlier answers.

Parameters:

Name	Type	Description	Default
`families`	`list[QuestionFamily]`	Families in protocol order.	required
`name`	`str`	Descriptive name for the protocol. Defaults to the empty string.	`''`

Attributes:

Name	Type	Description
`families`	`list[QuestionFamily]`	Families in protocol order.
`name`	`str`	Descriptive name.

Raises:

Type	Description
`ValueError`	If two families share the same anchor name (anchor names must be unique within a protocol), or if any family's :attr:`~QuestionFamily.depends_on` references a family that does not appear earlier in the sequence.
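
The two constructor checks amount to a single pass over the family sequence. The function below is a hypothetical sketch (`validate_order` is not a bead function) in which each family is reduced to a `(name, depends_on)` pair.

```python
def validate_order(families: list[tuple[str, tuple[str, ...]]]) -> None:
    """Raise ValueError on duplicate names or dependencies on later families."""
    seen: set[str] = set()
    for name, depends_on in families:
        # Anchor names must be unique within a protocol.
        if name in seen:
            raise ValueError(f"duplicate anchor name: {name!r}")
        # Every dependency must name a family that appears earlier.
        missing = [dep for dep in depends_on if dep not in seen]
        if missing:
            raise ValueError(f"{name!r} depends on families not yet seen: {missing}")
        seen.add(name)


# A valid ordering: "when" comes after the family it depends on.
validate_order([("occurred", ()), ("when", ("occurred",))])
```

Because `seen` only contains families already visited, the same loop rejects both unknown names and references to families that appear later in the sequence.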

`__post_init__() -> None` ¶

Validate uniqueness and forward-only depends_on edges.

`append(family: QuestionFamily) -> None` ¶

Add a family to the end of the protocol.

Parameters:

Name	Type	Description	Default
`family`	`QuestionFamily`	The family to append.	required

Raises:

Type	Description
`ValueError`	If a family with the same anchor name is already present, or if any of its :attr:`~QuestionFamily.depends_on` references a family not already in the protocol.

`family_by_name(name: str) -> QuestionFamily` ¶

Look up a family by its anchor name.

Parameters:

Name	Type	Description	Default
`name`	`str`	The anchor name to look up.	required

Returns:

Type	Description
`QuestionFamily`	The matching family.

Raises:

Type	Description
`KeyError`	If no family with that name exists in the protocol.

`realize_all(context: ProtocolContext, *, responses: dict[str, str] | None = None) -> list[QuestionRealization]` ¶

Realize all applicable families for a context.

Threads responses through the context as the protocol is traversed. When responses is provided it is injected before any family is realized; otherwise, after each family is realized, the first option of its response space is used as a placeholder so downstream families can be exercised in dry-run mode.

Parameters:

Name	Type	Description	Default
`context`	`ProtocolContext`	Base annotation context.	required
`responses`	`dict[str, str] \| None`	Pre-supplied responses keyed by anchor name. Defaults to `None`.	`None`

Returns:

Type	Description
`list[QuestionRealization]`	Realized questions in protocol order, skipping families whose :meth:`QuestionFamily.is_applicable` returns `False` for the running context.

Raises:

Type	Description
`ValueError`	If `responses` references an anchor not in the protocol.
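
The response-threading behaviour, including the dry-run placeholder, can be sketched with plain data. Everything below is hypothetical: families are reduced to `(name, options, is_applicable)` triples, the running context is a dict of responses, and the function returns family names instead of realizations.

```python
from typing import Callable, Optional

# (name, response options, applicability predicate over responses so far)
FamilySketch = tuple[str, tuple[str, ...], Callable[[dict[str, str]], bool]]


def realize_all_sketch(
    families: list[FamilySketch],
    responses: Optional[dict[str, str]] = None,
) -> list[str]:
    """Return the names of the families that would be realized, in order."""
    names = {name for name, _, _ in families}
    unknown = sorted(set(responses or {}) - names)
    if unknown:
        raise ValueError(f"responses reference unknown anchors: {unknown}")
    # Supplied responses are injected before any family is realized.
    running = dict(responses or {})
    realized: list[str] = []
    for name, options, is_applicable in families:
        if not is_applicable(running):
            continue
        realized.append(name)
        # Dry-run mode: use the first option as a placeholder response.
        if responses is None:
            running[name] = options[0]
    return realized


families: list[FamilySketch] = [
    ("occurred", ("no", "yes"), lambda r: True),
    ("when", ("past", "future"), lambda r: r.get("occurred") == "yes"),
]
dry_run = realize_all_sketch(families)
with_answers = realize_all_sketch(families, {"occurred": "yes"})
```

In the dry run the placeholder for `occurred` is its first option, `"no"`, so the conditional family is skipped. This shows that which downstream families a dry run exercises depends on the ordering of options in each response space.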

`__len__() -> int` ¶

Return the number of families in the protocol.

Diagnostics¶

`diagnostics` ¶

Dataset diagnostics and quality reporting for annotation protocols.

Provides :class:DatasetReport, a structured immutable summary of quality issues discovered during dataset preparation, and :class:ConditionalObservationValidator, which checks that responses to conditional questions respect the protocol's :attr:~bead.protocol.family.QuestionFamily.depends_on graph.

Diagnostic findings are immutable :class:DiagnosticRecord instances collected in order of discovery. The :meth:DatasetReport.summary method produces a human-readable overview suitable for logging.

`DiagnosticLevel` ¶

Bases: StrEnum

Severity of a diagnostic finding.

Attributes:

Name	Type	Description
`INFO`	`str`	Informational message. Wire value: `"info"`.
`WARNING`	`str`	Warning that does not prevent dataset use. Wire value: `"warning"`.
`ERROR`	`str`	Error that may invalidate downstream analysis. Wire value: `"error"`.

`DiagnosticRecord` ¶

Bases: BeadBaseModel

A single diagnostic finding.

Attributes:

Name	Type	Description
`level`	`DiagnosticLevel`	Severity of the finding.
`category`	`str`	Short category tag (for example `"missing_embedding"` or `"unrecognized_label"`).
`message`	`str`	Human-readable description.
`item_id`	`str \| None`	The item this finding pertains to, if applicable. Defaults to `None`.
`question_name`	`str \| None`	The anchor name this finding pertains to, if applicable. Defaults to `None`.

`DatasetReport` ¶

Bases: BeadBaseModel

Immutable structured report of dataset-preparation quality.

Mutating methods (:meth:add, :meth:with_coverage, :meth:with_missing_embedding) follow the bead convention of returning a new instance via .with_(...); the original is unchanged.

Attributes:

Name	Type	Description
`n_records_input`	`int`	Total number of input records received. Defaults to `0`.
`n_items`	`int`	Number of unique item ids. Defaults to `0`.
`n_records_encoded`	`int`	Number of records successfully encoded. Defaults to `0`.
`n_records_dropped`	`int`	Number of records dropped. Defaults to `0`.
`coverage`	`dict[str, float]`	Per-question response-coverage rate (fraction of items with a valid response). Defaults to the empty dict.
`findings`	`tuple[DiagnosticRecord, ...]`	All diagnostic findings, in order of discovery. Defaults to the empty tuple.
`items_missing_embeddings`	`tuple[str, ...]`	Item ids that had no embedding provided. Defaults to the empty tuple.

`has_warnings: bool` `property` ¶

Whether any warning-level findings exist.

`has_errors: bool` `property` ¶

Whether any error-level findings exist.

`warnings: tuple[DiagnosticRecord, ...]` `property` ¶

All warning-level findings, in discovery order.

`errors: tuple[DiagnosticRecord, ...]` `property` ¶

All error-level findings, in discovery order.

`add(level: DiagnosticLevel, category: str, message: str, *, item_id: str | None = None, question_name: str | None = None) -> Self` ¶

Return a new report with one additional finding appended.

Parameters:

Name	Type	Description	Default
`level`	`DiagnosticLevel`	Severity.	required
`category`	`str`	Category tag.	required
`message`	`str`	Description.	required
`item_id`	`str \| None`	Related item id. Defaults to `None`.	`None`
`question_name`	`str \| None`	Related anchor name. Defaults to `None`.	`None`

Returns:

Type	Description
`DatasetReport`	New report with the finding added.

`extend(records: Sequence[DiagnosticRecord]) -> Self` ¶

Return a new report with multiple findings appended.

Parameters:

Name	Type	Description	Default
`records`	`Sequence[DiagnosticRecord]`	Findings to append.	required

Returns:

Type	Description
`DatasetReport`	New report with the findings added.

`with_coverage(question_name: str, rate: float) -> Self` ¶

Return a new report with one coverage entry set.

Parameters:

Name	Type	Description	Default
`question_name`	`str`	Anchor name.	required
`rate`	`float`	Coverage rate in `[0.0, 1.0]`.	required

Returns:

Type	Description
`DatasetReport`	New report with the entry set or replaced.

`with_missing_embedding(item_id: str) -> Self` ¶

Return a new report flagging one item as missing an embedding.

If item_id is already flagged, the report is returned unchanged: items_missing_embeddings has set semantics, so each item id appears at most once.

Parameters:

Name	Type	Description	Default
`item_id`	`str`	The item id that lacked an embedding.	required

Returns:

Type	Description
`DatasetReport`	New report with the item recorded.
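
The return-a-new-instance pattern can be sketched with a frozen dataclass. `ReportSketch` is a hypothetical stand-in, and it uses `dataclasses.replace` where bead uses its own `.with_(...)` helper. Returning the same instance for an already-flagged item is an assumption of this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ReportSketch:
    """Minimal immutable report tracking only missing-embedding item ids."""

    items_missing_embeddings: tuple[str, ...] = ()

    def with_missing_embedding(self, item_id: str) -> "ReportSketch":
        # Already flagged: nothing to change, so return the report as-is.
        if item_id in self.items_missing_embeddings:
            return self
        # Otherwise build a new instance; the original stays untouched.
        return replace(
            self,
            items_missing_embeddings=self.items_missing_embeddings + (item_id,),
        )


empty_report = ReportSketch()
flagged = empty_report.with_missing_embedding("item-1")
flagged_again = flagged.with_missing_embedding("item-1")
```

Because each call returns a report, updates chain naturally, for example `report.with_missing_embedding("a").with_missing_embedding("b")`.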

`by_category(category: str) -> tuple[DiagnosticRecord, ...]` ¶

Filter findings by category tag.

Parameters:

Name	Type	Description	Default
`category`	`str`	Category tag to filter on.	required

Returns:

Type	Description
`tuple[DiagnosticRecord, ...]`	Matching findings, in discovery order.

`summary() -> str` ¶

Produce a human-readable multi-line summary.

Returns:

Type	Description
`str`	A summary string suitable for logging.

`RecordLike` ¶

Bases: Protocol

Structural type for records consumed by the validator.

Any object with the three attributes below conforms. The bead :class:~bead.evaluation.reliability.AnnotationRecord is a canonical example.

Attributes:

Name	Type	Description
`item_id`	`str`	Identifier of the annotation item.
`response_label`	`str`	Annotator's response label.
`question_name`	`str`	Anchor name of the question being answered.

`ConditionalObservationValidator` `dataclass` ¶

Verify that conditional responses respect protocol dependencies.

For every family in a protocol with non-empty :attr:~bead.protocol.family.QuestionFamily.depends_on, the validator checks two things:

Dependency presence: each item with a response on the conditional question must also have a response on every upstream question.
Dependency value (optional): when conditioning_values is supplied for the conditional anchor, the upstream response must be one of the allowed labels.

Findings are emitted as :class:DiagnosticRecord instances at the :attr:DiagnosticLevel.WARNING level.

Parameters:

Name	Type	Description	Default
`conditioning_values`	`Mapping[str, set[str]] \| None`	For each conditional anchor name, the set of upstream response labels under which a response to that question is valid. When omitted, the validator checks only dependency presence. Defaults to `None`.	`dict()`

Attributes:

Name	Type	Description
`conditioning_values`	`Mapping[str, set[str]]`	Conditioning-value table (immutable view).

`validate(records_by_question: Mapping[str, Sequence[RecordLike]], protocol: AnnotationProtocol) -> tuple[DiagnosticRecord, ...]` ¶

Check conditional-observation consistency for a protocol.

Parameters:

Name	Type	Description	Default
`records_by_question`	`Mapping[str, Sequence[RecordLike]]`	Records grouped by anchor name. Each record must expose `item_id`, `response_label`, and `question_name` attributes.	required
`protocol`	`AnnotationProtocol`	The protocol whose dependency edges drive the validation.	required

Returns:

Type	Description
`tuple[DiagnosticRecord, ...]`	Warning-level findings for any inconsistencies detected.
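
The presence and value checks can be sketched over plain records. The names below (`Record`, `check_dependencies`) are hypothetical, the dependency graph is passed in as a dict instead of being read from a protocol, and findings are plain strings where the real validator emits diagnostic records.

```python
from collections.abc import Mapping, Sequence
from typing import NamedTuple, Optional


class Record(NamedTuple):
    item_id: str
    response_label: str
    question_name: str


def check_dependencies(
    records_by_question: Mapping[str, Sequence[Record]],
    depends_on: Mapping[str, tuple[str, ...]],
    conditioning_values: Optional[Mapping[str, set[str]]] = None,
) -> tuple[str, ...]:
    """Return one warning string per violated dependency."""
    allowed = conditioning_values or {}
    warnings: list[str] = []
    for question, upstream_names in depends_on.items():
        for record in records_by_question.get(question, ()):
            for upstream in upstream_names:
                # Labels given for the same item on the upstream question.
                labels = {
                    r.response_label
                    for r in records_by_question.get(upstream, ())
                    if r.item_id == record.item_id
                }
                if not labels:
                    # Dependency presence: upstream response is missing.
                    warnings.append(f"{record.item_id}: {question} without {upstream}")
                elif question in allowed and not labels <= allowed[question]:
                    # Dependency value: upstream label is not an allowed one.
                    warnings.append(f"{record.item_id}: {upstream} label not allowed")
    return tuple(warnings)


records = {
    "occurred": [Record("i1", "yes", "occurred"), Record("i2", "no", "occurred")],
    "when": [
        Record("i1", "past", "when"),
        Record("i2", "past", "when"),
        Record("i3", "past", "when"),
    ],
}
edges = {"when": ("occurred",)}
presence_only = check_dependencies(records, edges)
with_values = check_dependencies(records, edges, {"when": {"yes"}})
```

Item `i3` fails the presence check in both runs; item `i2` is flagged only once allowed upstream labels are supplied.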

Item-Layer Bridge¶

Single canonical bridge from a realized question to a fully-populated :class:~bead.items.item.Item and from a configured protocol to the per-family :class:~bead.items.item_template.ItemTemplate collection.

`items` ¶

Bridge from the protocol layer to bead's item-construction layer.

A :class:~bead.protocol.QuestionFamily declares the type-level shape of a question; a :class:~bead.protocol.QuestionRealization is one realization of that question for a particular :class:~bead.protocol.ProtocolContext. To deploy realizations through bead's experimental pipeline they must be packaged as :class:~bead.items.item_template.ItemTemplate and :class:~bead.items.item.Item instances.

This module is the canonical bridge. It defines four functions:

:func:scale_type_to_task_type is the single canonical translation from :class:~bead.protocol.ScaleType to the :class:~bead.items.item_template.TaskType literal used by item templates and active-learning model selection.
:func:family_to_item_template builds the per-family :class:ItemTemplate (one template per anchor; the same template is reused for every realization of that family).
:func:realization_to_item packages a single :class:QuestionRealization as an :class:Item bound to the family's template, with sentence text and span metadata derived from the realization's :class:ProtocolContext.
:func:protocol_to_item_templates returns a name-keyed dict of templates for an entire protocol.

The mapping is total: every supported :class:ScaleType corresponds to exactly one :class:TaskType, and every protocol family produces exactly one :class:ItemTemplate. There is no per-task-type factory in the protocol layer; the family + realization pair is the single canonical way to build items for a protocol.

`scale_type_to_task_type(scale_type: ScaleType) -> TaskType` ¶

Translate a :class:ScaleType to its :class:TaskType.

This is the single canonical mapping used by every part of bead that bridges between protocol-layer encodings and item-layer task types (item construction, active-learning model selection, jsPsych deployment).

Parameters:

Name	Type	Description	Default
`scale_type`	`ScaleType`	Protocol-layer scale type.	required

Returns:

Type	Description
`TaskType`	The matching :class:`TaskType` literal.

Examples:

>>> from bead.protocol.encoding import ScaleType
>>> scale_type_to_task_type(ScaleType.ORDINAL)
'ordinal_scale'

`family_to_item_template(family: QuestionFamily, *, judgment_type: JudgmentType, presentation_spec: PresentationSpec | None = None) -> ItemTemplate` ¶

Build the :class:ItemTemplate for a :class:QuestionFamily.

The template's task_type is derived from the anchor's response space via :func:scale_type_to_task_type. Ordinal scales populate :attr:TaskSpec.scale_bounds (0 to n_levels - 1) and :attr:TaskSpec.scale_labels (one :class:ScalePointLabel per option). Binary and nominal scales populate :attr:TaskSpec.options with the anchor's labels. Forced-choice scales leave :attr:TaskSpec.options unset (the per-item alternatives live on each :class:Item rather than on the template); the anchor's labels remain accessible via family.anchor.response_space.options.
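
The per-scale-type rules in the paragraph above can be summarized in a small sketch. `ScaleKind` and `task_spec_fields` are hypothetical stand-ins: the result is a plain dict, and `scale_labels` is modeled as an index-to-label mapping where bead uses :class:ScalePointLabel objects.

```python
from enum import Enum


class ScaleKind(str, Enum):
    """Stand-in mirroring the four scale-type wire values."""

    BINARY = "binary"
    ORDINAL = "ordinal"
    NOMINAL = "nominal"
    FORCED_CHOICE = "forced_choice"


def task_spec_fields(kind: ScaleKind, options: tuple[str, ...]) -> dict:
    """Return the task-spec fields a template would populate for a scale."""
    if kind is ScaleKind.ORDINAL:
        # Ordinal scales get integer bounds 0..n-1 and one label per point.
        return {
            "scale_bounds": (0, len(options) - 1),
            "scale_labels": dict(enumerate(options)),
        }
    if kind in (ScaleKind.BINARY, ScaleKind.NOMINAL):
        # Binary and nominal scales carry the anchor's labels as options.
        return {"options": tuple(options)}
    # Forced choice: alternatives live on each item, so nothing is set here.
    return {}


ordinal_fields = task_spec_fields(ScaleKind.ORDINAL, ("never", "sometimes", "always"))
binary_fields = task_spec_fields(ScaleKind.BINARY, ("no", "yes"))
forced_fields = task_spec_fields(ScaleKind.FORCED_CHOICE, ("first", "second"))
```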

The prompt field of the template's :class:TaskSpec is the anchor's canonical prompt (with [[label]] references intact); individual realizations override the prompt at item-construction time via the prompt rendered-element on the resulting :class:Item.

Parameters:

Name	Type	Description	Default
`family`	`QuestionFamily`	The family to bridge.	required
`judgment_type`	`JudgmentType`	Semantic property being measured (caller-supplied because bead's :class:`JudgmentType` taxonomy is broader than :class:`~bead.protocol.encoding.ScaleType`).	required
`presentation_spec`	`PresentationSpec \| None`	Custom presentation spec. Defaults to a fresh :class:`PresentationSpec` with mode `"static"`.	`None`

Returns:

Type	Description
`ItemTemplate`	Template with `name` set to the anchor name, `task_type` derived from the scale, and `elements` covering `"text"` (the sentence) and `"prompt"` (the realized question).

`realization_to_item(realization: QuestionRealization, *, item_template: ItemTemplate) -> Item` ¶

Package a :class:QuestionRealization as an :class:Item.

The resulting :class:Item references item_template by id, rendering the realization's prompt as the "prompt" element and the context's sentence as the "text" element. Tokenized elements are populated from :attr:ProtocolContext.tokens when present; spans for the anchor's required_span_labels are derived via :func:_spans_from_context.

Parameters:

Name	Type	Description	Default
`realization`	`QuestionRealization`	A realized question produced by :meth:`QuestionFamily.realize`.	required
`item_template`	`ItemTemplate`	The template returned by :func:`family_to_item_template` for the originating family. The bridge does not validate that the template was produced from the same family — the caller is responsible for matching them.	required

Returns:

Type	Description
`Item`	Item bound to the template, with the realization's prompt and context materialized.

`protocol_to_item_templates(protocol: AnnotationProtocol, *, judgment_type: JudgmentType, presentation_spec: PresentationSpec | None = None) -> dict[str, ItemTemplate]` ¶

Build one :class:ItemTemplate per family in the protocol.

Parameters:

Name	Type	Description	Default
`protocol`	`AnnotationProtocol`	The protocol whose families to translate.	required
`judgment_type`	`JudgmentType`	Common judgment type to assign to every template.	required
`presentation_spec`	`PresentationSpec \| None`	Common presentation spec; defaults to a fresh one per call.	`None`

Returns:

Type	Description
`dict[str, ItemTemplate]`	Mapping from family / anchor name to its :class:`ItemTemplate`.

`realize_protocol_to_items(protocol: AnnotationProtocol, context: ProtocolContext, *, judgment_type: JudgmentType, item_templates: dict[str, ItemTemplate] | None = None, responses: dict[str, str] | None = None, presentation_spec: PresentationSpec | None = None) -> tuple[tuple[QuestionRealization, Item], ...]` ¶

Realize a protocol against one context, packaging items.

Each applicable family is realized in protocol order; each :class:QuestionRealization is paired with the :class:Item produced by :func:realization_to_item.

Parameters:

Name	Type	Description	Default
`protocol`	`AnnotationProtocol`	Protocol to realize.	required
`context`	`ProtocolContext`	Base context for realization.	required
`judgment_type`	`JudgmentType`	Judgment type assigned to every template.	required
`item_templates`	`dict[str, ItemTemplate] \| None`	Pre-built templates; built via :func:`protocol_to_item_templates` when `None`.	`None`
`responses`	`dict[str, str] \| None`	Pre-supplied responses threaded into the context. Defaults to `None`.	`None`
`presentation_spec`	`PresentationSpec \| None`	Common presentation spec when templates are built fresh.	`None`

Returns:

Type	Description
`tuple[tuple[QuestionRealization, Item], ...]`	For each applicable family, the `(realization, item)` pair in protocol order.

bead.protocol¶

Anchors and Response Spaces¶

anchor ¶

ScaleType ¶

SemanticPoles ¶

as_tuple() -> tuple[str, str] ¶

ResponseSpace ¶

__len__() -> int ¶

__contains__(item: str) -> bool ¶

SemanticAnchor ¶

Encoding¶

encoding ¶

ScaleType ¶

ResponseEncoding ¶

is_ordinal: bool property ¶

is_binary: bool property ¶

is_nominal: bool property ¶

is_forced_choice: bool property ¶

label_to_index(label: str) -> int ¶

index_to_label(index: int) -> str ¶

encode_response_space(name: str, response_space: ResponseSpace, *, scale_type: ScaleType | None = None) -> ResponseEncoding ¶

Contexts¶

context ¶

ContextPredicate = Callable[[ProtocolContext], bool] module-attribute ¶

ContextItem ¶

attribute(name: str) -> float | None ¶

ProtocolContext ¶

with_response(question_name: str, response: str) -> Self ¶

get_response(question_name: str) -> str | None ¶

register_context_predicate(name: str, predicate: ContextPredicate) -> None ¶

get_context_predicate(name: str) -> ContextPredicate ¶

list_context_predicates() -> tuple[str, ...] ¶

always(_ctx: ProtocolContext) -> bool ¶

Realization Strategies¶

realization ¶

RealizationStrategy ¶

realize(anchor: SemanticAnchor, context: ProtocolContext) -> str ¶

TemplateVariant dataclass ¶

TemplateRealization dataclass ¶

realize(anchor: SemanticAnchor, context: ProtocolContext) -> str ¶

ContextualTemplateRealization dataclass ¶

__post_init__() -> None ¶

realize(anchor: SemanticAnchor, context: ProtocolContext) -> str ¶

LMClient ¶

complete(prompt: str, *, temperature: float, max_tokens: int) -> str ¶

LMRealization ¶

realize(anchor: SemanticAnchor, context: ProtocolContext) -> str ¶

Drift Validation¶

drift ¶

EmbeddingAdapter ¶

get_embedding(text: str) -> Sequence[float] ¶

PerplexityAdapter ¶

compute_perplexity(text: str) -> float ¶

DriftScore ¶

DriftValidator ¶

validate(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore ¶

StructuralDriftValidator dataclass ¶

validate(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore ¶

EmbeddingDriftValidator ¶

validate(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore ¶

PerplexityDriftValidator ¶

validate(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore ¶

DriftGuard dataclass ¶

add(validator: DriftValidator) -> None ¶

check(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore ¶

__len__() -> int ¶

Question Families and Protocols¶

family ¶

ApplicabilityPredicate = Callable[[ProtocolContext], bool] module-attribute ¶

QuestionRealization ¶

passed_drift_check: bool property ¶

QuestionFamily dataclass ¶

name: str property ¶

__post_init__() -> None ¶

is_applicable(context: ProtocolContext) -> bool ¶

realize(context: ProtocolContext) -> QuestionRealization ¶

AnnotationProtocol dataclass ¶

__post_init__() -> None ¶

append(family: QuestionFamily) -> None ¶

family_by_name(name: str) -> QuestionFamily ¶

`anchor` ¶

`ScaleType` ¶

`SemanticPoles` ¶

`as_tuple() -> tuple[str, str]` ¶

`ResponseSpace` ¶

`len() -> int` ¶

`contains(item: str) -> bool` ¶

`SemanticAnchor` ¶

`encoding` ¶

`ScaleType` ¶

`ResponseEncoding` ¶

`is_ordinal: bool` `property` ¶

`is_binary: bool` `property` ¶

`is_nominal: bool` `property` ¶

`is_forced_choice: bool` `property` ¶

`label_to_index(label: str) -> int` ¶

`index_to_label(index: int) -> str` ¶

`encode_response_space(name: str, response_space: ResponseSpace, *, scale_type: ScaleType | None = None) -> ResponseEncoding` ¶

`context` ¶

`ContextPredicate = Callable[[ProtocolContext], bool]` `module-attribute` ¶

`ContextItem` ¶

`attribute(name: str) -> float | None` ¶

`ProtocolContext` ¶

`with_response(question_name: str, response: str) -> Self` ¶

`get_response(question_name: str) -> str | None` ¶

`register_context_predicate(name: str, predicate: ContextPredicate) -> None` ¶

`get_context_predicate(name: str) -> ContextPredicate` ¶

`list_context_predicates() -> tuple[str, ...]` ¶

`always(_ctx: ProtocolContext) -> bool` ¶

`realization` ¶

`RealizationStrategy` ¶

`realize(anchor: SemanticAnchor, context: ProtocolContext) -> str` ¶

`TemplateVariant` `dataclass` ¶

`TemplateRealization` `dataclass` ¶

`realize(anchor: SemanticAnchor, context: ProtocolContext) -> str` ¶

`ContextualTemplateRealization` `dataclass` ¶

`__post_init__() -> None` ¶

`realize(anchor: SemanticAnchor, context: ProtocolContext) -> str` ¶

`LMClient` ¶

`complete(prompt: str, *, temperature: float, max_tokens: int) -> str` ¶

`LMRealization` ¶

`realize(anchor: SemanticAnchor, context: ProtocolContext) -> str` ¶

`drift` ¶

`EmbeddingAdapter` ¶

`get_embedding(text: str) -> Sequence[float]` ¶

`PerplexityAdapter` ¶

`compute_perplexity(text: str) -> float` ¶

`DriftScore` ¶

`DriftValidator` ¶

`validate(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore` ¶

`StructuralDriftValidator` `dataclass` ¶

`validate(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore` ¶

`EmbeddingDriftValidator` ¶

`validate(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore` ¶

`PerplexityDriftValidator` ¶

`validate(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore` ¶

`DriftGuard` `dataclass` ¶

`add(validator: DriftValidator) -> None` ¶

`check(realization: str, anchor: SemanticAnchor, context: ProtocolContext) -> DriftScore` ¶

`__len__() -> int` ¶

`family` ¶

`ApplicabilityPredicate = Callable[[ProtocolContext], bool]` `module-attribute` ¶

`QuestionRealization` ¶

`passed_drift_check: bool` `property` ¶

`QuestionFamily` `dataclass` ¶

`name: str` `property` ¶

`__post_init__() -> None` ¶

`is_applicable(context: ProtocolContext) -> bool` ¶

`realize(context: ProtocolContext) -> QuestionRealization` ¶

`AnnotationProtocol` `dataclass` ¶

`__post_init__() -> None` ¶

`append(family: QuestionFamily) -> None` ¶

`family_by_name(name: str) -> QuestionFamily` ¶

`realize_all(context: ProtocolContext, *, responses: dict[str, str] | None = None) -> list[QuestionRealization]` ¶

`__len__() -> int` ¶

`diagnostics` ¶

`DiagnosticLevel` ¶

`DiagnosticRecord` ¶

`DatasetReport` ¶

`has_warnings: bool` `property` ¶