bead.items

Stage 3 of the bead pipeline: experimental item construction with 9 task types.

Core Classes

item

Data models for constructed experimental items.

UnfilledSlot

Bases: BeadBaseModel

An unfilled slot in a cloze task item.

Represents a slot in a partially filled template where the participant must provide a response. The UI widget for collecting the response is inferred from the slot's constraints at deployment time.

Attributes:

Name Type Description
slot_name str

Name of the unfilled template slot.

position int

Token index position in the rendered text.

constraint_ids list[UUID]

UUIDs of constraints that apply to this slot.

Examples:

>>> from uuid import UUID
>>> # Extensional constraint slot (will render as dropdown)
>>> UnfilledSlot(
...     slot_name="determiner",
...     position=0,
...     constraint_ids=[UUID("12345678-1234-5678-1234-567812345678")]
... )
>>> # Unconstrained slot (will render as text input)
>>> UnfilledSlot(
...     slot_name="adjective",
...     position=2,
...     constraint_ids=[]
... )

validate_slot_name(v: str) -> str classmethod

Validate slot name is not empty.

Parameters:

Name Type Description Default
v str

Slot name to validate.

required

Returns:

Type Description
str

Validated slot name.

Raises:

Type Description
ValueError

If slot name is empty or contains only whitespace.
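For illustration, a minimal sketch of the failure mode (assuming BeadBaseModel is a standard Pydantic model, so the ValueError surfaces as a ValidationError at construction time):

>>> from pydantic import ValidationError
>>> try:
...     UnfilledSlot(slot_name="   ", position=0, constraint_ids=[])
... except ValidationError:
...     print("rejected: empty slot name")
rejected: empty slot name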

ModelOutput

Bases: BeadBaseModel

Output from a model computation.

Attributes:

Name Type Description
model_name str

Name/identifier of the model.

model_version str

Version of the model.

operation str

Operation performed (e.g., "log_probability", "nli", "embedding").

inputs dict[str, MetadataValue]

Inputs to the model.

output MetadataValue

Model output.

cache_key str

Cache key for this computation.

computation_metadata dict[str, MetadataValue]

Metadata about the computation (timestamp, device, etc.).

Examples:

>>> output = ModelOutput(
...     model_name="gpt2",
...     model_version="latest",
...     operation="log_probability",
...     inputs={"text": "The cat broke the vase"},
...     output=-12.4,
...     cache_key="abc123..."
... )

validate_non_empty_strings(v: str) -> str classmethod

Validate required string fields are not empty.

Parameters:

Name Type Description Default
v str

String value to validate.

required

Returns:

Type Description
str

Validated string.

Raises:

Type Description
ValueError

If string is empty or contains only whitespace.

Item

Bases: BeadBaseModel

A constructed experimental item.

Items are discrete stimuli presented to participants or models for judgment collection. They are constructed from item templates and filled templates.

Attributes:

Name Type Description
item_template_id UUID

UUID of the item template this was constructed from.

filled_template_refs list[UUID]

UUIDs of filled templates used in this item.

rendered_elements dict[str, str]

Rendered text for each element (by element_name).

options list[str]

Choice options for forced_choice/multi_select tasks. Each string is one option text. Order matters (first option is displayed first).

unfilled_slots list[UnfilledSlot]

Unfilled slots for cloze tasks (UI widgets inferred from constraints).

model_outputs list[ModelOutput]

All model computations for this item.

constraint_satisfaction dict[UUID, bool]

Constraint UUIDs mapped to satisfaction status.

item_metadata dict[str, MetadataValue]

Additional metadata for this item.

spans list[Span]

Span annotations for this item (default: empty).

span_relations list[SpanRelation]

Relations between spans, directed or undirected (default: empty).

tokenized_elements dict[str, list[str]]

Tokenized text for span indexing, keyed by element name (default: empty).

token_space_after dict[str, list[bool]]

Per-token space_after flags for artifact-free rendering (default: empty).

Examples:

>>> # Simple item
>>> item = Item(
...     item_template_id=UUID("..."),
...     filled_template_refs=[UUID("...")],
...     rendered_elements={"sentence": "The cat broke the vase"}
... )
>>> # Forced-choice item with options
>>> fc_item = Item(
...     item_template_id=UUID("..."),
...     options=["The cat sat on the mat.", "The cats sat on the mat."],
...     item_metadata={"n_options": 2}
... )
>>> # Cloze item with unfilled slots
>>> cloze_item = Item(
...     item_template_id=UUID("..."),
...     rendered_elements={"sentence": "The ___ cat ___ the ___"},
...     unfilled_slots=[
...         UnfilledSlot(slot_name="determiner", position=0, constraint_ids=[...]),
...         UnfilledSlot(slot_name="verb", position=2, constraint_ids=[...])
...     ]
... )

validate_span_relations() -> Item

Validate all span_relations reference valid span_ids from spans.

Returns:

Type Description
Item

Validated item.

Raises:

Type Description
ValueError

If a relation references a span_id not present in spans.

get_model_output(model_name: str, operation: str, inputs: dict[str, MetadataValue] | None = None) -> ModelOutput | None

Get a specific model output.

Parameters:

Name Type Description Default
model_name str

Name of the model.

required
operation str

Operation type.

required
inputs dict[str, MetadataValue] | None

Optional input filter.

None

Returns:

Type Description
ModelOutput | None

The model output if found, None otherwise.

Examples:

>>> output = item.get_model_output("gpt2", "log_probability")
>>> if output:
...     print(f"Log prob: {output.output}")
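The optional inputs filter disambiguates when the same model and operation were run on different inputs; a short sketch reusing the ModelOutput example above:

>>> output = item.get_model_output(
...     "gpt2",
...     "log_probability",
...     inputs={"text": "The cat broke the vase"}
... )
>>> if output is not None:
...     print(f"Log prob: {output.output}")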

add_model_output(output: ModelOutput) -> None

Add a model output to this item.

Parameters:

Name Type Description Default
output ModelOutput

Model output to add.

required

Examples:

>>> item.add_model_output(my_output)
>>> print(f"Item now has {len(item.model_outputs)} model outputs")

ItemCollection

Bases: BeadBaseModel

A collection of constructed items.

Attributes:

Name Type Description
name str

Name of this collection.

source_template_collection_id UUID

UUID of the source item template collection.

source_filled_collection_id UUID

UUID of the source filled template collection.

items list[Item]

The constructed items.

construction_stats dict[str, int]

Statistics about item construction.

Examples:

>>> collection = ItemCollection(
...     name="acceptability_items",
...     source_template_collection_id=UUID("..."),
...     source_filled_collection_id=UUID("...")
... )
>>> collection.add_item(item)

validate_name(v: str) -> str classmethod

Validate collection name is not empty.

Parameters:

Name Type Description Default
v str

Collection name to validate.

required

Returns:

Type Description
str

Validated collection name.

Raises:

Type Description
ValueError

If name is empty or contains only whitespace.

add_item(item: Item) -> None

Add an item to the collection.

Parameters:

Name Type Description Default
item Item

Item to add.

required

Examples:

>>> collection.add_item(my_item)
>>> print(f"Collection now has {len(collection.items)} items")

item_template

Data models for experimental item templates.

ChunkingSpec

Bases: BeadBaseModel

Specification for text segmentation in incremental presentation.

Defines how to segment text for self-paced reading or timed sequence presentation. Supports character-level, word-level, sentence-level, constituent-based (with parsing), or custom boundary segmentation.

Attributes:

Name Type Description
unit ChunkingUnit

Segmentation unit type. Defaults to "word".

parse_type ParseType | None

Type of parsing for constituent chunking ("constituency" or "dependency").

constituent_labels list[str] | None

Labels for constituent chunking. For constituency parsing, these are constituent types (e.g., ["NP", "VP", "S"]). For dependency parsing, these are dependency relations (e.g., ["nsubj", "dobj", "root"]).

parser Literal['stanza', 'spacy'] | None

Parser library to use for constituent chunking.

parse_language str | None

ISO 639 language code for parser (e.g., "en", "es", "zh").

custom_boundaries list[int] | None

Token indices for custom chunking boundaries.

Examples:

>>> # Word-by-word chunking
>>> ChunkingSpec(unit="word")
>>> # Chunk by noun phrases (constituency)
>>> ChunkingSpec(
...     unit="constituent",
...     parse_type="constituency",
...     constituent_labels=["NP"],
...     parser="stanza",
...     parse_language="en"
... )
>>> # Chunk by subjects and objects (dependency)
>>> ChunkingSpec(
...     unit="constituent",
...     parse_type="dependency",
...     constituent_labels=["nsubj", "dobj"],
...     parser="spacy",
...     parse_language="en"
... )
>>> # Custom boundaries at specific token positions
>>> ChunkingSpec(unit="custom", custom_boundaries=[0, 3, 7, 10])

TimingParams

Bases: BeadBaseModel

Timing parameters for stimulus presentation.

Defines timing constraints for timed sequence presentations, including per-chunk duration, inter-stimulus intervals, and response timeouts.

Attributes:

Name Type Description
duration_ms int | None

Duration in milliseconds to display each chunk (for timed sequences).

isi_ms int | None

Inter-stimulus interval in milliseconds between chunks.

timeout_ms int | None

Maximum time in milliseconds to wait for response.

mask_char str | None

Character to use for masking non-current chunks (e.g., "_").

cumulative bool

If True, show all previous chunks; if False, show only current chunk.

Examples:

>>> # RSVP (Rapid Serial Visual Presentation)
>>> TimingParams(
...     duration_ms=250,
...     isi_ms=50,
...     cumulative=False,
...     mask_char="_"
... )
>>> # Self-paced with timeout
>>> TimingParams(timeout_ms=5000, cumulative=True)

TaskSpec

Bases: BeadBaseModel

Parameters for the response collection task.

Specifies task-specific parameters such as prompts, options, scale bounds, and validation rules. Which parameters apply depends on the task_type specified in the ItemTemplate; task_type itself is not a field here because it belongs to the ItemTemplate structure.

Attributes:

Name Type Description
prompt str

Question or instruction shown to participants.

scale_bounds tuple[int, int] | None

Min and max values for ordinal_scale task.

scale_labels dict[int, str] | None

Optional labels for specific scale points (ordinal_scale).

options list[str] | None

Available options for forced_choice, multi_select, or categorical tasks. For forced_choice/multi_select: element names to choose from. For categorical: category labels.

min_selections int | None

Minimum number of selections required (multi_select only).

max_selections int | None

Maximum number of selections allowed (multi_select only).

text_validation_pattern str | None

Regular expression pattern for validating free_text responses.

max_length int | None

Maximum character length for free_text responses.

span_spec SpanSpec | None

Span labeling specification (for span_labeling tasks or composite tasks with span overlays).

Examples:

>>> # Ordinal scale task (e.g., acceptability rating)
>>> TaskSpec(
...     prompt="How natural does this sentence sound?",
...     scale_bounds=(1, 7),
...     scale_labels={1: "Very unnatural", 7: "Very natural"}
... )
>>> # Categorical task (e.g., NLI)
>>> TaskSpec(
...     prompt="What is the relationship?",
...     options=["Entailment", "Neutral", "Contradiction"]
... )
>>> # Binary task
>>> TaskSpec(
...     prompt="Is this sentence grammatical?"
... )
>>> # Forced choice task (e.g., minimal pair)
>>> TaskSpec(
...     prompt="Which sounds more natural?",
...     options=["sentence_a", "sentence_b"]
... )
>>> # Multi-select task (e.g., select all grammatical)
>>> TaskSpec(
...     prompt="Select all grammatical sentences:",
...     options=["sent_a", "sent_b", "sent_c"],
...     min_selections=1
... )
>>> # Free text task
>>> TaskSpec(
...     prompt="Who performed the action?",
...     max_length=50
... )

validate_prompt(v: str) -> str classmethod

Validate prompt is not empty.

Parameters:

Name Type Description Default
v str

Prompt to validate.

required

Returns:

Type Description
str

Validated prompt.

Raises:

Type Description
ValueError

If prompt is empty or contains only whitespace.

PresentationSpec

Bases: BeadBaseModel

Specification of stimulus presentation method.

Defines how stimuli are displayed to participants (static, self-paced, or timed sequence), including segmentation and timing parameters. It is kept separate from the judgment specification to maintain a clean separation of concerns.

Attributes:

Name Type Description
mode PresentationMode

Presentation mode (static, self_paced, or timed_sequence). Defaults to "static".

chunking ChunkingSpec

Chunking specification for incremental presentations. Defaults to word-level chunking.

timing TimingParams

Timing parameters for timed presentations. Defaults to cumulative display with no fixed durations.

display_format dict[str, str | int | float | bool]

Additional display formatting options.

tokenizer_config TokenizerConfig | None

Display tokenizer configuration for span annotation. When set, controls how text is tokenized for span indexing and display.

Examples:

>>> # Static presentation (default)
>>> PresentationSpec()
>>> # Self-paced word-by-word reading
>>> PresentationSpec(
...     mode="self_paced",
...     chunking=ChunkingSpec(unit="word")
... )
>>> # Self-paced by noun phrases
>>> PresentationSpec(
...     mode="self_paced",
...     chunking=ChunkingSpec(
...         unit="constituent",
...         parse_type="constituency",
...         constituent_labels=["NP"],
...         parser="stanza",
...         parse_language="en"
...     )
... )
>>> # RSVP (timed sequence)
>>> PresentationSpec(
...     mode="timed_sequence",
...     chunking=ChunkingSpec(unit="word"),
...     timing=TimingParams(duration_ms=250, isi_ms=50, cumulative=False)
... )

ItemElement

Bases: BeadBaseModel

A structured element within an item template.

ItemElements represent distinct parts of a complex item, such as context, target sentence, question, or response options. Elements can be static text or references to filled templates.

Attributes:

Name Type Description
element_type ElementRefType

Type of element ("text" or "filled_template_ref").

element_name str

Unique name for this element within the item.

content str | None

Static text content (for text elements).

filled_template_ref_id UUID | None

UUID of filled template (for reference elements).

element_metadata dict[str, MetadataValue]

Additional element-specific metadata.

order int | None

Display order for this element (optional).

Examples:

>>> # Text element
>>> context = ItemElement(
...     element_type="text",
...     element_name="context",
...     content="Mary loves books.",
...     order=1
... )
>>> # Template reference element
>>> target = ItemElement(
...     element_type="filled_template_ref",
...     element_name="target",
...     filled_template_ref_id=UUID("..."),
...     order=2
... )

is_text: bool property

Check if this is a text element.

Returns:

Type Description
bool

True if element_type is "text".

is_template_ref: bool property

Check if this references a filled template.

Returns:

Type Description
bool

True if element_type is "filled_template_ref".

validate_element_name(v: str) -> str classmethod

Validate element name is not empty.

Parameters:

Name Type Description Default
v str

Element name to validate.

required

Returns:

Type Description
str

Validated element name.

Raises:

Type Description
ValueError

If name is empty or contains only whitespace.

ItemTemplate

Bases: BeadBaseModel

Template specification for constructing experimental items.

ItemTemplate defines how to construct an experimental item with three orthogonal dimensions: what semantic property to measure (judgment_type), how to collect the response (task_type), and how to present the stimulus (presentation_spec).

This is distinct from Template (in bead.resources.structures), which defines linguistic structure. ItemTemplate defines experimental structure.

Attributes:

Name Type Description
name str

Template name (e.g., "acceptability_rating").

description str | None

Human-readable description of this item template.

judgment_type JudgmentType

Semantic property being measured (acceptability, inference, etc.).

task_type TaskType

Response collection method (forced_choice, ordinal_scale, etc.).

elements list[ItemElement]

Elements that compose this item.

constraints list[UUID]

UUIDs of constraints on items (typically model-based).

task_spec TaskSpec

Task-specific parameters (prompt, options, scale bounds, etc.).

presentation_spec PresentationSpec

Specification of how to present stimuli.

presentation_order list[str] | None

Order to present elements (by element_name).

template_metadata dict[str, MetadataValue]

Additional template metadata.

Examples:

>>> # Acceptability judgment with ordinal scale task
>>> template = ItemTemplate(
...     name="acceptability_rating",
...     judgment_type="acceptability",
...     task_type="ordinal_scale",
...     task_spec=TaskSpec(
...         prompt="How natural is this sentence?",
...         scale_bounds=(1, 7),
...         scale_labels={1: "Very unnatural", 7: "Very natural"}
...     ),
...     presentation_spec=PresentationSpec(mode="static"),
...     elements=[
...         ItemElement(
...             element_type="filled_template_ref",
...             element_name="sentence",
...             filled_template_ref_id=UUID("...")
...         )
...     ]
... )
>>> # Minimal pair: acceptability judgment with forced choice task
>>> minimal_pair = ItemTemplate(
...     name="minimal_pair",
...     judgment_type="acceptability",
...     task_type="forced_choice",
...     elements=[
...         ItemElement(
...             element_type="text", element_name="sent_a", content="Who..."
...         ),
...         ItemElement(
...             element_type="text", element_name="sent_b", content="Whom..."
...         )
...     ],
...     task_spec=TaskSpec(
...         prompt="Which sounds more natural?",
...         options=["sent_a", "sent_b"]
...     ),
...     presentation_spec=PresentationSpec(mode="static")
... )
>>> # Odd-man-out: similarity judgment with forced choice task
>>> odd_man_out = ItemTemplate(
...     name="odd_man_out",
...     judgment_type="similarity",
...     task_type="forced_choice",
...     elements=[...],  # 4 elements
...     task_spec=TaskSpec(
...         prompt="Which is most different?",
...         options=["opt_a", "opt_b", "opt_c", "opt_d"]
...     ),
...     presentation_spec=PresentationSpec(mode="static")
... )

validate_name(v: str) -> str classmethod

Validate template name is not empty.

Parameters:

Name Type Description Default
v str

Template name to validate.

required

Returns:

Type Description
str

Validated template name.

Raises:

Type Description
ValueError

If name is empty or contains only whitespace.

validate_unique_element_names(v: list[ItemElement]) -> list[ItemElement] classmethod

Validate all element names are unique within template.

Parameters:

Name Type Description Default
v list[ItemElement]

List of elements to validate.

required

Returns:

Type Description
list[ItemElement]

Validated elements.

Raises:

Type Description
ValueError

If duplicate element names found.

validate_presentation_order(v: list[str] | None, info: ValidationInfo) -> list[str] | None classmethod

Validate presentation_order matches element names.

Parameters:

Name Type Description Default
v list[str] | None

Presentation order list to validate.

required
info ValidationInfo

Pydantic validation info containing other field values.

required

Returns:

Type Description
list[str] | None

Validated presentation order.

Raises:

Type Description
ValueError

If presentation_order contains names not in elements, or is missing names from elements.
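A minimal sketch of a valid ordering (hedged: the judgment_type and task_type strings are assumed to be valid enum values, and "..." stands in for real content):

>>> template = ItemTemplate(
...     name="context_then_target",
...     judgment_type="acceptability",
...     task_type="binary",
...     task_spec=TaskSpec(prompt="Is the target sentence acceptable?"),
...     presentation_spec=PresentationSpec(),
...     elements=[
...         ItemElement(element_type="text", element_name="context", content="..."),
...         ItemElement(element_type="text", element_name="target", content="...")
...     ],
...     presentation_order=["context", "target"]
... )
>>> # presentation_order=["context"] would raise: "target" is missing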

get_element_by_name(name: str) -> ItemElement | None

Get an element by its name.

Parameters:

Name Type Description Default
name str

Element name to search for.

required

Returns:

Type Description
ItemElement | None

Element with matching name, or None if not found.

Examples:

>>> elem = template.get_element_by_name("sentence")
>>> if elem:
...     print(elem.element_type)

get_template_ref_elements() -> list[ItemElement]

Get all elements that reference filled templates.

Returns:

Type Description
list[ItemElement]

Elements with element_type="filled_template_ref".

Examples:

>>> refs = template.get_template_ref_elements()
>>> print(f"Found {len(refs)} template references")

ItemTemplateCollection

Bases: BeadBaseModel

A collection of item templates.

Attributes:

Name Type Description
name str

Name of this collection.

description str | None

Description of this collection.

templates list[ItemTemplate]

Item templates in this collection.

Examples:

>>> collection = ItemTemplateCollection(
...     name="acceptability_study",
...     description="Templates for acceptability judgments"
... )
>>> collection.add_template(template)

validate_name(v: str) -> str classmethod

Validate collection name is not empty.

Parameters:

Name Type Description Default
v str

Collection name to validate.

required

Returns:

Type Description
str

Validated collection name.

Raises:

Type Description
ValueError

If name is empty or contains only whitespace.

add_template(template: ItemTemplate) -> None

Add a template to the collection.

Parameters:

Name Type Description Default
template ItemTemplate

Template to add.

required

Examples:

>>> collection.add_template(my_template)
>>> print(f"Collection now has {len(collection.templates)} templates")

Task-Type Utilities

forced_choice

Utilities for creating N-AFC (forced-choice) experimental items.

This module provides language-agnostic utilities for creating forced-choice items where participants select from N alternatives (2AFC, 3AFC, 4AFC, etc.).

create_forced_choice_item(*options: str, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create an N-AFC (forced-choice) item from N text options.

Parameters:

Name Type Description Default
*options str

Text for each option (2 or more required).

()
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

Forced-choice item with options stored in the options field.

Raises:

Type Description
ValueError

If fewer than 2 options provided.

Examples:

>>> item = create_forced_choice_item(
...     "The cat sat on the mat.",
...     "The cats sat on the mat.",
...     metadata={"contrast": "number"}
... )
>>> item.options[0]
'The cat sat on the mat.'
>>> item.options[1]
'The cats sat on the mat.'
>>> # 4AFC item
>>> item = create_forced_choice_item(
...     "Option A text",
...     "Option B text",
...     "Option C text",
...     "Option D text"
... )
>>> len(item.options)
4

create_forced_choice_items_from_groups(items: list[Item], group_by: Callable[[Item], Any], n_alternatives: int = 2, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]

Create forced-choice items by grouping source items.

Groups items by a property, then creates all N-way combinations within each group as forced-choice items.

Parameters:

Name Type Description Default
items list[Item]

Source items to group and combine.

required
group_by Callable[[Item], Any]

Function to extract grouping key from items.

required
n_alternatives int

Number of alternatives per forced-choice item (default: 2 for 2AFC).

2
extract_text Callable[[Item], str] | None

Function to extract text from item. If None, tries common keys ("text", "sentence", "content") from rendered_elements.

None
include_group_metadata bool

Whether to include group key in item metadata.

True
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None

Returns:

Type Description
list[Item]

Forced-choice items created from groupings.

Examples:

Create 2AFC items with same verb (same-verb minimal pairs):

>>> items = [
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "She walks."},
...         item_metadata={"verb": "walk", "frame": "intransitive"}
...     ),
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "She walks the dog."},
...         item_metadata={"verb": "walk", "frame": "transitive"}
...     )
... ]
>>> fc_items = create_forced_choice_items_from_groups(
...     items,
...     group_by=lambda item: item.item_metadata["verb"],
...     n_alternatives=2
... )
>>> len(fc_items)
1
>>> fc_items[0].rendered_elements["option_a"]
'She walks.'

Create 3AFC items grouped by template:

>>> fc_items = create_forced_choice_items_from_groups(
...     items,
...     group_by=lambda item: item.item_template_id,
...     n_alternatives=3
... )

create_forced_choice_items_cross_product(group1_items: list[Item], group2_items: list[Item], n_from_group1: int = 1, n_from_group2: int = 1, *, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None, metadata_fn: Callable[[list[Item], list[Item]], dict[str, MetadataValue]] | None = None) -> list[Item]

Create forced-choice items from cross-product of two groups.

Combines n_from_group1 items from group1 with n_from_group2 items from group2 to create (n_from_group1 + n_from_group2)-AFC items.

Parameters:

Name Type Description Default
group1_items list[Item]

Items in first group.

required
group2_items list[Item]

Items in second group.

required
n_from_group1 int

Number of items to select from group1 per combination (default: 1).

1
n_from_group2 int

Number of items to select from group2 per combination (default: 1).

1
extract_text Callable[[Item], str] | None

Function to extract text from items.

None
item_template_id UUID | None

Template ID for all created items.

None
metadata_fn Callable[[list[Item], list[Item]], dict[str, MetadataValue]] | None

Function to generate metadata from (group1_items_used, group2_items_used).

None

Returns:

Type Description
list[Item]

Forced-choice items from cross-product.

Examples:

Create 2AFC items pairing grammatical with ungrammatical:

>>> grammatical = [
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "She walks."},
...         item_metadata={"grammatical": True}
...     )
... ]
>>> ungrammatical = [
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "She walk."},
...         item_metadata={"grammatical": False}
...     )
... ]
>>> fc_items = create_forced_choice_items_cross_product(
...     grammatical,
...     ungrammatical,
...     n_from_group1=1,
...     n_from_group2=1
... )
>>> len(fc_items)
1
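Selecting more than one item per group yields larger N; a hedged sketch (assuming grammatical holds at least two items, since combinations are drawn within each group):

>>> # 3AFC: two grammatical alternatives against one ungrammatical foil
>>> fc_items = create_forced_choice_items_cross_product(
...     grammatical,
...     ungrammatical,
...     n_from_group1=2,
...     n_from_group2=1
... )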

create_filtered_forced_choice_items(items: list[Item], group_by: Callable[[Item], Any], n_alternatives: int = 2, *, item_filter: Callable[[Item], bool] | None = None, group_filter: Callable[[Any, list[Item]], bool] | None = None, combination_filter: Callable[[tuple[Item, ...]], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]

Create forced-choice items with multi-level filtering.

Parameters:

Name Type Description Default
items list[Item]

Source items.

required
group_by Callable[[Item], Any]

Grouping function.

required
n_alternatives int

Number of alternatives per item.

2
item_filter Callable[[Item], bool] | None

Filter individual items before grouping.

None
group_filter Callable[[Any, list[Item]], bool] | None

Filter groups (receives group_key and group_items).

None
combination_filter Callable[[tuple[Item, ...]], bool] | None

Filter specific combinations.

None
extract_text Callable[[Item], str] | None

Text extraction function.

None
item_template_id UUID | None

Template ID for created items.

None

Returns:

Type Description
list[Item]

Filtered forced-choice items.

Examples:

>>> fc_items = create_filtered_forced_choice_items(
...     items,
...     group_by=lambda i: i.item_metadata["verb"],
...     n_alternatives=2,
...     item_filter=lambda i: i.item_metadata.get("valid", True),
...     group_filter=lambda key, items: len(items) >= 2,
...     combination_filter=lambda combo: combo[0].id != combo[1].id
... )

ordinal_scale

Utilities for creating ordinal scale experimental items.

This module provides language-agnostic utilities for creating ordinal scale items where participants rate a single stimulus on an ordered discrete scale (e.g., 1-7 Likert scale, acceptability ratings).

Integration Points
  • Active Learning: bead/active_learning/models/ordinal_scale.py
  • Simulation: bead/simulation/strategies/ordinal_scale.py
  • Deployment: bead/deployment/jspsych/ (slider or radio buttons)

create_ordinal_scale_item(text: str, scale_bounds: tuple[int, int] = (1, 7), prompt: str | None = None, scale_labels: dict[int, str] | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create an ordinal scale rating item.

Parameters:

Name Type Description Default
text str

The stimulus text to rate.

required
scale_bounds tuple[int, int]

Tuple of (min, max) for the scale. Both must be integers with min < max. Default: (1, 7) for a 7-point scale.

(1, 7)
prompt str | None

Optional question/prompt for the rating. If None, uses "Rate this item:".

None
scale_labels dict[int, str] | None

Optional labels for specific scale values (e.g., {1: "Bad", 7: "Good"}). All keys must be within [scale_min, scale_max].

None
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

Ordinal scale item with text and prompt in rendered_elements.

Raises:

Type Description
ValueError

If text is empty, if scale_bounds are invalid, or if scale_labels contain values outside scale bounds.

Examples:

>>> item = create_ordinal_scale_item(
...     text="The cat sat on the mat.",
...     scale_bounds=(1, 7),
...     prompt="How natural is this sentence?",
...     metadata={"task": "acceptability"}
... )
>>> item.rendered_elements["text"]
'The cat sat on the mat.'
>>> item.item_metadata["scale_min"]
1
>>> item.item_metadata["scale_max"]
7
>>> # 5-point Likert with labels
>>> item = create_ordinal_scale_item(
...     text="I enjoy linguistics.",
...     scale_bounds=(1, 5),
...     scale_labels={1: "Strongly Disagree", 5: "Strongly Agree"}
... )
>>> item.item_metadata["scale_labels"][1]
'Strongly Disagree'

create_ordinal_scale_items_from_texts(texts: list[str], scale_bounds: tuple[int, int] = (1, 7), prompt: str | None = None, scale_labels: dict[int, str] | None = None, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create ordinal scale items from a list of texts.

Parameters:

Name Type Description Default
texts list[str]

List of stimulus texts.

required
scale_bounds tuple[int, int]

Scale bounds (min, max) for all items.

(1, 7)
prompt str | None

The question/prompt for all items.

None
scale_labels dict[int, str] | None

Optional scale labels for all items.

None
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None
metadata_fn Callable[[str], dict[str, MetadataValue]] | None

Function to generate metadata from each text.

None

Returns:

Type Description
list[Item]

Ordinal scale items for each text.

Examples:

>>> texts = ["She walks.", "She walk.", "They walk."]
>>> items = create_ordinal_scale_items_from_texts(
...     texts,
...     scale_bounds=(1, 5),
...     prompt="How acceptable is this sentence?",
...     metadata_fn=lambda t: {"text_length": len(t)}
... )
>>> len(items)
3
>>> items[0].item_metadata["scale_min"]
1

create_ordinal_scale_items_from_groups(items: list[Item], group_by: Callable[[Item], Hashable], scale_bounds: tuple[int, int] = (1, 7), prompt: str | None = None, scale_labels: dict[int, str] | None = None, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]

Create ordinal scale items from grouped source items.

Groups items and creates one ordinal scale item per source item, preserving group information in metadata.

Parameters:

Name Type Description Default
items list[Item]

Source items to process.

required
group_by Callable[[Item], Hashable]

Function to extract grouping key from items.

required
scale_bounds tuple[int, int]

Scale bounds (min, max) for all items.

(1, 7)
prompt str | None

The question/prompt for all items.

None
scale_labels dict[int, str] | None

Optional scale labels for all items.

None
extract_text Callable[[Item], str] | None

Function to extract text from item. If None, tries common keys.

None
include_group_metadata bool

Whether to include group key in item metadata.

True
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None

Returns:

Type Description
list[Item]

Ordinal scale items from source items.

Examples:

>>> source_items = [
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "She walks."},
...         item_metadata={"verb": "walk"}
...     )
... ]
>>> ordinal_items = create_ordinal_scale_items_from_groups(
...     source_items,
...     group_by=lambda i: i.item_metadata["verb"],
...     scale_bounds=(1, 7),
...     prompt="Rate the acceptability:"
... )
>>> len(ordinal_items)
1

create_ordinal_scale_items_cross_product(texts: list[str], prompts: list[str], scale_bounds: tuple[int, int] = (1, 7), scale_labels: dict[int, str] | None = None, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create ordinal scale items from cross-product of texts and prompts.

Useful when you want to apply multiple prompts to each text.

Parameters:

Name Type Description Default
texts list[str]

List of stimulus texts.

required
prompts list[str]

List of prompts to apply.

required
scale_bounds tuple[int, int]

Scale bounds (min, max) for all items.

(1, 7)
scale_labels dict[int, str] | None

Optional scale labels for all items.

None
item_template_id UUID | None

Template ID for all created items.

None
metadata_fn Callable[[str, str], dict[str, MetadataValue]] | None

Function to generate metadata from (text, prompt).

None

Returns:

Type Description
list[Item]

Ordinal scale items from cross-product.

Examples:

>>> texts = ["The cat sat.", "The dog ran."]
>>> prompts = ["How natural is this?", "How acceptable is this?"]
>>> items = create_ordinal_scale_items_cross_product(
...     texts, prompts, scale_bounds=(1, 5)
... )
>>> len(items)
4

create_filtered_ordinal_scale_items(items: list[Item], scale_bounds: tuple[int, int] = (1, 7), prompt: str | None = None, scale_labels: dict[int, str] | None = None, *, item_filter: Callable[[Item], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]

Create ordinal scale items with filtering.

Parameters:

Name Type Description Default
items list[Item]

Source items.

required
scale_bounds tuple[int, int]

Scale bounds (min, max) for all items.

(1, 7)
prompt str | None

The question/prompt for all items.

None
scale_labels dict[int, str] | None

Optional scale labels for all items.

None
item_filter Callable[[Item], bool] | None

Filter individual items.

None
extract_text Callable[[Item], str] | None

Text extraction function.

None
item_template_id UUID | None

Template ID for created items.

None

Returns:

Type Description
list[Item]

Filtered ordinal scale items.

Examples:

>>> ordinal_items = create_filtered_ordinal_scale_items(
...     items,
...     scale_bounds=(1, 7),
...     prompt="Rate the acceptability:",
...     item_filter=lambda i: i.item_metadata.get("valid", True)
... )

create_likert_5_item(text: str, prompt: str | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a 5-point Likert scale item.

Convenience function for standard 5-point Likert scale with "Strongly Disagree" to "Strongly Agree" labels.

Parameters:

Name Type Description Default
text str

The stimulus text (statement) to rate.

required
prompt str | None

Optional prompt. If None, uses "Rate your agreement:".

None
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

5-point Likert scale item.

Examples:

>>> item = create_likert_5_item("I enjoy studying linguistics.")
>>> item.item_metadata["scale_min"]
1
>>> item.item_metadata["scale_max"]
5

create_likert_7_item(text: str, prompt: str | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a 7-point Likert scale item.

Convenience function for standard 7-point Likert scale with "Strongly Disagree" to "Strongly Agree" labels.

Parameters:

Name Type Description Default
text str

The stimulus text (statement) to rate.

required
prompt str | None

Optional prompt. If None, uses "Rate your agreement:".

None
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

7-point Likert scale item.

Examples:

>>> item = create_likert_7_item("I enjoy studying linguistics.")
>>> item.item_metadata["scale_min"]
1
>>> item.item_metadata["scale_max"]
7

binary

Utilities for creating binary experimental items.

This module provides language-agnostic utilities for creating binary items where participants make yes/no or true/false judgments about a single stimulus.

IMPORTANT: Binary tasks are semantically distinct from 2AFC tasks:
  • Binary: Absolute judgment about a single stimulus ("Is this grammatical?")
  • 2AFC: Relative choice between two stimuli ("Which is more natural?")

Integration Points
  • Active Learning: bead/active_learning/models/binary.py
  • Simulation: bead/simulation/strategies/binary.py
  • Deployment: bead/deployment/jspsych/ (binary button plugin)
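To make the contrast concrete, a side-by-side sketch using the two constructors documented on this page (the import path is assumed from this page's module layout):

>>> from bead.items.forced_choice import create_forced_choice_item
>>> # Binary: absolute judgment about one stimulus
>>> binary = create_binary_item(
...     "She walk.",
...     prompt="Is this sentence grammatical?"
... )
>>> # 2AFC: relative choice between two stimuli
>>> afc = create_forced_choice_item(
...     "She walks.",
...     "She walk."
... )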

create_binary_item(text: str, prompt: str = 'Yes/No?', binary_options: tuple[str, str] = ('yes', 'no'), item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a binary judgment item for a single stimulus.

Parameters:

Name Type Description Default
text str

The stimulus text to judge.

required
prompt str

The question/prompt for the judgment (default: "Yes/No?").

'Yes/No?'
binary_options tuple[str, str]

The two response options (default: ("yes", "no")). Can also be ("true", "false"), ("acceptable", "unacceptable"), etc.

('yes', 'no')
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

Binary item with text and prompt in rendered_elements.

Raises:

Type Description
ValueError

If text is empty or if binary_options doesn't have exactly 2 values.

Examples:

>>> item = create_binary_item(
...     "The cat sat on the mat.",
...     prompt="Is this sentence grammatical?",
...     metadata={"judgment": "grammaticality"}
... )
>>> item.rendered_elements["text"]
'The cat sat on the mat.'
>>> item.rendered_elements["prompt"]
'Is this sentence grammatical?'
>>> item.item_metadata["binary_options"]
['yes', 'no']
>>> # Truth value judgment
>>> item = create_binary_item(
...     "The sky is blue.",
...     prompt="Is this statement true?",
...     binary_options=("true", "false")
... )
>>> item.item_metadata["binary_options"]
['true', 'false']

create_binary_items_from_texts(texts: list[str], prompt: str, binary_options: tuple[str, str] = ('yes', 'no'), *, item_template_id: UUID | None = None, metadata_fn: Callable[[str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create binary items from a list of texts with the same prompt.

Parameters:

Name Type Description Default
texts list[str]

List of stimulus texts.

required
prompt str

The question/prompt for all items.

required
binary_options tuple[str, str]

The two response options (default: ("yes", "no")).

('yes', 'no')
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None
metadata_fn Callable[[str], dict[str, MetadataValue]] | None

Function to generate metadata from each text.

None

Returns:

Type Description
list[Item]

Binary items for each text.

Examples:

>>> texts = [
...     "She walks.",
...     "She walk.",
...     "They walk.",
...     "They walks."
... ]
>>> items = create_binary_items_from_texts(
...     texts,
...     prompt="Is this sentence grammatical?",
...     binary_options=("yes", "no")
... )
>>> len(items)
4
>>> items[0].rendered_elements["text"]
'She walks.'

create_binary_items_with_context(contexts: list[str], targets: list[str], prompt: str, binary_options: tuple[str, str] = ('yes', 'no'), *, context_label: str = 'Context', target_label: str = 'Statement', item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create binary items with context + target structure.

Useful for judgments like "Given context X, is statement Y true?".

Parameters:

Name Type Description Default
contexts list[str]

Context texts (same length as targets).

required
targets list[str]

Target texts to judge given context.

required
prompt str

The question/prompt for the judgment.

required
binary_options tuple[str, str]

The two response options (default: ("yes", "no")).

('yes', 'no')
context_label str

Label for context in rendered text (default: "Context").

'Context'
target_label str

Label for target in rendered text (default: "Statement").

'Statement'
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None
metadata_fn Callable[[str, str], dict[str, MetadataValue]] | None

Function to generate metadata from (context, target).

None

Returns:

Type Description
list[Item]

Binary items with context + target structure.

Raises:

Type Description
ValueError

If contexts and targets have different lengths.

Examples:

>>> contexts = ["The dog barked loudly."]
>>> targets = ["The dog made a sound."]
>>> items = create_binary_items_with_context(
...     contexts,
...     targets,
...     prompt="Is the statement true given the context?",
...     binary_options=("true", "false")
... )
>>> len(items)
1
>>> "Context:" in items[0].rendered_elements["text"]
True

create_binary_items_from_groups(items: list[Item], group_by: Callable[[Item], Hashable], prompt: str, binary_options: tuple[str, str] = ('yes', 'no'), *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]

Create binary items from grouped source items.

Groups items and creates one binary item per source item, preserving group information in metadata.

Parameters:

Name Type Description Default
items list[Item]

Source items to process.

required
group_by Callable[[Item], Hashable]

Function to extract grouping key from items.

required
prompt str

The question/prompt for all items.

required
binary_options tuple[str, str]

The two response options (default: ("yes", "no")).

('yes', 'no')
extract_text Callable[[Item], str] | None

Function to extract text from item. If None, tries common keys.

None
include_group_metadata bool

Whether to include group key in item metadata.

True
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None

Returns:

Type Description
list[Item]

Binary items from source items.

Examples:

>>> source_items = [
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "She walks."},
...         item_metadata={"verb": "walk"}
...     ),
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "She runs."},
...         item_metadata={"verb": "run"}
...     )
... ]
>>> binary_items = create_binary_items_from_groups(
...     source_items,
...     group_by=lambda i: i.item_metadata["verb"],
...     prompt="Is this sentence grammatical?"
... )
>>> len(binary_items)
2

create_binary_items_cross_product(texts: list[str], prompts: list[str], binary_options: tuple[str, str] = ('yes', 'no'), *, item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create binary items from cross-product of texts and prompts.

Useful when you want to apply multiple prompts to each text.

Parameters:

Name Type Description Default
texts list[str]

List of stimulus texts.

required
prompts list[str]

List of prompts to apply.

required
binary_options tuple[str, str]

The two response options (default: ("yes", "no")).

('yes', 'no')
item_template_id UUID | None

Template ID for all created items.

None
metadata_fn Callable[[str, str], dict[str, MetadataValue]] | None

Function to generate metadata from (text, prompt).

None

Returns:

Type Description
list[Item]

Binary items from cross-product.

Examples:

>>> texts = ["The cat sat.", "The dog ran."]
>>> prompts = ["Is this grammatical?", "Is this natural?"]
>>> items = create_binary_items_cross_product(texts, prompts)
>>> len(items)
4

create_filtered_binary_items(items: list[Item], prompt: str, binary_options: tuple[str, str] = ('yes', 'no'), *, item_filter: Callable[[Item], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]

Create binary items with filtering.

Parameters:

Name Type Description Default
items list[Item]

Source items.

required
prompt str

The question/prompt for all items.

required
binary_options tuple[str, str]

The two response options (default: ("yes", "no")).

('yes', 'no')
item_filter Callable[[Item], bool] | None

Filter individual items.

None
extract_text Callable[[Item], str] | None

Text extraction function.

None
item_template_id UUID | None

Template ID for created items.

None

Returns:

Type Description
list[Item]

Filtered binary items.

Examples:

>>> binary_items = create_filtered_binary_items(
...     items,
...     prompt="Is this grammatical?",
...     item_filter=lambda i: i.item_metadata.get("valid", True)
... )

categorical

Utilities for creating categorical experimental items.

This module provides language-agnostic utilities for creating categorical items where participants select from N unordered categories (e.g., NLI labels, POS tags, semantic relations).

Integration Points
  • Active Learning: bead/active_learning/models/categorical.py
  • Simulation: bead/simulation/strategies/categorical.py
  • Deployment: bead/deployment/jspsych/ (dropdown or radio buttons)

create_categorical_item(text: str, categories: list[str], prompt: str | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a categorical classification item.

Parameters:

Name Type Description Default
text str

The stimulus text to classify.

required
categories list[str]

List of category labels (unordered). Must have at least 2 categories.

required
prompt str | None

Optional question/prompt for the classification. If None, uses "Select a category:".

None
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

Categorical item with text and prompt in rendered_elements.

Raises:

Type Description
ValueError

If text is empty or if fewer than 2 categories provided.

Examples:

>>> item = create_categorical_item(
...     text="Premise: All dogs bark. Hypothesis: Some dogs bark.",
...     categories=["entailment", "neutral", "contradiction"],
...     prompt="What is the relationship?",
...     metadata={"task": "nli"}
... )
>>> item.rendered_elements["text"]
'Premise: All dogs bark. Hypothesis: Some dogs bark.'
>>> item.rendered_elements["prompt"]
'What is the relationship?'
>>> item.item_metadata["categories"]
['entailment', 'neutral', 'contradiction']
>>> # POS tagging
>>> item = create_categorical_item(
...     text="The cat sat on the mat.",
...     categories=["noun", "verb", "adjective", "determiner", "preposition"],
...     prompt="What is the part of speech of 'cat'?"
... )
>>> len(item.item_metadata["categories"])
5

create_nli_item(premise: str, hypothesis: str, categories: list[str] | None = None, prompt: str | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a Natural Language Inference (NLI) item.

Specialized helper for NLI tasks with automatic formatting and default categories.

Parameters:

Name Type Description Default
premise str

The premise text.

required
hypothesis str

The hypothesis text.

required
categories list[str] | None

Category labels. If None, uses ["entailment", "neutral", "contradiction"].

None
prompt str | None

Question/prompt. If None, uses "What is the relationship?".

None
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

NLI categorical item.

Examples:

>>> item = create_nli_item(
...     premise="All dogs bark.",
...     hypothesis="Some dogs bark."
... )
>>> "Premise:" in item.rendered_elements["text"]
True
>>> "Hypothesis:" in item.rendered_elements["text"]
True
>>> item.item_metadata["categories"]
['entailment', 'neutral', 'contradiction']
>>> item.item_metadata["premise"]
'All dogs bark.'
>>> # Custom categories
>>> item = create_nli_item(
...     premise="The cat is on the mat.",
...     hypothesis="There is an animal on the mat.",
...     categories=["entails", "contradicts", "neither"]
... )
>>> item.item_metadata["categories"]
['entails', 'contradicts', 'neither']

create_categorical_items_from_texts(texts: list[str], categories: list[str], prompt: str | None = None, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create categorical items from a list of texts with the same categories.

Parameters:

Name Type Description Default
texts list[str]

List of stimulus texts.

required
categories list[str]

Category labels for all items.

required
prompt str | None

The question/prompt for all items.

None
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None
metadata_fn Callable[[str], dict[str, MetadataValue]] | None

Function to generate metadata from each text.

None

Returns:

Type Description
list[Item]

Categorical items for each text.

Examples:

>>> texts = ["The cat sat.", "The dog ran.", "The bird flew."]
>>> categories = ["past", "present", "future"]
>>> items = create_categorical_items_from_texts(
...     texts,
...     categories=categories,
...     prompt="What is the tense?"
... )
>>> len(items)
3
>>> items[0].item_metadata["categories"]
['past', 'present', 'future']

create_categorical_items_from_pairs(pairs: list[tuple[str, str]], categories: list[str], prompt: str | None = None, *, pair_label1: str = 'Text 1', pair_label2: str = 'Text 2', item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create categorical items from pairs of texts.

Useful for NLI, paraphrase detection, semantic similarity, etc.

Parameters:

Name Type Description Default
pairs list[tuple[str, str]]

List of (text1, text2) pairs.

required
categories list[str]

Category labels for all items.

required
prompt str | None

The question/prompt for all items.

None
pair_label1 str

Label for first text in pair (default: "Text 1").

'Text 1'
pair_label2 str

Label for second text in pair (default: "Text 2").

'Text 2'
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None
metadata_fn Callable[[str, str], dict[str, MetadataValue]] | None

Function to generate metadata from (text1, text2).

None

Returns:

Type Description
list[Item]

Categorical items from pairs.

Examples:

>>> pairs = [
...     ("All dogs bark.", "Some dogs bark."),
...     ("The sky is blue.", "The sky is not blue.")
... ]
>>> items = create_categorical_items_from_pairs(
...     pairs,
...     categories=["entailment", "neutral", "contradiction"],
...     prompt="What is the relationship?",
...     pair_label1="Premise",
...     pair_label2="Hypothesis"
... )
>>> len(items)
2
>>> "Premise:" in items[0].rendered_elements["text"]
True

create_categorical_items_from_groups(items: list[Item], group_by: Callable[[Item], Hashable], categories: list[str], prompt: str | None = None, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]

Create categorical items from grouped source items.

Groups items and creates one categorical item per source item, preserving group information in metadata.

Parameters:

Name Type Description Default
items list[Item]

Source items to process.

required
group_by Callable[[Item], Hashable]

Function to extract grouping key from items.

required
categories list[str]

Category labels for all items.

required
prompt str | None

The question/prompt for all items.

None
extract_text Callable[[Item], str] | None

Function to extract text from item. If None, tries common keys.

None
include_group_metadata bool

Whether to include group key in item metadata.

True
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None

Returns:

Type Description
list[Item]

Categorical items from source items.

Examples:

>>> source_items = [
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "The cat sat."},
...         item_metadata={"tense": "past"}
...     ),
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "The dog runs."},
...         item_metadata={"tense": "present"}
...     )
... ]
>>> categorical_items = create_categorical_items_from_groups(
...     source_items,
...     group_by=lambda i: i.item_metadata["tense"],
...     categories=["past", "present", "future"],
...     prompt="What is the tense?"
... )
>>> len(categorical_items)
2

create_categorical_items_cross_product(texts: list[str], prompts: list[str], categories: list[str], *, item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create categorical items from cross-product of texts and prompts.

Useful when you want to apply multiple prompts to each text.

Parameters:

Name Type Description Default
texts list[str]

List of stimulus texts.

required
prompts list[str]

List of prompts to apply.

required
categories list[str]

Category labels for all items.

required
item_template_id UUID | None

Template ID for all created items.

None
metadata_fn Callable[[str, str], dict[str, MetadataValue]] | None

Function to generate metadata from (text, prompt).

None

Returns:

Type Description
list[Item]

Categorical items from cross-product.

Examples:

>>> texts = ["The cat sat.", "The dog ran."]
>>> prompts = ["What is the tense?", "What is the aspect?"]
>>> categories = ["past", "present", "future"]
>>> items = create_categorical_items_cross_product(
...     texts, prompts, categories
... )
>>> len(items)
4

create_filtered_categorical_items(items: list[Item], categories: list[str], prompt: str | None = None, *, item_filter: Callable[[Item], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]

Create categorical items with filtering.

Parameters:

Name Type Description Default
items list[Item]

Source items.

required
categories list[str]

Category labels for all items.

required
prompt str | None

The question/prompt for all items.

None
item_filter Callable[[Item], bool] | None

Filter individual items.

None
extract_text Callable[[Item], str] | None

Text extraction function.

None
item_template_id UUID | None

Template ID for created items.

None

Returns:

Type Description
list[Item]

Filtered categorical items.

Examples:

>>> categorical_items = create_filtered_categorical_items(
...     items,
...     categories=["past", "present", "future"],
...     prompt="What is the tense?",
...     item_filter=lambda i: i.item_metadata.get("valid", True)
... )

multi_select

Utilities for creating multi-select experimental items.

This module provides language-agnostic utilities for creating multi-select items where participants select one or more options from a set (checkboxes).

Integration Points
  • Active Learning: bead/active_learning/models/multi_select.py
  • Simulation: bead/simulation/strategies/multi_select.py
  • Deployment: bead/deployment/jspsych/ (checkbox plugin)

create_multi_select_item(*options: str, min_selections: int = 1, max_selections: int | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a multi-select item from N text options.

Parameters:

Name Type Description Default
*options str

Text for each option (2 or more required).

()
min_selections int

Minimum number of options that must be selected (default: 1).

1
max_selections int | None

Maximum number of options that can be selected. If None, defaults to the number of options (i.e., no effective upper limit).

None
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

Multi-select item with options stored in the options field.

Raises:

Type Description
ValueError

If fewer than 2 options are provided, if min_selections < 1, if min_selections > max_selections, or if max_selections exceeds the number of options.

Examples:

>>> item = create_multi_select_item(
...     "She walks.",
...     "She walk.",
...     "They walks.",
...     "They walk.",
...     min_selections=1,
...     max_selections=4,
...     metadata={"task": "select_grammatical"}
... )
>>> item.options[0]
'She walks.'
>>> item.item_metadata["min_selections"]
1
>>> item.item_metadata["max_selections"]
4
>>> # Multi-select with default max (all options)
>>> item = create_multi_select_item(
...     "Option A",
...     "Option B",
...     "Option C"
... )
>>> item.item_metadata["max_selections"]
3

create_multi_select_items_from_groups(items: list[Item], group_by: Callable[[Item], Any], n_options: int | None = None, min_selections: int = 1, max_selections: int | None = None, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]

Create multi-select items by grouping source items.

Groups items by a property, then creates multi-select items using each group's items as the options.

Parameters:

Name Type Description Default
items list[Item]

Source items to group and combine.

required
group_by Callable[[Item], Any]

Function to extract grouping key from items.

required
n_options int | None

Number of options per multi-select item. If None, uses all items in each group.

None
min_selections int

Minimum number of selections required (default: 1).

1
max_selections int | None

Maximum number of selections allowed. If None, defaults to n_options.

None
extract_text Callable[[Item], str] | None

Function to extract text from item. If None, tries common keys ("text", "sentence", "content") from rendered_elements.

None
include_group_metadata bool

Whether to include group key in item metadata.

True
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None

Returns:

Type Description
list[Item]

Multi-select items created from groupings.

Examples:

Create multi-select items grouped by verb (select all acceptable frames):

>>> items = [
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "She walks."},
...         item_metadata={"verb": "walk", "frame": "intransitive"}
...     ),
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "She walks the dog."},
...         item_metadata={"verb": "walk", "frame": "transitive"}
...     ),
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "She walks to school."},
...         item_metadata={"verb": "walk", "frame": "intransitive_pp"}
...     )
... ]
>>> ms_items = create_multi_select_items_from_groups(
...     items,
...     group_by=lambda item: item.item_metadata["verb"],
...     min_selections=1,
...     max_selections=3
... )
>>> len(ms_items)
1
>>> len(ms_items[0].options)
3

create_multi_select_items_with_foils(correct_items: list[Item], foil_items: list[Item], n_correct: int = 2, n_foils: int = 2, *, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None, metadata_fn: Callable[[list[Item], list[Item]], dict[str, MetadataValue]] | None = None) -> list[Item]

Create multi-select items by combining correct items with foils.

Useful for tasks like "Select all grammatical sentences" where some options are correct and others are foils (distractors).

Parameters:

Name Type Description Default
correct_items list[Item]

Items that are correct (should be selected).

required
foil_items list[Item]

Items that are foils/distractors (should not be selected).

required
n_correct int

Number of correct items to include per multi-select item (default: 2).

2
n_foils int

Number of foil items to include per multi-select item (default: 2).

2
extract_text Callable[[Item], str] | None

Function to extract text from items.

None
item_template_id UUID | None

Template ID for all created items.

None
metadata_fn Callable[[list[Item], list[Item]], dict[str, MetadataValue]] | None

Function to generate metadata from (correct_items_used, foil_items_used).

None

Returns:

Type Description
list[Item]

Multi-select items with correct items and foils.

Examples:

>>> grammatical = [
...     Item(uuid4(), rendered_elements={"text": "She walks."},
...          item_metadata={"grammatical": True}),
...     Item(uuid4(), rendered_elements={"text": "They walk."},
...          item_metadata={"grammatical": True})
... ]
>>> ungrammatical = [
...     Item(uuid4(), rendered_elements={"text": "She walk."},
...          item_metadata={"grammatical": False}),
...     Item(uuid4(), rendered_elements={"text": "They walks."},
...          item_metadata={"grammatical": False})
... ]
>>> ms_items = create_multi_select_items_with_foils(
...     grammatical,
...     ungrammatical,
...     n_correct=2,
...     n_foils=2
... )
>>> len(ms_items)
1
>>> ms_items[0].item_metadata["min_selections"]
1
>>> ms_items[0].item_metadata["max_selections"]
4

create_multi_select_items_cross_product(group1_items: list[Item], group2_items: list[Item], n_from_group1: int = 1, n_from_group2: int = 1, min_selections: int = 1, max_selections: int | None = None, *, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None, metadata_fn: Callable[[list[Item], list[Item]], dict[str, MetadataValue]] | None = None) -> list[Item]

Create multi-select items from cross-product of two groups.

Combines n_from_group1 items from group1 with n_from_group2 items from group2 to create multi-select items with (n_from_group1 + n_from_group2) options.

Parameters:

Name Type Description Default
group1_items list[Item]

Items in first group.

required
group2_items list[Item]

Items in second group.

required
n_from_group1 int

Number of items to select from group1 per combination (default: 1).

1
n_from_group2 int

Number of items to select from group2 per combination (default: 1).

1
min_selections int

Minimum number of selections required (default: 1).

1
max_selections int | None

Maximum number of selections allowed. If None, defaults to total options.

None
extract_text Callable[[Item], str] | None

Function to extract text from items.

None
item_template_id UUID | None

Template ID for all created items.

None
metadata_fn Callable[[list[Item], list[Item]], dict[str, MetadataValue]] | None

Function to generate metadata from (group1_items_used, group2_items_used).

None

Returns:

Type Description
list[Item]

Multi-select items from cross-product.

Examples:

>>> active = [Item(uuid4(), rendered_elements={"text": "She walks."})]
>>> passive = [Item(uuid4(), rendered_elements={"text": "She is walked."})]
>>> ms_items = create_multi_select_items_cross_product(
...     active, passive,
...     n_from_group1=1,
...     n_from_group2=1,
...     min_selections=1,
...     max_selections=2
... )
>>> len(ms_items)
1

create_filtered_multi_select_items(items: list[Item], group_by: Callable[[Item], Any], n_options: int | None = None, min_selections: int = 1, max_selections: int | None = None, *, item_filter: Callable[[Item], bool] | None = None, group_filter: Callable[[Any, list[Item]], bool] | None = None, combination_filter: Callable[[tuple[Item, ...]], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]

Create multi-select items with multi-level filtering.

Parameters:

Name Type Description Default
items list[Item]

Source items.

required
group_by Callable[[Item], Any]

Grouping function.

required
n_options int | None

Number of options per item. If None, uses all items in each group.

None
min_selections int

Minimum number of selections required.

1
max_selections int | None

Maximum number of selections allowed.

None
item_filter Callable[[Item], bool] | None

Filter individual items before grouping.

None
group_filter Callable[[Any, list[Item]], bool] | None

Filter groups (receives group_key and group_items).

None
combination_filter Callable[[tuple[Item, ...]], bool] | None

Filter specific combinations.

None
extract_text Callable[[Item], str] | None

Text extraction function.

None
item_template_id UUID | None

Template ID for created items.

None

Returns:

Type Description
list[Item]

Filtered multi-select items.

Examples:

>>> ms_items = create_filtered_multi_select_items(
...     items,
...     group_by=lambda i: i.item_metadata["verb"],
...     n_options=3,
...     item_filter=lambda i: i.item_metadata.get("valid", True),
...     group_filter=lambda key, items: len(items) >= 3,
...     min_selections=1,
...     max_selections=3
... )

magnitude

Utilities for creating magnitude experimental items.

This module provides language-agnostic utilities for creating magnitude items where participants enter numeric values (bounded or unbounded), such as reading times, confidence ratings, or counts.

Integration Points
  • Active Learning: bead/active_learning/models/magnitude.py
  • Simulation: bead/simulation/strategies/magnitude.py
  • Deployment: bead/deployment/jspsych/ (numeric input)

create_magnitude_item(text: str, unit: str | None = None, bounds: tuple[int | float | None, int | float | None] = (None, None), prompt: str | None = None, step: int | float | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a magnitude (numeric input) item.

Parameters:

Name Type Description Default
text str

The stimulus text or question.

required
unit str | None

Optional unit for the value (e.g., "ms", "%", "count").

None
bounds tuple[int | float | None, int | float | None]

Tuple of (min, max) bounds. None means unbounded in that direction. Default: (None, None) for fully unbounded.

(None, None)
prompt str | None

Optional prompt for the numeric input. If None, uses "Enter a value:".

None
step int | float | None

Optional step size for input validation (e.g., 1 for integers, 0.01 for hundredths).

None
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

Magnitude item with text and prompt in rendered_elements.

Raises:

Type Description
ValueError

If text is empty or if both bounds are provided and min >= max.

Examples:

>>> item = create_magnitude_item(
...     text="How long did it take to read this sentence?",
...     unit="ms",
...     bounds=(0, None),
...     prompt="Enter time in milliseconds:"
... )
>>> item.rendered_elements["text"]
'How long did it take to read this sentence?'
>>> item.item_metadata["unit"]
'ms'
>>> item.item_metadata["min_value"]
0
>>> item.item_metadata["max_value"] is None
True
>>> # Confidence with bounded range
>>> item = create_magnitude_item(
...     text="How confident are you in your answer?",
...     unit="%",
...     bounds=(0, 100),
...     step=1
... )
>>> item.item_metadata["max_value"]
100

create_magnitude_items_from_texts(texts: list[str], unit: str | None = None, bounds: tuple[int | float | None, int | float | None] = (None, None), prompt: str | None = None, step: int | float | None = None, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create magnitude items from a list of texts.

Parameters:

Name Type Description Default
texts list[str]

List of stimulus texts.

required
unit str | None

Optional unit for all items.

None
bounds tuple[int | float | None, int | float | None]

Bounds (min, max) for all items.

(None, None)
prompt str | None

The question/prompt for all items.

None
step int | float | None

Step size for all items.

None
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None
metadata_fn Callable[[str], dict[str, MetadataValue]] | None

Function to generate metadata from each text.

None

Returns:

Type Description
list[Item]

Magnitude items for each text.

Examples:

>>> texts = ["Sentence 1", "Sentence 2", "Sentence 3"]
>>> items = create_magnitude_items_from_texts(
...     texts,
...     unit="ms",
...     bounds=(0, None),
...     prompt="Reading time?",
...     metadata_fn=lambda t: {"text_length": len(t)}
... )
>>> len(items)
3
>>> items[0].item_metadata["unit"]
'ms'

create_magnitude_items_from_groups(items: list[Item], group_by: Callable[[Item], Any], unit: str | None = None, bounds: tuple[int | float | None, int | float | None] = (None, None), prompt: str | None = None, step: int | float | None = None, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]

Create magnitude items from grouped source items.

Groups items and creates one magnitude item per source item, preserving group information in metadata.

Parameters:

Name Type Description Default
items list[Item]

Source items to process.

required
group_by Callable[[Item], Any]

Function to extract grouping key from items.

required
unit str | None

Optional unit for all items.

None
bounds tuple[int | float | None, int | float | None]

Bounds (min, max) for all items.

(None, None)
prompt str | None

The question/prompt for all items.

None
step int | float | None

Step size for all items.

None
extract_text Callable[[Item], str] | None

Function to extract text from item. If None, tries common keys.

None
include_group_metadata bool

Whether to include group key in item metadata.

True
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None

Returns:

Type Description
list[Item]

Magnitude items from source items.

Examples:

>>> source_items = [
...     Item(
...         uuid4(),
...         rendered_elements={"text": "The cat sat."},
...         item_metadata={"category": "simple"}
...     )
... ]
>>> magnitude_items = create_magnitude_items_from_groups(
...     source_items,
...     group_by=lambda i: i.item_metadata["category"],
...     unit="ms",
...     bounds=(0, None),
...     prompt="Reading time:"
... )
>>> len(magnitude_items)
1

create_magnitude_items_cross_product(texts: list[str], prompts: list[str], unit: str | None = None, bounds: tuple[int | float | None, int | float | None] = (None, None), step: int | float | None = None, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create magnitude items from cross-product of texts and prompts.

Useful when you want to apply multiple prompts to each text.

Parameters:

Name Type Description Default
texts list[str]

List of stimulus texts.

required
prompts list[str]

List of prompts to apply.

required
unit str | None

Optional unit for all items.

None
bounds tuple[int | float | None, int | float | None]

Bounds (min, max) for all items.

(None, None)
step int | float | None

Step size for all items.

None
item_template_id UUID | None

Template ID for all created items.

None
metadata_fn Callable[[str, str], dict[str, MetadataValue]] | None

Function to generate metadata from (text, prompt).

None

Returns:

Type Description
list[Item]

Magnitude items from cross-product.

Examples:

>>> texts = ["Sentence 1.", "Sentence 2."]
>>> prompts = ["Reading time?", "Processing time?"]
>>> items = create_magnitude_items_cross_product(
...     texts, prompts, unit="ms", bounds=(0, None)
... )
>>> len(items)
4

create_filtered_magnitude_items(items: list[Item], unit: str | None = None, bounds: tuple[int | float | None, int | float | None] = (None, None), prompt: str | None = None, step: int | float | None = None, *, item_filter: Callable[[Item], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]

Create magnitude items with filtering.

Parameters:

Name Type Description Default
items list[Item]

Source items.

required
unit str | None

Optional unit for all items.

None
bounds tuple[int | float | None, int | float | None]

Bounds (min, max) for all items.

(None, None)
prompt str | None

The question/prompt for all items.

None
step int | float | None

Step size for all items.

None
item_filter Callable[[Item], bool] | None

Filter individual items.

None
extract_text Callable[[Item], str] | None

Text extraction function.

None
item_template_id UUID | None

Template ID for created items.

None

Returns:

Type Description
list[Item]

Filtered magnitude items.

Examples:

>>> magnitude_items = create_filtered_magnitude_items(
...     items,
...     unit="ms",
...     bounds=(0, None),
...     prompt="Reading time:",
...     item_filter=lambda i: i.item_metadata.get("valid", True)
... )

create_reading_time_item(text: str, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a reading time measurement item.

Convenience function for reading time in milliseconds with a lower bound of 0 (no upper bound).

Parameters:

Name Type Description Default
text str

The text to measure reading time for.

required
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

Reading time magnitude item.

Examples:

>>> item = create_reading_time_item("The cat sat on the mat.")
>>> item.item_metadata["unit"]
'ms'
>>> item.item_metadata["min_value"]
0

create_confidence_item(text: str, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a confidence rating item (0-100%).

Convenience function for confidence percentage with bounds (0, 100).

Parameters:

Name Type Description Default
text str

The text or question to rate confidence for.

required
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

Confidence magnitude item.

Examples:

>>> item = create_confidence_item("Is this sentence grammatical?")
>>> item.item_metadata["unit"]
'%'
>>> item.item_metadata["max_value"]
100

free_text

Utilities for creating free text experimental items.

This module provides language-agnostic utilities for creating free text items where participants provide open-ended text responses (e.g., paraphrasing, question answering, cloze completion).

Integration Points
  • Active Learning: bead/active_learning/models/free_text.py
  • Simulation: bead/simulation/strategies/free_text.py
  • Deployment: bead/deployment/jspsych/ (text input or textarea)

create_free_text_item(text: str, prompt: str, max_length: int | None = None, validation_pattern: str | None = None, min_length: int | None = None, multiline: bool = False, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a free text (open-ended) item.

Parameters:

Name Type Description Default
text str

The stimulus text or context.

required
prompt str

The question/instruction for what to enter (required).

required
max_length int | None

Maximum character limit. None means unlimited.

None
validation_pattern str | None

Optional regex pattern for validation (validated at deployment).

None
min_length int | None

Minimum characters required. None means no minimum.

None
multiline bool

True for textarea (multiline), False for single-line input (default).

False
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

Free text item with text and prompt in rendered_elements.

Raises:

Type Description
ValueError

If text or prompt is empty, or if min_length > max_length.

Examples:

>>> item = create_free_text_item(
...     text="The dog chased the cat.",
...     prompt="Who chased whom?",
...     max_length=100
... )
>>> item.rendered_elements["text"]
'The dog chased the cat.'
>>> item.rendered_elements["prompt"]
'Who chased whom?'
>>> item.item_metadata["max_length"]
100
>>> # Multiline paraphrase task
>>> item = create_free_text_item(
...     text="The quick brown fox jumps over the lazy dog.",
...     prompt="Rewrite this sentence in your own words:",
...     multiline=True,
...     max_length=200
... )
>>> item.item_metadata["multiline"]
True

create_free_text_items_from_texts(texts: list[str], prompt: str, max_length: int | None = None, validation_pattern: str | None = None, min_length: int | None = None, multiline: bool = False, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create free text items from a list of texts with the same prompt.

Parameters:

Name Type Description Default
texts list[str]

List of stimulus texts.

required
prompt str

The question/instruction for all items (required).

required
max_length int | None

Maximum character limit for all items.

None
validation_pattern str | None

Optional regex pattern for validation.

None
min_length int | None

Minimum characters required.

None
multiline bool

True for textarea, False for single-line input.

False
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None
metadata_fn Callable[[str], dict[str, MetadataValue]] | None

Function to generate metadata from each text.

None

Returns:

Type Description
list[Item]

Free text items for each text.

Examples:

>>> texts = ["Sentence 1", "Sentence 2", "Sentence 3"]
>>> items = create_free_text_items_from_texts(
...     texts,
...     prompt="Paraphrase this:",
...     multiline=True,
...     max_length=200,
...     metadata_fn=lambda t: {"original_length": len(t)}
... )
>>> len(items)
3
>>> items[0].item_metadata["original_length"]
10

create_free_text_items_with_context(contexts: list[str], prompts: list[str], max_length: int | None = None, validation_pattern: str | None = None, min_length: int | None = None, multiline: bool = False, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create free text items with context + prompt pairs.

Useful for reading comprehension or question answering, where each context has a specific question.

Parameters:

Name Type Description Default
contexts list[str]

Context texts (same length as prompts).

required
prompts list[str]

Prompts/questions for each context.

required
max_length int | None

Maximum character limit for all items.

None
validation_pattern str | None

Optional regex pattern for validation.

None
min_length int | None

Minimum characters required.

None
multiline bool

True for textarea, False for single-line input.

False
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None
metadata_fn Callable[[str, str], dict[str, MetadataValue]] | None

Function to generate metadata from (context, prompt).

None

Returns:

Type Description
list[Item]

Free text items with context + prompt structure.

Raises:

Type Description
ValueError

If contexts and prompts have different lengths.

Examples:

>>> contexts = ["The cat sat on the mat."]
>>> prompts = ["What sat on the mat?"]
>>> items = create_free_text_items_with_context(
...     contexts,
...     prompts,
...     max_length=50
... )
>>> len(items)
1
>>> items[0].rendered_elements["text"]
'The cat sat on the mat.'
>>> items[0].rendered_elements["prompt"]
'What sat on the mat?'

create_free_text_items_from_groups(items: list[Item], group_by: Callable[[Item], Any], prompt: str, max_length: int | None = None, validation_pattern: str | None = None, min_length: int | None = None, multiline: bool = False, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]

Create free text items from grouped source items.

Groups items and creates one free text item per source item, preserving group information in metadata.

Parameters:

Name Type Description Default
items list[Item]

Source items to process.

required
group_by Callable[[Item], Any]

Function to extract grouping key from items.

required
prompt str

The question/instruction for all items (required).

required
max_length int | None

Maximum character limit.

None
validation_pattern str | None

Optional regex pattern for validation.

None
min_length int | None

Minimum characters required.

None
multiline bool

True for textarea, False for single-line input.

False
extract_text Callable[[Item], str] | None

Function to extract text from item. If None, tries common keys.

None
include_group_metadata bool

Whether to include group key in item metadata.

True
item_template_id UUID | None

Template ID for all created items. If None, generates one per item.

None

Returns:

Type Description
list[Item]

Free text items from source items.

Examples:

>>> source_items = [
...     Item(
...         uuid4(),
...         rendered_elements={"text": "Sentence 1"},
...         item_metadata={"type": "simple"}
...     )
... ]
>>> free_text_items = create_free_text_items_from_groups(
...     source_items,
...     group_by=lambda i: i.item_metadata["type"],
...     prompt="Paraphrase this:",
...     multiline=True
... )
>>> len(free_text_items)
1

create_free_text_items_cross_product(texts: list[str], prompts: list[str], max_length: int | None = None, validation_pattern: str | None = None, min_length: int | None = None, multiline: bool = False, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]

Create free text items from cross-product of texts and prompts.

Useful when you want to apply multiple prompts to each text.

Parameters:

Name Type Description Default
texts list[str]

List of stimulus texts.

required
prompts list[str]

List of prompts to apply.

required
max_length int | None

Maximum character limit for all items.

None
validation_pattern str | None

Optional regex pattern for validation.

None
min_length int | None

Minimum characters required.

None
multiline bool

True for textarea, False for single-line input.

False
item_template_id UUID | None

Template ID for all created items.

None
metadata_fn Callable[[str, str], dict[str, MetadataValue]] | None

Function to generate metadata from (text, prompt).

None

Returns:

Type Description
list[Item]

Free text items from cross-product.

Examples:

>>> texts = ["Sentence 1", "Sentence 2"]
>>> prompts = ["Paraphrase this:", "Summarize this:"]
>>> items = create_free_text_items_cross_product(
...     texts, prompts, multiline=True, max_length=200
... )
>>> len(items)
4

create_filtered_free_text_items(items: list[Item], prompt: str, max_length: int | None = None, validation_pattern: str | None = None, min_length: int | None = None, multiline: bool = False, *, item_filter: Callable[[Item], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]

Create free text items with filtering.

Parameters:

Name Type Description Default
items list[Item]

Source items.

required
prompt str

The question/instruction for all items (required).

required
max_length int | None

Maximum character limit.

None
validation_pattern str | None

Optional regex pattern for validation.

None
min_length int | None

Minimum characters required.

None
multiline bool

True for textarea, False for single-line input.

False
item_filter Callable[[Item], bool] | None

Filter individual items.

None
extract_text Callable[[Item], str] | None

Text extraction function.

None
item_template_id UUID | None

Template ID for created items.

None

Returns:

Type Description
list[Item]

Filtered free text items.

Examples:

>>> free_text_items = create_filtered_free_text_items(
...     items,
...     prompt="Paraphrase this:",
...     multiline=True,
...     item_filter=lambda i: i.item_metadata.get("valid", True)
... )

create_paraphrase_item(text: str, instruction: str = 'Rewrite in your own words:', item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a paraphrase generation item.

Convenience function for paraphrase tasks with multiline input.

Parameters:

Name Type Description Default
text str

The text to paraphrase.

required
instruction str

The instruction for paraphrasing (default: "Rewrite in your own words:").

'Rewrite in your own words:'
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

Paraphrase free text item.

Examples:

>>> item = create_paraphrase_item(
...     "The quick brown fox jumps over the lazy dog."
... )
>>> item.rendered_elements["prompt"]
'Rewrite in your own words:'
>>> item.item_metadata["multiline"]
True

create_wh_question_item(text: str, question_word: str = 'Who', item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a WH-question answering item.

Convenience function for WH-question answering with short text input.

Parameters:

Name Type Description Default
text str

The context/passage for the question.

required
question_word str

The question word to use (default: "Who").

'Who'
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

WH-question free text item.

Examples:

>>> item = create_wh_question_item(
...     "The dog chased the cat.",
...     question_word="What"
... )
>>> "What" in item.rendered_elements["prompt"]
True
>>> item.item_metadata["max_length"]
100

cloze

Utilities for creating cloze experimental items.

This module provides language-agnostic utilities for creating cloze items where participants fill in missing words/phrases in partially-filled templates.

SPECIAL: This is the ONLY task type that uses the Item.unfilled_slots field.

Cloze items are unique in that:
  • They use partially-filled templates with specific slots left blank
  • UI widgets are inferred from slot constraints at deployment time:
      - Extensional constraint (finite set) → dropdown
      - Intensional constraint (rules) → text input with validation
      - No constraint → free text input
  • Multiple slots can be unfilled in a single item

Integration Points
  • Active Learning: bead/active_learning/models/cloze.py
  • Simulation: bead/simulation/strategies/cloze.py
  • Deployment: bead/deployment/jspsych/ (dynamic widget generation)
  • Resources: bead/resources/template.py (Template and Slot models)

create_cloze_item(template: Any, unfilled_slot_names: list[str], filled_slots: dict[str, str] | None = None, instructions: str | None = None, *, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a cloze item from a template with specific slots unfilled.

Parameters:

Name Type Description Default
template Template

Source template with slots.

required
unfilled_slot_names list[str]

Names of slots to leave unfilled (must exist in template.slots).

required
filled_slots dict[str, str] | None

Pre-filled slots (keys must be valid slot names, disjoint from unfilled).

None
instructions str | None

Optional instructions for filling (e.g., "Fill in the verb").

None
item_template_id UUID | None

Template ID for the item. If None, generates new UUID.

None
metadata dict[str, MetadataValue] | None

Additional metadata for item_metadata field.

None

Returns:

Type Description
Item

Cloze item with unfilled_slots populated.

Raises:

Type Description
ValueError

If any name in unfilled_slot_names is missing from the template, if any key in filled_slots is missing from the template, if the unfilled and filled slot sets overlap, if no slots are left unfilled, or if validation fails.

Examples:

>>> from bead.resources.template import Template, Slot
>>> template = Template(
...     name="simple",
...     template_string="{det} {noun} {verb}.",
...     slots={
...         "det": Slot(name="det"),
...         "noun": Slot(name="noun"),
...         "verb": Slot(name="verb")
...     }
... )
>>> item = create_cloze_item(
...     template,
...     unfilled_slot_names=["verb"],
...     filled_slots={"det": "The", "noun": "cat"}
... )
>>> item.rendered_elements["text"]
'The cat ___.'
>>> len(item.unfilled_slots)
1
>>> item.unfilled_slots[0].slot_name
'verb'
>>> item.unfilled_slots[0].position
2

create_cloze_items_from_template(template: Any, n_unfilled: int = 1, strategy: str = 'all_combinations', unfilled_combinations: list[list[str]] | None = None, instructions: str | None = None, *, item_template_id: UUID | None = None, metadata_fn: Callable[[list[str]], dict[str, MetadataValue]] | None = None) -> list[Item]

Create multiple cloze items from a template, varying unfilled slots.

Parameters:

Name Type Description Default
template Template

Source template.

required
n_unfilled int

Number of slots to leave unfilled per item (default: 1).

1
strategy str

How to choose unfilled slots:
  • 'random': Randomly sample combinations
  • 'all_combinations': Generate all C(n_slots, n_unfilled) combinations
  • 'specified': Use provided list

'all_combinations'
unfilled_combinations list[list[str]] | None

For strategy='specified', list of slot name combinations to unfill.

None
instructions str | None

Instructions for all items.

None
item_template_id UUID | None

Template ID for all items.

None
metadata_fn Callable[[list[str]], dict[str, MetadataValue]] | None

Generate metadata from unfilled slot names.

None

Returns:

Type Description
list[Item]

Cloze items with varying unfilled slots.

Raises:

Type Description
ValueError

If n_unfilled is invalid, if strategy='specified' is used without unfilled_combinations, or if any combination contains invalid slot names.

Examples:

>>> # Generate all single-slot cloze items
>>> items = create_cloze_items_from_template(
...     template, n_unfilled=1, strategy='all_combinations'
... )
>>> len(items)  # One for each slot
3

create_simple_cloze_item(text: str, blank_positions: list[int], blank_labels: list[str] | None = None, instructions: str | None = None, *, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a cloze item from plain text (no template).

Replaces words at specified positions with blanks. This is a simplified helper for creating cloze items without the template infrastructure.

Parameters:

Name Type Description Default
text str

Full text with no blanks.

required
blank_positions list[int]

Word positions to blank (0-indexed).

required
blank_labels list[str] | None

Optional labels for blanks (for slot_name field). If None, uses generic labels like "blank_0", "blank_1".

None
instructions str | None

Optional instructions.

None
item_template_id UUID | None

Template ID for the item.

None
metadata dict[str, MetadataValue] | None

Additional metadata.

None

Returns:

Type Description
Item

Cloze item with text-based blanks.

Raises:

Type Description
ValueError

If any blank position is out of range or if blank_labels and blank_positions differ in length.

Examples:

>>> item = create_simple_cloze_item(
...     text="The quick brown fox",
...     blank_positions=[1],  # "quick"
...     blank_labels=["adjective"]
... )
>>> item.rendered_elements["text"]
'The ___ brown fox'
>>> item.unfilled_slots[0].slot_name
'adjective'
>>> item.unfilled_slots[0].position
1

create_cloze_items_from_groups(items: list[Item], group_by: Callable[[Item], Any], n_slots_to_unfill: int = 1, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]

Create cloze items from grouped source items.

Groups items and creates cloze items from them. If source items have template metadata, uses template-based cloze. Otherwise, falls back to simple text-based cloze.

Parameters:

Name Type Description Default
items list[Item]

Source items to group.

required
group_by Callable[[Item], Any]

Grouping function.

required
n_slots_to_unfill int

Number of slots/words to unfill.

1
extract_text Callable[[Item], str] | None

Text extraction function. If None, tries common keys.

None
include_group_metadata bool

Whether to include group_key in metadata.

True
item_template_id UUID | None

Template ID for created items.

None

Returns:

Type Description
list[Item]

Cloze items from grouped source items.

Examples:

>>> cloze_items = create_cloze_items_from_groups(
...     items=source_items,
...     group_by=lambda i: i.item_metadata.get("category"),
...     n_slots_to_unfill=1
... )

create_filtered_cloze_items(templates: list[Any], n_slots_to_unfill: int = 1, *, template_filter: Callable[[Any], bool] | None = None, slot_filter: Callable[[str, Any], bool] | None = None, item_template_id: UUID | None = None) -> list[Item]

Create cloze items with multi-level filtering.

Filters templates and/or slots before creating cloze items.

Parameters:

Name Type Description Default
templates list[Template]

Source templates.

required
n_slots_to_unfill int

Number of slots to unfill.

1
template_filter Callable[[Template], bool] | None

Filter templates.

None
slot_filter Callable[[str, Slot], bool] | None

Filter which slots can be unfilled (receives slot_name and Slot object).

None
item_template_id UUID | None

Template ID for created items.

None

Returns:

Type Description
list[Item]

Filtered cloze items.

Examples:

>>> # Only unfill slots with constraints
>>> cloze_items = create_filtered_cloze_items(
...     templates=all_templates,
...     n_slots_to_unfill=1,
...     template_filter=lambda t: len(t.slots) >= 3,
...     slot_filter=lambda name, slot: len(slot.constraints) > 0
... )

Span Annotation Models

spans

Core span annotation models.

Provides data models for labeled spans, span segments, span labels, span relations, and span specifications. Supports discontiguous spans, overlapping spans (nested and intersecting), static and interactive modes, and two label sources (fixed sets and Wikidata entity search).

SpanSegment

Bases: BeadBaseModel

Contiguous or discontiguous indices within a single element.

Attributes:

Name Type Description
element_name str

Which rendered element this segment belongs to.

indices list[int]

Token or character indices within the element.
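
Examples:

A minimal sketch, assuming keyword construction (as in the other bead models):

>>> SpanSegment(element_name="text", indices=[0, 1, 2])
>>> # Discontiguous indices within a single element are permitted
>>> SpanSegment(element_name="text", indices=[0, 3, 4])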

validate_element_name(v: str) -> str classmethod

Validate element name is not empty.

Parameters:

Name Type Description Default
v str

Element name to validate.

required

Returns:

Type Description
str

Validated element name.

Raises:

Type Description
ValueError

If element name is empty.

validate_indices(v: list[int]) -> list[int] classmethod

Validate indices are not empty and non-negative.

Parameters:

Name Type Description Default
v list[int]

Indices to validate.

required

Returns:

Type Description
list[int]

Validated indices.

Raises:

Type Description
ValueError

If indices are empty or contain negative values.

SpanLabel

Bases: BeadBaseModel

Label applied to a span or relation.

Attributes:

Name Type Description
label str

Human-readable label text.

label_id str | None

External identifier (e.g. Wikidata QID "Q5").

confidence float | None

Confidence score for model-assigned labels.

validate_label(v: str) -> str classmethod

Validate label is not empty.

Parameters:

Name Type Description Default
v str

Label to validate.

required

Returns:

Type Description
str

Validated label.

Raises:

Type Description
ValueError

If label is empty.

Span

Bases: BeadBaseModel

Labeled span across one or more elements.

Supports discontiguous, overlapping, and nested spans.

Attributes:

Name Type Description
span_id str

Unique identifier within the item.

segments list[SpanSegment]

Index segments composing this span.

head_index int | None

Syntactic head token index.

label SpanLabel | None

Label applied to this span (None = to-be-labeled).

span_type str | None

Semantic category (e.g. "entity", "event", "role").

span_metadata dict[str, MetadataValue]

Additional span-specific metadata.
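
Examples:

A minimal sketch, assuming keyword construction:

>>> span = Span(
...     span_id="s1",
...     segments=[SpanSegment(element_name="text", indices=[1, 2])],
...     label=SpanLabel(label="PERSON"),
...     span_type="entity"
... )
>>> span.label.label
'PERSON'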

validate_span_id(v: str) -> str classmethod

Validate span_id is not empty.

Parameters:

Name Type Description Default
v str

Span ID to validate.

required

Returns:

Type Description
str

Validated span ID.

Raises:

Type Description
ValueError

If span_id is empty.

SpanRelation

Bases: BeadBaseModel

A typed, directed relation between two spans.

Used for semantic role labeling, relation extraction, entity linking, coreference, and similar tasks.

Attributes:

Name Type Description
relation_id str

Unique identifier within the item.

source_span_id str

span_id of the source span.

target_span_id str

span_id of the target span.

label SpanLabel | None

Relation label (reuses SpanLabel for consistency).

directed bool

Whether the relation is directed (A->B) or undirected (A--B).

relation_metadata dict[str, MetadataValue]

Additional relation-specific metadata.
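
Examples:

A minimal sketch, assuming keyword construction and two spans "s1" and "s2" already defined on the item:

>>> relation = SpanRelation(
...     relation_id="r1",
...     source_span_id="s1",
...     target_span_id="s2",
...     label=SpanLabel(label="agent"),
...     directed=True
... )
>>> relation.directed
True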

validate_relation_id(v: str) -> str classmethod

Validate relation_id is not empty.

Parameters:

Name Type Description Default
v str

Relation ID to validate.

required

Returns:

Type Description
str

Validated relation ID.

Raises:

Type Description
ValueError

If relation_id is empty.

validate_span_ids(v: str) -> str classmethod

Validate span IDs are not empty.

Parameters:

Name Type Description Default
v str

Span ID to validate.

required

Returns:

Type Description
str

Validated span ID.

Raises:

Type Description
ValueError

If span ID is empty.

SpanSpec

Bases: BeadBaseModel

Specification for span labeling behavior.

Configures how spans are displayed, created, and labeled in an experiment. Supports both fixed label sets and Wikidata entity search for both span labels and relation labels.

Attributes:

Name Type Description
index_mode SpanIndexMode

Whether spans index by token or character position.

interaction_mode SpanInteractionMode

"static" for read-only highlights, "interactive" for participant annotation.

label_source LabelSourceType

Source of span labels ("fixed" or "wikidata").

labels list[str] | None

Fixed span label set (when label_source is "fixed").

label_colors dict[str, str] | None

CSS colors keyed by label name.

allow_overlapping bool

Whether overlapping spans are permitted.

min_spans int | None

Minimum number of spans required (interactive mode).

max_spans int | None

Maximum number of spans allowed (interactive mode).

enable_relations bool

Whether relation annotation is enabled.

relation_label_source LabelSourceType

Source of relation labels.

relation_labels list[str] | None

Fixed relation label set.

relation_label_colors dict[str, str] | None

CSS colors keyed by relation label name.

relation_directed bool

Default directionality for new relations.

min_relations int | None

Minimum number of relations required (interactive mode).

max_relations int | None

Maximum number of relations allowed (interactive mode).

wikidata_language str

Language for Wikidata entity search.

wikidata_entity_types list[str] | None

Restrict Wikidata search to these entity types.

wikidata_result_limit int

Maximum number of Wikidata search results.
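
Examples:

A minimal sketch of an interactive spec with a fixed label set, assuming the string values quoted in the attribute descriptions above and defaults for the remaining fields:

>>> spec = SpanSpec(
...     interaction_mode="interactive",
...     label_source="fixed",
...     labels=["person", "location"],
...     allow_overlapping=False,
...     max_spans=5
... )
>>> spec.labels
['person', 'location']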

Span Labeling Utilities

span_labeling

Utilities for creating span labeling experimental items.

This module provides language-agnostic utilities for creating items with span annotations. Spans can be added to any existing item type (composability) or used as standalone span labeling tasks.

Integration Points
  • Active Learning: bead/active_learning/ (via alignment module)
  • Deployment: bead/deployment/jspsych/ (span-label plugin)
  • Tokenization: bead/tokenization/ (display-level tokens)

tokenize_item(item: Item, tokenizer_config: TokenizerConfig | None = None) -> Item

Tokenize an item's rendered_elements.

Populates tokenized_elements and token_space_after using the configured tokenizer. Returns a new Item (does not mutate).

Parameters:

Name Type Description Default
item Item

Item to tokenize.

required
tokenizer_config TokenizerConfig | None

Tokenizer configuration. If None, uses default (spaCy English).

None

Returns:

Type Description
Item

New item with populated tokenized_elements and token_space_after.
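
Examples:

A minimal sketch, assuming the default spaCy English tokenizer (exact token boundaries depend on the configured tokenizer):

>>> item = Item(
...     item_template_id=uuid4(),
...     rendered_elements={"text": "The cat sat."}
... )
>>> tokenized = tokenize_item(item)
>>> tokenized.tokenized_elements["text"]
['The', 'cat', 'sat', '.']
>>> tokenized is item  # a new Item is returned; the input is not mutated
False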

create_span_item(text: str, spans: list[Span], prompt: str, tokenizer_config: TokenizerConfig | None = None, tokens: list[str] | None = None, labels: list[str] | None = None, span_spec: SpanSpec | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create a standalone span labeling item.

Tokenizes text using config, validates span indices against tokens.

Parameters:

Name Type Description Default
text str

The stimulus text.

required
spans list[Span]

Pre-defined span annotations.

required
prompt str

Question or instruction for the participant.

required
tokenizer_config TokenizerConfig | None

Tokenizer configuration. Ignored if tokens is provided.

None
tokens list[str] | None

Pre-tokenized text (overrides tokenizer).

None
labels list[str] | None

Fixed label set for span labeling.

None
span_spec SpanSpec | None

Span specification. If None, creates a default static spec.

None
item_template_id UUID | None

Template ID. If None, generates a new UUID.

None
metadata dict[str, MetadataValue] | None

Additional item metadata.

None

Returns:

Type Description
Item

Span labeling item.

Raises:

Type Description
ValueError

If text is empty or span indices are out of bounds.
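
Examples:

A minimal sketch, passing pre-tokenized text so the span indices are explicit:

>>> span = Span(
...     span_id="s1",
...     segments=[SpanSegment(element_name="text", indices=[1])],
...     label=SpanLabel(label="animal")
... )
>>> item = create_span_item(
...     text="The cat sat.",
...     spans=[span],
...     prompt="Review the highlighted entity.",
...     tokens=["The", "cat", "sat", "."],
...     labels=["animal", "object"]
... )
>>> len(item.spans)
1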

create_interactive_span_item(text: str, prompt: str, tokenizer_config: TokenizerConfig | None = None, tokens: list[str] | None = None, label_set: list[str] | None = None, label_source: LabelSourceType = 'fixed', item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item

Create an item for interactive span selection by participants.

Parameters:

Name Type Description Default
text str

The stimulus text.

required
prompt str

Instruction for the participant.

required
tokenizer_config TokenizerConfig | None

Tokenizer configuration.

None
tokens list[str] | None

Pre-tokenized text (overrides tokenizer).

None
label_set list[str] | None

Fixed label set (when label_source is "fixed").

None
label_source LabelSourceType

Label source type ("fixed" or "wikidata").

'fixed'
item_template_id UUID | None

Template ID. If None, generates a new UUID.

None
metadata dict[str, MetadataValue] | None

Additional item metadata.

None

Returns:

Type Description
Item

Interactive span labeling item (no pre-defined spans).
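
Examples:

A minimal sketch with a fixed label set. Spans are created by participants at deployment time, so the item starts with none:

>>> item = create_interactive_span_item(
...     text="Alice met Bob in Paris.",
...     prompt="Highlight all person names.",
...     label_set=["person", "location"]
... )
>>> item.spans
[]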

add_spans_to_item(item: Item, spans: list[Span], tokenizer_config: TokenizerConfig | None = None, span_spec: SpanSpec | None = None) -> Item

Add span annotations to any existing item.

This is the key composability function: any item (rating, forced choice, binary, etc.) can have spans added as an overlay. Tokenizes rendered_elements if not already tokenized. Returns a new Item.

Parameters:

Name Type Description Default
item Item

Existing item to add spans to.

required
spans list[Span]

Span annotations to add.

required
tokenizer_config TokenizerConfig | None

Tokenizer configuration (used only if item lacks tokenization).

None
span_spec SpanSpec | None

Span specification.

None

Returns:

Type Description
Item

New item with spans added.

Raises:

Type Description
ValueError

If span indices are out of bounds.
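
Examples:

A minimal sketch overlaying a span on a plain item:

>>> base = Item(
...     item_template_id=uuid4(),
...     rendered_elements={"text": "The cat sat."}
... )
>>> span = Span(
...     span_id="s1",
...     segments=[SpanSegment(element_name="text", indices=[1])]
... )
>>> annotated = add_spans_to_item(base, [span])
>>> len(annotated.spans)
1
>>> len(base.spans)  # the original item is unchanged
0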

create_span_items_from_texts(texts: list[str], span_extractor: Callable[[str, list[str]], list[Span]], prompt: str, tokenizer_config: TokenizerConfig | None = None, labels: list[str] | None = None, item_template_id: UUID | None = None) -> list[Item]

Batch create span items with automatic tokenization.

Parameters:

Name Type Description Default
texts list[str]

List of stimulus texts.

required
span_extractor Callable[[str, list[str]], list[Span]]

Function that takes (text, tokens) and returns spans.

required
prompt str

Question or instruction for the participant.

required
tokenizer_config TokenizerConfig | None

Tokenizer configuration.

None
labels list[str] | None

Fixed label set.

None
item_template_id UUID | None

Shared template ID. If None, generates one per item.

None

Returns:

Type Description
list[Item]

Span labeling items.
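
Examples:

A minimal sketch with a hypothetical extractor that marks the first token of each text:

>>> def first_token_span(text, tokens):
...     return [Span(
...         span_id="s1",
...         segments=[SpanSegment(element_name="text", indices=[0])]
...     )]
>>> items = create_span_items_from_texts(
...     texts=["The cat sat.", "The dog ran."],
...     span_extractor=first_token_span,
...     prompt="Review the highlighted token."
... )
>>> len(items)
2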

Item Construction

constructor

Item constructor for building experimental items from templates.

This module provides the ItemConstructor class which transforms filled templates into experimental items by applying model-based constraints and collecting model outputs for analysis.

ItemConstructor

Construct experimental items from filled templates.

Transforms filled templates into items by:
  1. Resolving element references to text
  2. Computing required model outputs (from constraints)
  3. Evaluating constraints with model outputs
  4. Creating Item instances with metadata

Parameters:

Name Type Description Default
model_registry ModelAdapterRegistry

Registry of model adapters for constraint evaluation.

required
cache ModelOutputCache

Cache for model outputs to avoid redundant computation.

required
constraint_resolver ConstraintResolver | None

Resolver for evaluating non-model constraints. If None, only model-based constraints can be evaluated.

None

Attributes:

Name Type Description
model_registry ModelAdapterRegistry

Registry of model adapters for constraint evaluation.

cache ModelOutputCache

Cache for model outputs to avoid redundant computation.

constraint_resolver ConstraintResolver | None

Resolver for evaluating constraints (not used for model constraints).

Examples:

>>> from bead.items.adapters.registry import default_registry
>>> from bead.items.cache import ModelOutputCache
>>> cache = ModelOutputCache(backend="memory")
>>> constructor = ItemConstructor(default_registry, cache)
>>> constraints = {constraint_id: constraint_obj}
>>> items = list(constructor.construct_items(
...     template, filled_templates, constraints
... ))

construct_items(item_template: ItemTemplate, filled_templates: dict[UUID, FilledTemplate], constraints: dict[UUID, Constraint]) -> Iterator[Item]

Construct items from template and filled templates.

For each combination of filled templates:
  1. Render elements (resolve filled_template_ref → text)
  2. Compute required model outputs (from constraints)
  3. Check constraints using model outputs
  4. Yield item if all constraints satisfied

Parameters:

Name Type Description Default
item_template ItemTemplate

Template defining item structure and constraints.

required
filled_templates dict[UUID, FilledTemplate]

Map of filled template UUIDs to FilledTemplate instances.

required
constraints dict[UUID, Constraint]

Map of constraint UUIDs to Constraint objects.

required

Yields:

Type Description
Item

Constructed items that satisfy all constraints.

Raises:

Type Description
ValueError

If template references missing filled templates or constraints.

RuntimeError

If constraint evaluation or model computation fails.

Examples:

>>> template = ItemTemplate(...)
>>> filled = {uuid1: filled1, uuid2: filled2}
>>> constraints = {c_id: constraint_obj}
>>> items = list(constructor.construct_items(
...     template, filled, constraints
... ))
>>> len(items)
2

generation

Utilities for generating cross-product items from templates and lexicons.

This module provides language-agnostic utilities for generating items by combining templates with lexical resources in various patterns.

RELATIONSHIP TO ItemConstructor:
  • This module (generation.py): Generates cross-product combinations of templates × lexical items BEFORE template filling. Creates lightweight Item objects with just template_id, metadata, and unfilled information. Use when you want to systematically explore all combinations of a lexical property (e.g., every verb in every frame).
  • ItemConstructor (constructor.py): Builds Items FROM ItemTemplates + FilledTemplates, with constraint evaluation and model scoring. Takes filled templates and combines them into experimental items with multi-slot constraints checked. Use when you have filled templates and want to construct experimental items with model-based constraint checking.

These modules are COMPLEMENTARY, not redundant. The typical pipeline, sketched below, is:
  1. generation.py: Generate cross-product → unfilled item specifications
  2. Template filling: Fill template slots → FilledTemplates
  3. constructor.py: Construct items → Items with constraints checked
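
A sketch of that pipeline, with the template-filling step (which lives elsewhere in bead) and the constraint objects stubbed out:

>>> specs = create_cross_product_items(templates, lexicons)  # 1. cross-product
>>> # 2. fill template slots elsewhere -> filled_templates
>>> items = list(constructor.construct_items(  # 3. constraint-checked items
...     item_template, filled_templates, constraints
... ))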

create_cross_product_items(templates: list[Template], lexicons: dict[str, Lexicon], *, cross_product_slot: str = 'verb', metadata_extractor: Callable[[Template, LexicalItem], dict[str, MetadataValue]] | None = None, filter_fn: Callable[[Template, LexicalItem], bool] | None = None) -> Iterator[Item]

Generate cross-product items from templates and lexicons.

Creates an item for each combination of template × lexical item from the specified slot's lexicon. This is useful for systematic exploration of a lexical property (e.g., every verb in every frame).

Items are generated lazily via iterator for memory efficiency with large cross-products.

Parameters:

Name Type Description Default
templates list[Template]

Templates to use for generation.

required
lexicons dict[str, Lexicon]

Lexicons keyed by slot name.

required
cross_product_slot str

Slot name to vary across items (default: "verb"). This slot's lexicon will be crossed with all templates.

'verb'
metadata_extractor Callable[[Template, LexicalItem], dict[str, MetadataValue]] | None

Optional function to extract metadata from template and lexical item. Receives (template, lexical_item) and returns dict for item_metadata.

None
filter_fn Callable[[Template, LexicalItem], bool] | None

Optional filter function. Receives (template, lexical_item) and returns True to include, False to skip.

None

Yields:

Type Description
Item

Items representing template × lexical item combinations.

Examples:

Basic verb × template cross-product:

>>> from uuid import uuid4
>>> templates = [
...     Template(
...         name="transitive",
...         template_string="{subject} {verb} {object}.",
...         slots={}
...     )
... ]
>>> verb_lex = Lexicon(name="verbs")
>>> verb_lex.add(LexicalItem(lemma="walk"))
>>> verb_lex.add(LexicalItem(lemma="eat"))
>>> lexicons = {"verb": verb_lex}
>>> items = list(create_cross_product_items(templates, lexicons))
>>> len(items)
2

With metadata extraction:

>>> def extract_metadata(template, item):
...     return {
...         "verb_lemma": item.lemma,
...         "template_name": template.name,
...         "verb_pos": item.pos
...     }
>>> items = list(create_cross_product_items(
...     templates,
...     lexicons,
...     metadata_extractor=extract_metadata
... ))

With filtering:

>>> def filter_transitive_only(template, item):
...     return "transitive" in template.name
>>> items = list(create_cross_product_items(
...     templates,
...     lexicons,
...     filter_fn=filter_transitive_only
... ))

create_filtered_cross_product_items(templates: list[Template], lexicons: dict[str, Lexicon], *, cross_product_slot: str = 'verb', template_filter: Callable[[Template], bool] | None = None, item_filter: Callable[[LexicalItem], bool] | None = None, combination_filter: Callable[[Template, LexicalItem], bool] | None = None, metadata_extractor: Callable[[Template, LexicalItem], dict[str, MetadataValue]] | None = None) -> Iterator[Item]

Generate cross-product items with multiple filter levels.

Provides separate filters for templates, lexical items, and their combinations, offering more control than the basic cross-product function.

Parameters:

Name Type Description Default
templates list[Template]

Templates to use for generation.

required
lexicons dict[str, Lexicon]

Lexicons keyed by slot name.

required
cross_product_slot str

Slot name to vary across items.

'verb'
template_filter Callable[[Template], bool] | None

Filter for templates (applied before cross-product).

None
item_filter Callable[[LexicalItem], bool] | None

Filter for lexical items (applied before cross-product).

None
combination_filter Callable[[Template, LexicalItem], bool] | None

Filter for combinations (applied during generation).

None
metadata_extractor Callable[[Template, LexicalItem], dict[str, MetadataValue]] | None

Metadata extraction function.

None

Yields:

Type Description
Item

Filtered cross-product items.

Examples:

Filter at multiple levels:

>>> def template_filter(t):
...     return "transitive" in t.name
>>> def item_filter(i):
...     return i.pos == "VERB"
>>> def combination_filter(t, i):
...     # Only combine if verb is compatible with template
...     return True
>>> items = list(create_filtered_cross_product_items(
...     templates,
...     lexicons,
...     template_filter=template_filter,
...     item_filter=item_filter,
...     combination_filter=combination_filter
... ))

create_stratified_cross_product_items(templates: list[Template], lexicons: dict[str, Lexicon], *, cross_product_slot: str = 'verb', stratify_by: Callable[[LexicalItem], str], items_per_stratum: int, metadata_extractor: Callable[[Template, LexicalItem], dict[str, MetadataValue]] | None = None) -> Iterator[Item]

Generate stratified sample of cross-product items.

Instead of full cross-product, samples a fixed number of lexical items from each stratum (defined by stratify_by function) and crosses them with all templates.

Parameters:

Name Type Description Default
templates list[Template]

Templates to use for generation.

required
lexicons dict[str, Lexicon]

Lexicons keyed by slot name.

required
cross_product_slot str

Slot name to vary across items.

'verb'
stratify_by Callable[[LexicalItem], str]

Function to extract stratum key from lexical items.

required
items_per_stratum int

Number of items to sample from each stratum.

required
metadata_extractor Callable[[Template, LexicalItem], dict[str, MetadataValue]] | None

Metadata extraction function.

None

Yields:

Type Description
Item

Stratified cross-product items.

Examples:

Sample verbs stratified by frequency:

>>> def stratify_by_frequency(item):
...     freq = item.attributes.get("frequency", 0)
...     if freq > 1000:
...         return "high"
...     elif freq > 100:
...         return "medium"
...     else:
...         return "low"
>>> items = list(create_stratified_cross_product_items(
...     templates,
...     lexicons,
...     stratify_by=stratify_by_frequency,
...     items_per_stratum=10
... ))

items_to_jsonl(items: Iterator[Item], output_path: str, progress_interval: int = 1000) -> int

Write iterator of items to JSONL file with progress tracking.

Utility function for efficient streaming write of large item sets.

Parameters:

Name Type Description Default
items Iterator[Item]

Items to write.

required
output_path str

Path to output JSONL file.

required
progress_interval int

Print progress every N items (default: 1000).

1000

Returns:

Type Description
int

Number of items written.

Examples:

>>> items = create_cross_product_items(templates, lexicons)
>>> n = items_to_jsonl(items, "output.jsonl")
>>> print(f"Wrote {n} items")

Validation and Scoring

validation

Validation utilities for constructed items.

This module provides validation functions to ensure constructed items meet all requirements and contain complete, valid data.

validate_item(item: Item, item_template: ItemTemplate) -> list[str]

Validate a constructed item against its template.

Check that the item has all required fields, references valid templates, has consistent constraint satisfaction, and contains valid model outputs.

Parameters:

Name Type Description Default
item Item

Item to validate.

required
item_template ItemTemplate

Template the item was constructed from.

required

Returns:

Type Description
list[str]

List of validation error messages. Empty list if valid.

Examples:

>>> errors = validate_item(item, template)
>>> if errors:
...     print(f"Item is invalid: {errors}")
>>> else:
...     print("Item is valid")

validate_model_output(output: ModelOutput) -> list[str]

Validate a model output.

Check that the model output has all required fields and valid values.

Parameters:

Name Type Description Default
output ModelOutput

Model output to validate.

required

Returns:

Type Description
list[str]

List of validation error messages. Empty list if valid.

Examples:

>>> errors = validate_model_output(output)
>>> if not errors:
...     print("Model output is valid")

validate_constraint_satisfaction(item: Item, item_template: ItemTemplate) -> list[str]

Validate constraint satisfaction consistency.

Check that all constraints in the template have been evaluated and that the results are boolean values.

Parameters:

Name Type Description Default
item Item

Item to validate.

required
item_template ItemTemplate

Template with constraints.

required

Returns:

Type Description
list[str]

List of validation error messages. Empty list if valid.

Examples:

>>> errors = validate_constraint_satisfaction(item, template)
>>> if not errors:
...     print("Constraint satisfaction is valid")

validate_metadata_completeness(item: Item) -> list[str]

Validate that item metadata is complete.

Check that the item has all expected metadata fields populated. Since Item inherits from BeadBaseModel, id, created_at, and modified_at are always present. This function is kept for consistency and future extensibility.

Parameters:

Name Type Description Default
item Item

Item to validate.

required

Returns:

Type Description
list[str]

List of validation error messages. Empty list if valid.

Examples:

>>> errors = validate_metadata_completeness(item)
>>> if not errors:
...     print("Metadata is complete")

item_passes_all_constraints(item: Item) -> bool

Check if item satisfies all constraints.

Convenience function to check if all constraints are satisfied.

Parameters:

Name Type Description Default
item Item

Item to check.

required

Returns:

Type Description
bool

True if all constraints satisfied, False otherwise.

Examples:

>>> if item_passes_all_constraints(item):
...     print("Item is valid")

get_task_type_requirements(task_type: TaskType) -> dict[str, list[str] | str]

Get validation requirements for a task type.

Returns a dictionary describing the structural requirements for items of the specified task type. Useful for introspection, error messages, and documentation generation.

Parameters:

Name Type Description Default
task_type TaskType

Task type to get requirements for.

required

Returns:

Type Description
dict[str, list[str] | str]

Requirements specification with keys:

- required_rendered_keys: List of required rendered_elements keys
- required_metadata_keys: List of required item_metadata keys
- optional_metadata_keys: List of optional item_metadata keys
- special_fields: List of special fields (e.g., ["unfilled_slots"])
- description: Human-readable description

Examples:

>>> reqs = get_task_type_requirements("ordinal_scale")
>>> print(reqs["required_rendered_keys"])
['text']
>>> print(reqs["required_metadata_keys"])
['scale_min', 'scale_max']

validate_item_for_task_type(item: Item, task_type: TaskType) -> bool

Validate that an Item's structure matches requirements for a task type.

Checks that the item has the required rendered_elements keys, item_metadata keys, and special fields for the specified task type. Raises descriptive ValueError if validation fails.

Parameters:

Name Type Description Default
item Item

Item to validate.

required
task_type TaskType

Expected task type (from bead.items.item_template.TaskType).

required

Returns:

Type Description
bool

True if valid.

Raises:

Type Description
ValueError

If item structure doesn't match task type requirements, with detailed explanation of what's wrong.

Examples:

>>> from bead.items.ordinal_scale import create_ordinal_scale_item
>>> item = create_ordinal_scale_item("How natural?", scale_bounds=(1, 7))
>>> validate_item_for_task_type(item, "ordinal_scale")
True
>>> from bead.items.forced_choice import create_forced_choice_item
>>> fc_item = create_forced_choice_item("A", "B")
>>> validate_item_for_task_type(fc_item, "ordinal_scale")
ValueError: ordinal_scale items must have 'text' in rendered_elements...

infer_task_type_from_item(item: Item) -> TaskType

Infer most likely task type from Item structure.

Examines the item's rendered_elements, item_metadata, and special fields to determine which task type it matches. Uses priority order to handle ambiguous cases.

Parameters:

Name Type Description Default
item Item

Item to infer from.

required

Returns:

Type Description
TaskType

Inferred task type.

Raises:

Type Description
ValueError

If item structure doesn't match any task type or is ambiguous.

Examples:

>>> from bead.items.ordinal_scale import create_likert_7_item
>>> item = create_likert_7_item("How natural is this sentence?")
>>> infer_task_type_from_item(item)
'ordinal_scale'
>>> from bead.items.categorical import create_nli_item
>>> item2 = create_nli_item("All dogs bark", "Some dogs bark")
>>> infer_task_type_from_item(item2)
'categorical'
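
Inference is handy for bucketing mixed item sets; a minimal sketch grouping the items above by their inferred task type:

>>> from collections import defaultdict
>>> by_task = defaultdict(list)
>>> for it in [item, item2]:
...     by_task[infer_task_type_from_item(it)].append(it)
>>> sorted(by_task)
['categorical', 'ordinal_scale']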

scoring

Abstract base classes for item scoring with language models.

This module provides language-agnostic base classes for scoring items using various metrics (log probability, perplexity, embeddings).

ItemScorer

Bases: ABC

Abstract base class for item scoring.

ItemScorer provides a framework for assigning numeric scores to items based on various criteria (language model probability, acceptability, similarity, etc.).

Examples:

Implementing a custom scorer:

>>> class AcceptabilityScorer(ItemScorer):
...     def _compute_acceptability(self, text):
...         # Placeholder metric (substitute a real acceptability model):
...         # here, longer sentences simply score lower
...         return -float(len(text.split()))
...
...     def score(self, item):
...         # Score based on some acceptability metric
...         text = item.rendered_elements.get("text", "")
...         return self._compute_acceptability(text)
...
...     def score_batch(self, items):
...         return [self.score(item) for item in items]

score(item: Item) -> float abstractmethod

Compute score for a single item.

Parameters:

Name Type Description Default
item Item

Item to score.

required

Returns:

Type Description
float

Numeric score for the item.

score_batch(items: list[Item]) -> list[float]

Compute scores for multiple items.

Default implementation calls score() for each item sequentially. Subclasses can override for batch processing optimization.

Parameters:

Name Type Description Default
items list[Item]

Items to score.

required

Returns:

Type Description
list[float]

Scores for each item.

Examples:

>>> scorer = ConcreteScorer()
>>> items = [item1, item2, item3]
>>> scores = scorer.score_batch(items)
>>> len(scores) == len(items)
True

score_with_metadata(items: list[Item]) -> dict[UUID, dict[str, float | str]]

Score items and return results with metadata.

Parameters:

Name Type Description Default
items list[Item]

Items to score.

required

Returns:

Type Description
dict[UUID, dict[str, float | str]]

Dictionary mapping item UUIDs to score dictionaries. Each score dict contains at least a "score" key.

Examples:

>>> scorer = ConcreteScorer()
>>> results = scorer.score_with_metadata([item1, item2])
>>> results[item1.id]["score"]
-42.5

LanguageModelScorer

Bases: ItemScorer

Scorer using language model log probabilities.

Scores items based on their log probability under a language model. Uses HuggingFace adapters for model inference and supports caching.

Parameters:

Name Type Description Default
model_name str

HuggingFace model identifier (e.g., "gpt2", "gpt2-medium").

required
cache_dir Path | str | None

Directory for caching model outputs. If None, no caching.

None
device str

Device to run model on ("cpu", "cuda", "mps").

'cpu'
text_key str

Key in item.rendered_elements to use as text (default: "text").

'text'
model_version str

Version string for cache tracking.

'unknown'

Examples:

>>> from pathlib import Path
>>> scorer = LanguageModelScorer(
...     model_name="gpt2",
...     cache_dir=Path(".cache"),
...     device="cpu"
... )
>>> score = scorer.score(item)
>>> score < 0  # Log probabilities are negative
True

model: HuggingFaceLanguageModel property

Get the model, loading if necessary.

Returns:

Type Description
HuggingFaceLanguageModel

The language model adapter.

score(item: Item) -> float

Compute log probability score for an item.

Parameters:

Name Type Description Default
item Item

Item to score.

required

Returns:

Type Description
float

Log probability of the item's text under the language model.

Raises:

Type Description
KeyError

If text_key not found in item.rendered_elements.

score_batch(items: list[Item], batch_size: int | None = None) -> list[float]

Compute scores for multiple items efficiently using batched inference.

Parameters:

Name Type Description Default
items list[Item]

Items to score.

required
batch_size int | None

Number of items to process in each batch. If None, automatically infers optimal batch size based on available resources.

None

Returns:

Type Description
list[float]

Log probabilities for each item.

score_with_metadata(items: list[Item]) -> dict[UUID, dict[str, float | str]]

Score items and return results with additional metrics.

Returns log probability and perplexity for each item.

Parameters:

Name Type Description Default
items list[Item]

Items to score.

required

Returns:

Type Description
dict[UUID, dict[str, float | str]]

Dictionary with "score" (log prob) and "perplexity" for each item.

ForcedChoiceScorer

Bases: ItemScorer

Scorer for N-AFC (forced-choice) items with multiple options.

Computes comparison scores for forced-choice items by scoring each option and applying a comparison function (e.g., max difference, variance, entropy).

Parameters:

Name Type Description Default
base_scorer ItemScorer

Base scorer to use for individual options.

required
comparison_fn callable | None

Function that takes a list of scores and returns a comparison metric. Default is the standard deviation of the scores.

None
option_prefix str

Prefix for option names in rendered_elements (default: "option").

'option'

Examples:

>>> base = LanguageModelScorer("gpt2", device="cpu")
>>> fc_scorer = ForcedChoiceScorer(
...     base_scorer=base,
...     comparison_fn=lambda scores: max(scores) - min(scores)  # Range
... )
>>> # Item with option_a, option_b, option_c, ...
>>> score = fc_scorer.score(forced_choice_item)
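
The docstring mentions entropy as another comparison metric; one way to realize it (a sketch, not part of the library) is to softmax the option scores and take the Shannon entropy:

>>> import math
>>> def entropy(scores):
...     # Softmax over the scores, then Shannon entropy of the distribution
...     exps = [math.exp(s - max(scores)) for s in scores]
...     ps = [e / sum(exps) for e in exps]
...     return -sum(p * math.log(p) for p in ps if p > 0)
>>> fc_scorer = ForcedChoiceScorer(base_scorer=base, comparison_fn=entropy)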

score(item: Item) -> float

Score a forced-choice item.

Extracts all options from item.rendered_elements (option_a, option_b, ...), scores each option, and applies comparison function.

Parameters:

Name Type Description Default
item Item

Forced-choice item with multiple options.

required

Returns:

Type Description
float

Comparison score across all options.

Raises:

Type Description
ValueError

If item doesn't contain option elements or has precomputed scores.

Model Output Cache

cache

Content-addressable cache for judgment model outputs.

This module provides caching infrastructure for model outputs during item construction. It supports multiple backends (filesystem, in-memory) and various operation types including log probabilities, NLI scores, embeddings, and similarity metrics.

Note: This cache is distinct from bead.templates.adapters.cache, which handles MLM predictions for template filling. This module caches judgment model outputs used in item construction.

CacheBackend

Bases: ABC

Abstract base class for cache backends.

Defines the interface that all cache backends must implement.

get(key: str) -> dict[str, object] | None abstractmethod

Retrieve cache entry by key.

Parameters:

Name Type Description Default
key str

Cache key to retrieve.

required

Returns:

Type Description
dict[str, object] | None

Cache entry data if found, None otherwise.

set(key: str, data: dict[str, object]) -> None abstractmethod

Store cache entry with key.

Parameters:

Name Type Description Default
key str

Cache key.

required
data dict[str, object]

Cache entry data to store.

required

delete(key: str) -> None abstractmethod

Delete cache entry by key.

Parameters:

Name Type Description Default
key str

Cache key to delete.

required

clear() -> None abstractmethod

Clear all cache entries.

keys() -> list[str] abstractmethod

Return all cache keys.

Returns:

Type Description
list[str]

List of all cache keys in the backend.

FilesystemBackend

Bases: CacheBackend

Filesystem-based cache backend.

Stores each cache entry as a separate JSON file with the cache key as the filename.

Parameters:

Name Type Description Default
cache_dir Path

Directory for cache storage.

required

Attributes:

Name Type Description
cache_dir Path

Directory where cache files are stored.

Examples:

>>> from pathlib import Path
>>> backend = FilesystemBackend(cache_dir=Path(".cache"))
>>> backend.set("abc123", {"result": 42})
>>> backend.get("abc123")
{'result': 42}

get(key: str) -> dict[str, object] | None

Retrieve cache entry from filesystem.

Parameters:

Name Type Description Default
key str

Cache key.

required

Returns:

Type Description
dict[str, object] | None

Cache entry data if found, None otherwise.

set(key: str, data: dict[str, object]) -> None

Store cache entry to filesystem.

Parameters:

Name Type Description Default
key str

Cache key.

required
data dict[str, object]

Cache entry data.

required

delete(key: str) -> None

Delete cache entry from filesystem.

Parameters:

Name Type Description Default
key str

Cache key to delete.

required

clear() -> None

Clear all cache entries from filesystem.

keys() -> list[str]

Return all cache keys from filesystem.

Returns:

Type Description
list[str]

List of cache keys (filenames without .json extension).

InMemoryBackend

Bases: CacheBackend

In-memory cache backend.

Stores cache entries in a dictionary. No persistence across program runs. Useful for testing and temporary caching scenarios.

Examples:

>>> backend = InMemoryBackend()
>>> backend.set("xyz789", {"result": 3.14})
>>> backend.get("xyz789")
{'result': 3.14}

get(key: str) -> dict[str, object] | None

Retrieve cache entry from memory.

Parameters:

Name Type Description Default
key str

Cache key.

required

Returns:

Type Description
dict[str, object] | None

Cache entry data if found, None otherwise.

set(key: str, data: dict[str, object]) -> None

Store cache entry in memory.

Parameters:

Name Type Description Default
key str

Cache key.

required
data dict[str, object]

Cache entry data.

required

delete(key: str) -> None

Delete cache entry from memory.

Parameters:

Name Type Description Default
key str

Cache key to delete.

required

clear() -> None

Clear all cache entries from memory.

keys() -> list[str]

Return all cache keys from memory.

Returns:

Type Description
list[str]

List of cache keys.

ModelOutputCache

Content-addressable cache for judgment model outputs.

Caches results from various model operations to avoid redundant computation. Supports multiple operation types including log probabilities, perplexity, NLI scores, embeddings, and similarity metrics.

Cache keys are automatically generated using SHA-256 hashing of the model name, operation type, and all input parameters, ensuring deterministic cache hits for identical inputs.

Parameters:

Name Type Description Default
cache_dir Path | None

Directory for cache files (filesystem backend only). Defaults to ~/.cache/bead/models if not specified.

None
backend ('filesystem', 'memory')

Cache backend type. "filesystem" persists across runs, "memory" is ephemeral.

"filesystem"
enabled bool

Whether caching is enabled.

True

Attributes:

Name Type Description
enabled bool

Whether caching is enabled. When False, all operations are no-ops.

Examples:

Basic usage with filesystem backend:

>>> from pathlib import Path
>>> cache = ModelOutputCache(cache_dir=Path(".cache"))
>>> result = cache.get("gpt2", "log_probability", text="Hello world")
>>> if result is None:
...     result = -2.5
...     cache.set("gpt2", "log_probability", result, text="Hello world")

Caching NLI scores:

>>> nli_scores = cache.get("roberta-nli", "nli",
...                        premise="Mary loves books",
...                        hypothesis="Mary enjoys reading")
>>> if nli_scores is None:
...     nli_scores = {"entailment": 0.9, "neutral": 0.08, "contradiction": 0.02}
...     cache.set("roberta-nli", "nli", nli_scores,
...              premise="Mary loves books", hypothesis="Mary enjoys reading")

Caching embeddings:

>>> import numpy as np
>>> embedding = cache.get("bert-base", "embedding", text="Hello")
>>> if embedding is None:
...     embedding = np.random.rand(768)
...     cache.set("bert-base", "embedding", embedding, text="Hello")

generate_cache_key(model_name: str, operation: str, **inputs: str | int | float | bool | None) -> str

Generate deterministic cache key from inputs.

Parameters:

Name Type Description Default
model_name str

Model identifier.

required
operation str

Operation type (e.g., "log_probability", "embedding").

required
**inputs str | int | float | bool | None

Input parameters for the operation (text, premise, hypothesis).

{}

Returns:

Type Description
str

SHA-256 hex digest as cache key.
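
Examples:

Keys are deterministic for identical inputs; a minimal sketch, assuming the method is called on a cache instance:

>>> k1 = cache.generate_cache_key("gpt2", "log_probability", text="Hello")
>>> k2 = cache.generate_cache_key("gpt2", "log_probability", text="Hello")
>>> k1 == k2
True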

get(model_name: str, operation: str, **inputs: str | int | float | bool | None) -> Any

Retrieve cached result.

Parameters:

Name Type Description Default
model_name str

Model identifier.

required
operation str

Operation type (e.g., "log_probability", "nli", "embedding").

required
**inputs str | int | float | bool | None

Input parameters for the operation (text, premise, hypothesis).

{}

Returns:

Type Description
Any

Cached result if found, None otherwise.

set(model_name: str, operation: str, result: float | dict[str, float] | list[float] | np.ndarray, model_version: str | None = None, **inputs: str | int | float | bool | None) -> None

Store result in cache.

Parameters:

Name Type Description Default
model_name str

Model identifier.

required
operation str

Operation type (e.g., "log_probability", "nli", "embedding").

required
result float | dict[str, float] | list[float] | ndarray

Result to cache (log probability, NLI scores, embedding, etc.).

required
model_version str | None

Optional model version string for tracking.

None
**inputs str | int | float | bool | None

Input parameters for the operation (text, premise, hypothesis).

{}

invalidate(model_name: str, operation: str, **inputs: str | int | float | bool | None) -> None

Invalidate specific cache entry.

Parameters:

Name Type Description Default
model_name str

Model identifier.

required
operation str

Operation type.

required
**inputs str | int | float | bool | None

Input parameters for the operation.

{}

clear_model(model_name: str) -> None

Clear all cache entries for a specific model.

Parameters:

Name Type Description Default
model_name str

Model identifier.

required

clear() -> None

Clear all cache entries.

Model Adapters

base

Base class for model adapters used in item construction.

This module defines the abstract ModelAdapter interface that all model adapters must implement to support judgment prediction operations during Stage 3 (Item Construction).

This is SEPARATE from template filling model adapters (bead.templates.models.adapter), which are used in Stage 2.

ModelAdapter

Bases: ABC

Base class for model adapters used in item construction.

All model adapters must implement this interface to support judgment prediction operations during Stage 3 (Item Construction).

This is SEPARATE from template filling model adapters (bead.templates.models.adapter), which are used in Stage 2.

Parameters:

Name Type Description Default
model_name str

Model identifier (e.g., "gpt2", "roberta-large-mnli").

required
cache ModelOutputCache

Cache instance for storing model outputs.

required
model_version str

Version of the model for cache tracking.

'unknown'

Attributes:

Name Type Description
model_name str

Model identifier (e.g., "gpt2", "roberta-large-mnli").

model_version str

Version of the model.

cache ModelOutputCache

Cache for model outputs.

compute_log_probability(text: str) -> float abstractmethod

Compute log probability of text under language model.

Required for language model constraints. Should raise NotImplementedError if not supported by model type.

Parameters:

Name Type Description Default
text str

Text to compute log probability for.

required

Returns:

Type Description
float

Log probability of the text.

Raises:

Type Description
NotImplementedError

If this operation is not supported by the model type.

compute_perplexity(text: str) -> float abstractmethod

Compute perplexity of text.

Required for complexity-based filtering. Should raise NotImplementedError if not supported by model type.

Parameters:

Name Type Description Default
text str

Text to compute perplexity for.

required

Returns:

Type Description
float

Perplexity of the text (must be positive).

Raises:

Type Description
NotImplementedError

If this operation is not supported by the model type.

get_embedding(text: str) -> np.ndarray[tuple[int, ...], np.dtype[np.float64]] abstractmethod

Get embedding vector for text.

Required for similarity computations and semantic clustering. Should raise NotImplementedError if not supported by model type.

Parameters:

Name Type Description Default
text str

Text to embed.

required

Returns:

Type Description
ndarray

Embedding vector for the text.

Raises:

Type Description
NotImplementedError

If this operation is not supported by the model type.

compute_nli(premise: str, hypothesis: str) -> dict[str, float] abstractmethod

Compute natural language inference scores.

Must return dict with keys: "entailment", "neutral", "contradiction". Required for inference-based constraints. Should raise NotImplementedError if not supported by model type.

Parameters:

Name Type Description Default
premise str

Premise text.

required
hypothesis str

Hypothesis text.

required

Returns:

Type Description
dict[str, float]

Dictionary with keys "entailment", "neutral", "contradiction" mapping to probability scores that sum to ~1.0.

Raises:

Type Description
NotImplementedError

If this operation is not supported by the model type.

compute_similarity(text1: str, text2: str) -> float

Compute similarity between two texts.

Default implementation using cosine similarity of embeddings. Can be overridden for specialized similarity computation.

Parameters:

Name Type Description Default
text1 str

First text.

required
text2 str

Second text.

required

Returns:

Type Description
float

Similarity score in [-1, 1] (cosine similarity).

Raises:

Type Description
NotImplementedError

If embeddings are not supported by the model type.
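
Examples:

The default is equivalent to cosine similarity over the two embedding vectors; a minimal sketch of that computation (cosine is an illustrative helper, not a library function):

>>> import numpy as np
>>> def cosine(a, b):
...     # Cosine similarity: dot product over the product of norms
...     return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))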

get_nli_label(premise: str, hypothesis: str) -> str

Get predicted NLI label (max score).

Default implementation using argmax over compute_nli() scores.

Parameters:

Name Type Description Default
premise str

Premise text.

required
hypothesis str

Hypothesis text.

required

Returns:

Type Description
str

Predicted label: "entailment", "neutral", or "contradiction".

Raises:

Type Description
NotImplementedError

If NLI is not supported by the model type.
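
Examples:

The default behavior is an argmax over the compute_nli() distribution; a minimal sketch:

>>> scores = {"entailment": 0.9, "neutral": 0.08, "contradiction": 0.02}
>>> max(scores, key=scores.get)
'entailment'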

huggingface

HuggingFace model adapters for language models and NLI.

This module provides adapters for HuggingFace Transformers models:

- HuggingFaceLanguageModel: Causal LMs (GPT-2, GPT-Neo, Llama, Mistral)
- HuggingFaceMaskedLanguageModel: Masked LMs (BERT, RoBERTa, ALBERT)
- HuggingFaceNLI: NLI models (RoBERTa-MNLI, DeBERTa-MNLI, BART-MNLI)

HuggingFaceLanguageModel

Bases: HuggingFaceAdapterMixin, ModelAdapter

Adapter for HuggingFace causal language models.

Supports models like GPT-2, GPT-Neo, Llama, Mistral, and other autoregressive (left-to-right) language models.

Parameters:

Name Type Description Default
model_name str

HuggingFace model identifier (e.g., "gpt2", "gpt2-medium").

required
cache ModelOutputCache

Cache instance for storing model outputs.

required
device ('cpu', 'cuda', 'mps')

Device to run model on. Falls back to CPU if device unavailable.

"cpu"
model_version str

Version string for cache tracking.

'unknown'

Examples:

>>> from pathlib import Path
>>> from bead.items.cache import ModelOutputCache
>>> cache = ModelOutputCache(cache_dir=Path(".cache"))
>>> model = HuggingFaceLanguageModel("gpt2", cache, device="cpu")
>>> log_prob = model.compute_log_probability("The cat sat on the mat.")
>>> perplexity = model.compute_perplexity("The cat sat on the mat.")
>>> embedding = model.get_embedding("The cat sat on the mat.")

model: PreTrainedModel property

Get the model, loading if necessary.

tokenizer: PreTrainedTokenizerBase property

Get the tokenizer, loading if necessary.

compute_log_probability(text: str) -> float

Compute log probability of text under language model.

Uses the model's loss with labels=input_ids to compute the negative log-likelihood of the text.

Parameters:

Name Type Description Default
text str

Text to compute log probability for.

required

Returns:

Type Description
float

Log probability of the text.
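
Examples:

A minimal sketch of the underlying computation with the standard transformers API (the adapter's actual implementation may differ; note the model's loss is the mean negative log-likelihood over the shifted label positions):

>>> import torch
>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> tok = AutoTokenizer.from_pretrained("gpt2")
>>> lm = AutoModelForCausalLM.from_pretrained("gpt2")
>>> enc = tok("The cat sat on the mat.", return_tensors="pt")
>>> with torch.no_grad():
...     loss = lm(**enc, labels=enc["input_ids"]).loss
>>> num_scored = enc["input_ids"].shape[1] - 1  # causal shift drops one position
>>> log_prob = -loss.item() * num_scored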

compute_log_probability_batch(texts: list[str], batch_size: int | None = None) -> list[float]

Compute log probabilities for multiple texts efficiently.

Uses batched tokenization and inference for significant speedup. Checks cache before computing, only processes uncached texts.

Parameters:

Name Type Description Default
texts list[str]

Texts to compute log probabilities for.

required
batch_size int | None

Number of texts to process in each batch. If None, automatically infers optimal batch size based on available device memory and model size.

None

Returns:

Type Description
list[float]

Log probabilities for each text, in the same order as input.

Examples:

>>> texts = ["The cat sat.", "The dog ran.", "The bird flew."]
>>> log_probs = model.compute_log_probability_batch(texts)
>>> len(log_probs) == len(texts)
True

compute_perplexity(text: str) -> float

Compute perplexity of text.

Perplexity is exp(average negative log-likelihood per token).

Parameters:

Name Type Description Default
text str

Text to compute perplexity for.

required

Returns:

Type Description
float

Perplexity of the text (positive value).
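
Examples:

A worked example of the relation between a summed log probability and perplexity:

>>> import math
>>> log_prob, num_tokens = -12.0, 6
>>> math.exp(-log_prob / num_tokens)
7.38905609893065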

get_embedding(text: str) -> np.ndarray

Get embedding vector for text.

Uses mean pooling of last hidden states as the text embedding.

Parameters:

Name Type Description Default
text str

Text to embed.

required

Returns:

Type Description
ndarray

Embedding vector for the text.

compute_nli(premise: str, hypothesis: str) -> dict[str, float]

Compute natural language inference scores.

Not supported for causal language models.

Raises:

Type Description
NotImplementedError

Always raised, as causal LMs don't support NLI directly.

HuggingFaceMaskedLanguageModel

Bases: HuggingFaceAdapterMixin, ModelAdapter

Adapter for HuggingFace masked language models.

Supports models like BERT, RoBERTa, ALBERT, and other masked language models (MLMs).

Parameters:

Name Type Description Default
model_name str

HuggingFace model identifier (e.g., "bert-base-uncased").

required
cache ModelOutputCache

Cache instance for storing model outputs.

required
device ('cpu', 'cuda', 'mps')

Device to run model on. Falls back to CPU if device unavailable.

"cpu"
model_version str

Version string for cache tracking.

'unknown'

Examples:

>>> from pathlib import Path
>>> from bead.items.cache import ModelOutputCache
>>> cache = ModelOutputCache(cache_dir=Path(".cache"))
>>> model = HuggingFaceMaskedLanguageModel("bert-base-uncased", cache)
>>> log_prob = model.compute_log_probability("The cat sat on the mat.")
>>> embedding = model.get_embedding("The cat sat on the mat.")

model: PreTrainedModel property

Get the model, loading if necessary.

tokenizer: PreTrainedTokenizerBase property

Get the tokenizer, loading if necessary.

compute_log_probability(text: str) -> float

Compute log probability of text using pseudo-log-likelihood.

For MLMs, we use pseudo-log-likelihood: mask each token one at a time and sum the log probabilities of predicting each token.

This is computationally expensive, so caching is critical.

Parameters:

Name Type Description Default
text str

Text to compute log probability for.

required

Returns:

Type Description
float

Pseudo-log-probability of the text.
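
Examples:

A minimal sketch of pseudo-log-likelihood with the standard transformers API (illustrative only; the adapter's actual implementation may batch the masked copies):

>>> import torch
>>> from transformers import AutoModelForMaskedLM, AutoTokenizer
>>> tok = AutoTokenizer.from_pretrained("bert-base-uncased")
>>> mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
>>> ids = tok("The cat sat.", return_tensors="pt")["input_ids"][0]
>>> pll = 0.0
>>> for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
...     masked = ids.clone()
...     masked[i] = tok.mask_token_id  # mask one token at a time
...     with torch.no_grad():
...         logits = mlm(masked.unsqueeze(0)).logits[0, i]
...     pll += torch.log_softmax(logits, dim=-1)[ids[i]].item()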

compute_perplexity(text: str) -> float

Compute perplexity based on pseudo-log-likelihood.

Parameters:

Name Type Description Default
text str

Text to compute perplexity for.

required

Returns:

Type Description
float

Perplexity of the text (positive value).

get_embedding(text: str) -> np.ndarray

Get embedding vector for text.

Uses the [CLS] token embedding from the last layer.

Parameters:

Name Type Description Default
text str

Text to embed.

required

Returns:

Type Description
ndarray

Embedding vector for the text.

compute_nli(premise: str, hypothesis: str) -> dict[str, float]

Compute natural language inference scores.

Not supported for masked language models.

Raises:

Type Description
NotImplementedError

Always raised, as MLMs don't support NLI directly.

HuggingFaceNLI

Bases: HuggingFaceAdapterMixin, ModelAdapter

Adapter for HuggingFace NLI models.

Supports NLI models trained on MNLI and similar datasets (e.g., "roberta-large-mnli", "microsoft/deberta-base-mnli").

Parameters:

Name Type Description Default
model_name str

HuggingFace model identifier for NLI model.

required
cache ModelOutputCache

Cache instance for storing model outputs.

required
device ('cpu', 'cuda', 'mps')

Device to run model on. Falls back to CPU if device unavailable.

"cpu"
model_version str

Version string for cache tracking.

'unknown'

Examples:

>>> from pathlib import Path
>>> from bead.items.cache import ModelOutputCache
>>> cache = ModelOutputCache(cache_dir=Path(".cache"))
>>> nli = HuggingFaceNLI("roberta-large-mnli", cache, device="cpu")
>>> scores = nli.compute_nli(
...     premise="Mary loves reading books.",
...     hypothesis="Mary enjoys literature."
... )
>>> label = nli.get_nli_label(
...     premise="Mary loves reading books.",
...     hypothesis="Mary enjoys literature."
... )

model: PreTrainedModel property

Get the model, loading if necessary.

tokenizer: PreTrainedTokenizerBase property

Get the tokenizer, loading if necessary.

compute_log_probability(text: str) -> float

Compute log probability of text.

Not supported for NLI models.

Raises:

Type Description
NotImplementedError

Always raised, as NLI models don't provide log probabilities.

compute_perplexity(text: str) -> float

Compute perplexity of text.

Not supported for NLI models.

Raises:

Type Description
NotImplementedError

Always raised, as NLI models don't provide perplexity.

get_embedding(text: str) -> np.ndarray

Get embedding vector for text.

Uses the model's encoder to get embeddings. Note that NLI models are typically fine-tuned for classification, so embeddings may not be optimal for general similarity tasks.

Parameters:

Name Type Description Default
text str

Text to embed.

required

Returns:

Type Description
ndarray

Embedding vector for the text.

compute_nli(premise: str, hypothesis: str) -> dict[str, float]

Compute natural language inference scores.

Parameters:

Name Type Description Default
premise str

Premise text.

required
hypothesis str

Hypothesis text.

required

Returns:

Type Description
dict[str, float]

Dictionary with keys "entailment", "neutral", "contradiction" mapping to probability scores that sum to ~1.0.

openai

OpenAI API adapter for item construction.

This module provides a ModelAdapter implementation for OpenAI's API, supporting GPT models for various NLP tasks including log probability computation, embeddings, and natural language inference via prompting.

OpenAIAdapter

Bases: ModelAdapter

Adapter for OpenAI API models.

Provides access to OpenAI's GPT models for language model operations, embeddings, and prompted natural language inference.

Parameters:

Name Type Description Default
model_name str

OpenAI model identifier (default: "gpt-3.5-turbo").

'gpt-3.5-turbo'
api_key str | None

OpenAI API key. If None, uses OPENAI_API_KEY environment variable.

None
cache ModelOutputCache | None

Cache for model outputs. If None, creates in-memory cache.

None
model_version str

Model version for cache tracking (default: "latest").

'latest'
embedding_model str

Model to use for embeddings (default: "text-embedding-ada-002").

'text-embedding-ada-002'

Attributes:

Name Type Description
model_name str

OpenAI model identifier (e.g., "gpt-3.5-turbo", "gpt-4").

client OpenAI

OpenAI API client.

embedding_model str

Model to use for embeddings (default: "text-embedding-ada-002").

Raises:

Type Description
ValueError

If no API key is provided and OPENAI_API_KEY is not set.

compute_log_probability(text: str) -> float

Compute log probability of text using OpenAI completions API.

Uses the completions API with logprobs to get token-level log probabilities and sums them to get the total log probability.

Parameters:

Name Type Description Default
text str

Text to compute log probability for.

required

Returns:

Type Description
float

Log probability of the text (sum of token log probabilities).

compute_perplexity(text: str) -> float

Compute perplexity of text.

Perplexity is computed as exp(-log_prob / num_tokens).

Parameters:

Name Type Description Default
text str

Text to compute perplexity for.

required

Returns:

Type Description
float

Perplexity of the text (must be positive).

get_embedding(text: str) -> np.ndarray

Get embedding vector for text using OpenAI embeddings API.

Parameters:

Name Type Description Default
text str

Text to embed.

required

Returns:

Type Description
ndarray

Embedding vector for the text.

compute_nli(premise: str, hypothesis: str) -> dict[str, float]

Compute natural language inference scores via prompting.

Uses chat completions API with a prompt to classify the relationship between premise and hypothesis.

Parameters:

Name Type Description Default
premise str

Premise text.

required
hypothesis str

Hypothesis text.

required

Returns:

Type Description
dict[str, float]

Dictionary with keys "entailment", "neutral", "contradiction" mapping to probability scores.

anthropic

Anthropic API adapter for item construction.

This module provides a ModelAdapter implementation for Anthropic's Claude API, supporting natural language inference via prompting. Note that Claude API does not provide direct access to log probabilities or embeddings.

AnthropicAdapter

Bases: ModelAdapter

Adapter for Anthropic Claude API models.

Provides access to Claude models for prompted natural language inference. Note that Claude API does not support log probability computation or embeddings, so those methods will raise NotImplementedError.

Parameters:

Name Type Description Default
model_name str

Claude model identifier (default: "claude-3-5-sonnet-20241022").

'claude-3-5-sonnet-20241022'
api_key str | None

Anthropic API key. If None, uses ANTHROPIC_API_KEY environment variable.

None
cache ModelOutputCache | None

Cache for model outputs. If None, creates in-memory cache.

None
model_version str

Model version for cache tracking (default: "latest").

'latest'

Attributes:

Name Type Description
model_name str

Claude model identifier (e.g., "claude-3-5-sonnet-20241022").

client Anthropic

Anthropic API client.

Raises:

Type Description
ValueError

If no API key is provided and ANTHROPIC_API_KEY is not set.

compute_log_probability(text: str) -> float

Compute log probability of text.

Not supported by Anthropic API.

Raises:

Type Description
NotImplementedError

Always raised - Claude API does not provide log probabilities.

compute_perplexity(text: str) -> float

Compute perplexity of text.

Not supported by Anthropic API (requires log probabilities).

Raises:

Type Description
NotImplementedError

Always raised - requires log probability support.

get_embedding(text: str) -> np.ndarray

Get embedding vector for text.

Not supported by Anthropic API.

Raises:

Type Description
NotImplementedError

Always raised - Claude API does not provide embeddings.

compute_nli(premise: str, hypothesis: str) -> dict[str, float]

Compute natural language inference scores via prompting.

Uses Claude's messages API with a prompt to classify the relationship between premise and hypothesis.

Parameters:

Name Type Description Default
premise str

Premise text.

required
hypothesis str

Hypothesis text.

required

Returns:

Type Description
dict[str, float]

Dictionary with keys "entailment", "neutral", "contradiction" mapping to probability scores.

google

Google Generative AI adapter for item construction.

This module provides a ModelAdapter implementation for Google's Generative AI models (Gemini), supporting natural language inference via prompting and embeddings. Note that Gemini API does not provide direct access to log probabilities.

GoogleAdapter

Bases: ModelAdapter

Adapter for Google Generative AI models (Gemini).

Provides access to Gemini models for natural language inference and embeddings. Note that Gemini API does not support log probability computation.

Parameters:

Name Type Description Default
model_name str

Gemini model identifier (default: "gemini-pro").

'gemini-pro'
api_key str | None

Google API key. If None, uses GOOGLE_API_KEY environment variable.

None
cache ModelOutputCache | None

Cache for model outputs. If None, creates in-memory cache.

None
model_version str

Model version for cache tracking (default: "latest").

'latest'
embedding_model str

Model to use for embeddings (default: "models/embedding-001").

'models/embedding-001'

Attributes:

Name Type Description
model_name str

Gemini model identifier (e.g., "gemini-pro").

model GenerativeModel

Google Generative AI model instance.

embedding_model str

Model to use for embeddings (default: "models/embedding-001").

Raises:

Type Description
ValueError

If no API key is provided and GOOGLE_API_KEY is not set.

compute_log_probability(text: str) -> float

Compute log probability of text.

Not supported by Google Generative AI API.

Raises:

Type Description
NotImplementedError

Always raised - Gemini API does not provide log probabilities.

compute_perplexity(text: str) -> float

Compute perplexity of text.

Not supported by Google Generative AI API (requires log probabilities).

Raises:

Type Description
NotImplementedError

Always raised - requires log probability support.

get_embedding(text: str) -> np.ndarray

Get embedding vector for text using Google's embedding model.

Parameters:

Name Type Description Default
text str

Text to embed.

required

Returns:

Type Description
ndarray

Embedding vector for the text.

compute_nli(premise: str, hypothesis: str) -> dict[str, float]

Compute natural language inference scores via prompting.

Uses Gemini's generation API with a prompt to classify the relationship between premise and hypothesis.

Parameters:

Name Type Description Default
premise str

Premise text.

required
hypothesis str

Hypothesis text.

required

Returns:

Type Description
dict[str, float]

Dictionary with keys "entailment", "neutral", "contradiction" mapping to probability scores.

togetherai

Together AI adapter for item construction.

This module provides a ModelAdapter implementation for Together AI's API, which provides access to various open-source models. Together AI uses an OpenAI-compatible API, so we use the OpenAI client with a custom base URL.

TogetherAIAdapter

Bases: ModelAdapter

Adapter for Together AI models.

Together AI provides access to various open-source models through an OpenAI-compatible API. This adapter uses the OpenAI client with a custom base URL.

Parameters:

Name Type Description Default
model_name str

Together AI model identifier (default: "meta-llama/Llama-3-70b-chat-hf").

'meta-llama/Llama-3-70b-chat-hf'
api_key str | None

Together AI API key. If None, uses TOGETHER_API_KEY environment variable.

None
cache ModelOutputCache | None

Cache for model outputs. If None, creates in-memory cache.

None
model_version str

Model version for cache tracking (default: "latest").

'latest'

Attributes:

Name Type Description
model_name str

Together AI model identifier (e.g., "meta-llama/Llama-3-70b-chat-hf").

client OpenAI

OpenAI-compatible client configured for Together AI.

Raises:

Type Description
ValueError

If no API key is provided and TOGETHER_API_KEY is not set.

compute_log_probability(text: str) -> float

Compute log probability of text using Together AI API.

Uses the completions API with logprobs to get token-level log probabilities and sums them to get the total log probability.

Parameters:

Name Type Description Default
text str

Text to compute log probability for.

required

Returns:

Type Description
float

Log probability of the text (sum of token log probabilities).

compute_perplexity(text: str) -> float

Compute perplexity of text.

Perplexity is computed as exp(-log_prob / num_tokens).

Parameters:

Name Type Description Default
text str

Text to compute perplexity for.

required

Returns:

Type Description
float

Perplexity of the text (must be positive).

Raises:

Type Description
NotImplementedError

If log probability computation is not supported.

get_embedding(text: str) -> np.ndarray

Get embedding vector for text.

Not supported by Together AI (no embedding-specific models).

Raises:

Type Description
NotImplementedError

Always raised - Together AI does not provide embeddings.

compute_nli(premise: str, hypothesis: str) -> dict[str, float]

Compute natural language inference scores via prompting.

Uses chat completions API with a prompt to classify the relationship between premise and hypothesis.

Parameters:

Name Type Description Default
premise str

Premise text.

required
hypothesis str

Hypothesis text.

required

Returns:

Type Description
dict[str, float]

Dictionary with keys "entailment", "neutral", "contradiction" mapping to probability scores.

sentence_transformers

Sentence transformer adapter for semantic embeddings.

This module provides an adapter for sentence-transformers models, which are optimized for generating sentence embeddings for semantic similarity tasks.

HuggingFaceSentenceTransformer

Bases: ModelAdapter

Adapter for sentence-transformers models.

Supports sentence-transformers models like "all-MiniLM-L6-v2", "all-mpnet-base-v2", etc. These models are optimized for generating sentence embeddings for semantic similarity tasks.

Parameters:

Name Type Description Default
model_name str

Sentence transformer model identifier.

required
cache ModelOutputCache

Cache instance for storing model outputs.

required
device str | None

Device to run model on. If None, uses sentence-transformers default.

None
model_version str

Version string for cache tracking.

'unknown'
normalize_embeddings bool

Whether to normalize embeddings to unit length.

True

Examples:

>>> from pathlib import Path
>>> from bead.items.cache import ModelOutputCache
>>> cache = ModelOutputCache(cache_dir=Path(".cache"))
>>> model = HuggingFaceSentenceTransformer("all-MiniLM-L6-v2", cache)
>>> embedding = model.get_embedding("The cat sat on the mat.")
>>> similarity = model.compute_similarity("The cat sat.", "The dog stood.")

model: SentenceTransformer property

Get the model, loading if necessary.

compute_log_probability(text: str) -> float

Compute log probability of text.

Not supported for sentence transformer models.

Raises:

Type Description
NotImplementedError

Always raised, as sentence transformers don't provide log probabilities.

compute_perplexity(text: str) -> float

Compute perplexity of text.

Not supported for sentence transformer models.

Raises:

Type Description
NotImplementedError

Always raised, as sentence transformers don't provide perplexity.

get_embedding(text: str) -> np.ndarray

Get embedding vector for text.

Uses sentence-transformers encode() method to generate optimized sentence embeddings.

Parameters:

Name Type Description Default
text str

Text to embed.

required

Returns:

Type Description
ndarray

Embedding vector for the text.

compute_nli(premise: str, hypothesis: str) -> dict[str, float]

Compute natural language inference scores.

Not supported for sentence transformer models.

Raises:

Type Description
NotImplementedError

Always raised, as sentence transformers don't support NLI directly.

compute_similarity(text1: str, text2: str) -> float

Compute similarity between two texts.

Uses cosine similarity of embeddings. For sentence transformers, this is optimized as embeddings are already normalized (if normalize_embeddings=True).

Parameters:

Name Type Description Default
text1 str

First text.

required
text2 str

Second text.

required

Returns:

Type Description
float

Similarity score in [-1, 1] (cosine similarity).

registry

Model adapter registry for centralized adapter management.

This module provides a registry for managing all model adapters, both local (HuggingFace) and API-based (OpenAI, Anthropic, etc.).

AdapterKwargs

Bases: TypedDict

Keyword arguments for adapter initialization.

ModelAdapterRegistry

Registry for all model adapters (local and API-based).

Provides centralized management of adapter types and instances, with automatic instance caching to avoid redundant initialization.

Attributes:

Name Type Description
adapters dict[str, type[ModelAdapter]]

Registered adapter classes keyed by adapter type name.

instances dict[str, ModelAdapter]

Cached adapter instances keyed by unique identifier.

register(name: str, adapter_class: type[ModelAdapter]) -> None

Register an adapter class.

Parameters:

Name Type Description Default
name str

Unique name for the adapter type (e.g., "openai", "huggingface_lm").

required
adapter_class type[ModelAdapter]

Adapter class to register (must inherit from ModelAdapter).

required

Raises:

Type Description
ValueError

If adapter class does not inherit from ModelAdapter.

get_adapter(adapter_type: str, model_name: str, **kwargs: Unpack[AdapterKwargs]) -> ModelAdapter

Get or create adapter instance (with caching).

Creates a new adapter instance if not cached, otherwise returns the cached instance. Instances are cached by adapter type and model name.

Parameters:

Name Type Description Default
adapter_type str

Type of adapter (must be registered).

required
model_name str

Model identifier for the adapter.

required
**kwargs Unpack[AdapterKwargs]

Additional keyword arguments to pass to adapter constructor (api_key, device, model_version, embedding_model, etc.).

{}

Returns:

Type Description
ModelAdapter

Adapter instance (cached or newly created).

Raises:

Type Description
ValueError

If adapter type is not registered.

Examples:

>>> registry = ModelAdapterRegistry()
>>> registry.register("openai", OpenAIAdapter)
>>> adapter = registry.get_adapter("openai", "gpt-4", api_key="...")
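
Repeated requests with the same adapter type and model name return the cached instance, as documented above:

>>> adapter2 = registry.get_adapter("openai", "gpt-4", api_key="...")
>>> adapter is adapter2
True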

clear_cache() -> None

Clear all cached adapter instances.

Useful for testing or when you want to force recreation of adapters with different parameters.

list_adapters() -> list[str]

List all registered adapter types.

Returns:

Type Description
list[str]

List of registered adapter type names.

api_utils

Utilities for API-based model adapters.

This module provides shared utilities for API-based model adapters, including retry logic with exponential backoff and rate limiting.

RateLimiter

Rate limiter for API calls.

Tracks call timestamps and enforces a maximum rate of calls per minute. Uses a sliding window algorithm to ensure the rate limit is respected.

Parameters:

Name Type Description Default
calls_per_minute int

Maximum number of calls allowed per minute (default: 60).

60

Attributes:

Name Type Description
calls_per_minute int

Maximum number of calls allowed per minute.

call_times list[float]

Timestamps of recent API calls.

wait_if_needed() -> None

Wait if rate limit would be exceeded.

Checks if making a call now would exceed the rate limit. If so, sleeps until enough time has passed.
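
Examples:

A minimal sketch of the sliding-window idea (illustrative only, not the library's implementation):

>>> import time
>>> def wait_if_needed(call_times, calls_per_minute):
...     now = time.monotonic()
...     # Keep only timestamps within the last 60-second window
...     call_times[:] = [t for t in call_times if now - t < 60.0]
...     if len(call_times) >= calls_per_minute:
...         time.sleep(60.0 - (now - call_times[0]))  # wait for oldest call to expire
...     call_times.append(time.monotonic())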

retry_with_backoff(max_retries: int = 3, initial_delay: float = 1.0, backoff_factor: float = 2.0, exceptions: tuple[type[Exception], ...] = (Exception,)) -> Callable[[Callable[..., T]], Callable[..., T]]

Decorate function with retry logic and exponential backoff.

Retries a function call on specified exceptions with exponential backoff between attempts. The delay between retries grows exponentially: delay = initial_delay * (backoff_factor ** attempt).

Parameters:

Name Type Description Default
max_retries int

Maximum number of retry attempts (default: 3).

3
initial_delay float

Initial delay in seconds before first retry (default: 1.0).

1.0
backoff_factor float

Multiplicative factor for delay between retries (default: 2.0).

2.0
exceptions tuple[type[Exception], ...]

Tuple of exception types to catch and retry on (default: (Exception,)).

(Exception,)

Returns:

Type Description
Callable

Decorated function with retry logic.

Examples:

>>> @retry_with_backoff(max_retries=3, initial_delay=1.0)
... def call_api():
...     # May raise transient errors
...     return api.get_data()
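
With the defaults above, the waits between successive retries grow as initial_delay * backoff_factor ** attempt:

>>> [1.0 * 2.0 ** attempt for attempt in range(3)]
[1.0, 2.0, 4.0]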

rate_limit(calls_per_minute: int = 60) -> Callable[[Callable[P, T]], Callable[P, T]]

Decorate function with rate limiting for API calls.

Enforces a maximum rate of API calls per minute using a shared RateLimiter instance. Calls that would exceed the rate limit will block until the limit resets.

Parameters:

Name Type Description Default
calls_per_minute int

Maximum number of calls allowed per minute (default: 60).

60

Returns:

Type Description
Callable

Decorated function with rate limiting.

Examples:

>>> @rate_limit(calls_per_minute=30)
... def call_api():
...     return api.get_data()