bead.items¶
Stage 3 of the bead pipeline: experimental item construction with 9 task types.
Core Classes¶
item
¶
Data models for constructed experimental items.
UnfilledSlot
¶
Bases: BeadBaseModel
An unfilled slot in a cloze task item.
Represents a slot in a partially filled template where the participant must provide a response. The UI widget for collecting the response is inferred from the slot's constraints at deployment time.
Attributes:

| Name | Type | Description |
|---|---|---|
| slot_name | str | Name of the unfilled template slot. |
| position | int | Token index position in the rendered text. |
| constraint_ids | list[UUID] | UUIDs of constraints that apply to this slot. |
Examples:
>>> from uuid import UUID
>>> # Extensional constraint slot (will render as dropdown)
>>> UnfilledSlot(
... slot_name="determiner",
... position=0,
... constraint_ids=[UUID("12345678-1234-5678-1234-567812345678")]
... )
>>> # Unconstrained slot (will render as text input)
>>> UnfilledSlot(
... slot_name="adjective",
... position=2,
... constraint_ids=[]
... )
validate_slot_name(v: str) -> str
classmethod
¶
Validate slot name is not empty.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| v | str | Slot name to validate. | required |

Returns:

| Type | Description |
|---|---|
| str | Validated slot name. |

Raises:

| Type | Description |
|---|---|
| ValueError | If slot name is empty or contains only whitespace. |
ModelOutput
¶
Bases: BeadBaseModel
Output from a model computation.
Attributes:

| Name | Type | Description |
|---|---|---|
| model_name | str | Name/identifier of the model. |
| model_version | str | Version of the model. |
| operation | str | Operation performed (e.g., "log_probability", "nli", "embedding"). |
| inputs | dict[str, MetadataValue] | Inputs to the model. |
| output | MetadataValue | Model output. |
| cache_key | str | Cache key for this computation. |
| computation_metadata | dict[str, MetadataValue] | Metadata about the computation (timestamp, device, etc.). |
Examples:
>>> output = ModelOutput(
... model_name="gpt2",
... model_version="latest",
... operation="log_probability",
... inputs={"text": "The cat broke the vase"},
... output=-12.4,
... cache_key="abc123..."
... )
validate_non_empty_strings(v: str) -> str
classmethod
¶
Validate required string fields are not empty.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| v | str | String value to validate. | required |

Returns:

| Type | Description |
|---|---|
| str | Validated string. |

Raises:

| Type | Description |
|---|---|
| ValueError | If string is empty or contains only whitespace. |
Item
¶
Bases: BeadBaseModel
A constructed experimental item.
Items are discrete stimuli presented to participants or models for judgment collection. They are constructed from item templates and filled templates.
Attributes:

| Name | Type | Description |
|---|---|---|
| item_template_id | UUID | UUID of the item template this was constructed from. |
| filled_template_refs | list[UUID] | UUIDs of filled templates used in this item. |
| rendered_elements | dict[str, str] | Rendered text for each element (by element_name). |
| options | list[str] | Choice options for forced_choice/multi_select tasks. Each string is one option text. Order matters (first option is displayed first). |
| unfilled_slots | list[UnfilledSlot] | Unfilled slots for cloze tasks (UI widgets inferred from constraints). |
| model_outputs | list[ModelOutput] | All model computations for this item. |
| constraint_satisfaction | dict[UUID, bool] | Constraint UUIDs mapped to satisfaction status. |
| item_metadata | dict[str, MetadataValue] | Additional metadata for this item. |
| spans | list[Span] | Span annotations for this item (default: empty). |
| span_relations | list[SpanRelation] | Relations between spans, directed or undirected (default: empty). |
| tokenized_elements | dict[str, list[str]] | Tokenized text for span indexing, keyed by element name (default: empty). |
| token_space_after | dict[str, list[bool]] | Per-token space_after flags for artifact-free rendering (default: empty). |
Examples:
>>> # Simple item
>>> item = Item(
... item_template_id=UUID("..."),
... filled_template_refs=[UUID("...")],
... rendered_elements={"sentence": "The cat broke the vase"}
... )
>>> # Forced-choice item with options
>>> fc_item = Item(
... item_template_id=UUID("..."),
... options=["The cat sat on the mat.", "The cats sat on the mat."],
... item_metadata={"n_options": 2}
... )
>>> # Cloze item with unfilled slots
>>> cloze_item = Item(
... item_template_id=UUID("..."),
... rendered_elements={"sentence": "The ___ cat ___ the ___"},
... unfilled_slots=[
... UnfilledSlot(slot_name="determiner", position=0, constraint_ids=[...]),
... UnfilledSlot(slot_name="verb", position=2, constraint_ids=[...])
... ]
... )
validate_span_relations() -> Item
¶
Validate all span_relations reference valid span_ids from spans.
Returns:

| Type | Description |
|---|---|
| Item | Validated item. |

Raises:

| Type | Description |
|---|---|
| ValueError | If a relation references a span_id not present in spans. |
get_model_output(model_name: str, operation: str, inputs: dict[str, MetadataValue] | None = None) -> ModelOutput | None
¶
Get a specific model output.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | Name of the model. | required |
| operation | str | Operation type. | required |
| inputs | dict[str, MetadataValue] \| None | Optional input filter. | None |

Returns:

| Type | Description |
|---|---|
| ModelOutput \| None | The model output if found, None otherwise. |
Examples:
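The matching rule (name plus operation, with an optional input filter) can be sketched with plain dictionaries standing in for ModelOutput instances; `find_output` is a hypothetical stand-in, not the library API:

```python
# Minimal sketch of get_model_output's lookup semantics, using plain
# dicts as hypothetical stand-ins for ModelOutput instances.
def find_output(outputs, model_name, operation, inputs=None):
    for out in outputs:
        if out["model_name"] != model_name or out["operation"] != operation:
            continue
        # When an input filter is given, every key/value pair must match.
        if inputs is not None and any(
            out["inputs"].get(k) != v for k, v in inputs.items()
        ):
            continue
        return out
    return None  # no matching computation recorded

outputs = [
    {"model_name": "gpt2", "operation": "log_probability",
     "inputs": {"text": "The cat broke the vase"}, "output": -12.4},
]
hit = find_output(outputs, "gpt2", "log_probability",
                  inputs={"text": "The cat broke the vase"})
miss = find_output(outputs, "gpt2", "nli")
```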
add_model_output(output: ModelOutput) -> None
¶
Add a model output to this item.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| output | ModelOutput | Model output to add. | required |
Examples:
ItemCollection
¶
Bases: BeadBaseModel
A collection of constructed items.
Attributes:

| Name | Type | Description |
|---|---|---|
| name | str | Name of this collection. |
| source_template_collection_id | UUID | UUID of the source item template collection. |
| source_filled_collection_id | UUID | UUID of the source filled template collection. |
| items | list[Item] | The constructed items. |
| construction_stats | dict[str, int] | Statistics about item construction. |
Examples:
>>> collection = ItemCollection(
... name="acceptability_items",
... source_template_collection_id=UUID("..."),
... source_filled_collection_id=UUID("...")
... )
>>> collection.add_item(item)
validate_name(v: str) -> str
classmethod
¶
Validate collection name is not empty.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| v | str | Collection name to validate. | required |

Returns:

| Type | Description |
|---|---|
| str | Validated collection name. |

Raises:

| Type | Description |
|---|---|
| ValueError | If name is empty or contains only whitespace. |
item_template
¶
Data models for experimental item templates.
ChunkingSpec
¶
Bases: BeadBaseModel
Specification for text segmentation in incremental presentation.
Defines how to segment text for self-paced reading or timed sequence presentation. Supports character-level, word-level, sentence-level, constituent-based (with parsing), or custom boundary segmentation.
Attributes:

| Name | Type | Description |
|---|---|---|
| unit | ChunkingUnit | Segmentation unit type. Defaults to "word". |
| parse_type | ParseType \| None | Type of parsing for constituent chunking ("constituency" or "dependency"). |
| constituent_labels | list[str] \| None | Labels for constituent chunking. For constituency parsing, these are constituent types (e.g., ["NP", "VP", "S"]). For dependency parsing, these are dependency relations (e.g., ["nsubj", "dobj", "root"]). |
| parser | Literal['stanza', 'spacy'] \| None | Parser library to use for constituent chunking. |
| parse_language | str \| None | ISO 639 language code for parser (e.g., "en", "es", "zh"). |
| custom_boundaries | list[int] \| None | Token indices for custom chunking boundaries. |
Examples:
>>> # Word-by-word chunking
>>> ChunkingSpec(unit="word")
>>> # Chunk by noun phrases (constituency)
>>> ChunkingSpec(
... unit="constituent",
... parse_type="constituency",
... constituent_labels=["NP"],
... parser="stanza",
... parse_language="en"
... )
>>> # Chunk by subjects and objects (dependency)
>>> ChunkingSpec(
... unit="constituent",
... parse_type="dependency",
... constituent_labels=["nsubj", "dobj"],
... parser="spacy",
... parse_language="en"
... )
>>> # Custom boundaries at specific token positions
>>> ChunkingSpec(unit="custom", custom_boundaries=[0, 3, 7, 10])
TimingParams
¶
Bases: BeadBaseModel
Timing parameters for stimulus presentation.
Defines timing constraints for timed sequence presentations, including per-chunk duration, inter-stimulus intervals, and response timeouts.
Attributes:

| Name | Type | Description |
|---|---|---|
| duration_ms | int \| None | Duration in milliseconds to display each chunk (for timed sequences). |
| isi_ms | int \| None | Inter-stimulus interval in milliseconds between chunks. |
| timeout_ms | int \| None | Maximum time in milliseconds to wait for response. |
| mask_char | str \| None | Character to use for masking non-current chunks (e.g., "_"). |
| cumulative | bool | If True, show all previous chunks; if False, show only current chunk. |
Examples:
>>> # RSVP (Rapid Serial Visual Presentation)
>>> TimingParams(
... duration_ms=250,
... isi_ms=50,
... cumulative=False,
... mask_char="_"
... )
>>> # Self-paced with timeout
>>> TimingParams(timeout_ms=5000, cumulative=True)
TaskSpec
¶
Bases: BeadBaseModel
Parameters for the response collection task.
Specifies task-specific parameters like prompts, options, scale bounds, validation rules, etc. The appropriate parameters depend on the task_type specified in ItemTemplate. The task_type itself is not included here since it's part of the ItemTemplate structure.
Attributes:

| Name | Type | Description |
|---|---|---|
| prompt | str | Question or instruction shown to participants. |
| scale_bounds | tuple[int, int] \| None | Min and max values for ordinal_scale task. |
| scale_labels | dict[int, str] \| None | Optional labels for specific scale points (ordinal_scale). |
| options | list[str] \| None | Available options for forced_choice, multi_select, or categorical tasks. For forced_choice/multi_select: element names to choose from. For categorical: category labels. |
| min_selections | int \| None | Minimum number of selections required (multi_select only). |
| max_selections | int \| None | Maximum number of selections allowed (multi_select only). |
| text_validation_pattern | str \| None | Regular expression pattern for validating free_text responses. |
| max_length | int \| None | Maximum character length for free_text responses. |
| span_spec | SpanSpec \| None | Span labeling specification (for span_labeling tasks or composite tasks with span overlays). |
Examples:
>>> # Ordinal scale task (e.g., acceptability rating)
>>> TaskSpec(
... prompt="How natural does this sentence sound?",
... scale_bounds=(1, 7),
... scale_labels={1: "Very unnatural", 7: "Very natural"}
... )
>>> # Categorical task (e.g., NLI)
>>> TaskSpec(
... prompt="What is the relationship?",
... options=["Entailment", "Neutral", "Contradiction"]
... )
>>> # Binary task
>>> TaskSpec(
... prompt="Is this sentence grammatical?"
... )
>>> # Forced choice task (e.g., minimal pair)
>>> TaskSpec(
... prompt="Which sounds more natural?",
... options=["sentence_a", "sentence_b"]
... )
>>> # Multi-select task (e.g., select all grammatical)
>>> TaskSpec(
... prompt="Select all grammatical sentences:",
... options=["sent_a", "sent_b", "sent_c"],
... min_selections=1
... )
>>> # Free text task
>>> TaskSpec(
... prompt="Who performed the action?",
... max_length=50
... )
validate_prompt(v: str) -> str
classmethod
¶
Validate prompt is not empty.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| v | str | Prompt to validate. | required |

Returns:

| Type | Description |
|---|---|
| str | Validated prompt. |

Raises:

| Type | Description |
|---|---|
| ValueError | If prompt is empty or contains only whitespace. |
PresentationSpec
¶
Bases: BeadBaseModel
Specification of stimulus presentation method.
Defines how stimuli are displayed to participants (static, self-paced, or timed sequence), including segmentation and timing parameters. Separate from judgment specification to maintain clean separation of concerns.
Attributes:

| Name | Type | Description |
|---|---|---|
| mode | PresentationMode | Presentation mode (static, self_paced, or timed_sequence). Defaults to "static". |
| chunking | ChunkingSpec | Chunking specification for incremental presentations. Defaults to word-level chunking. |
| timing | TimingParams | Timing parameters for timed presentations. Defaults to cumulative display with no fixed durations. |
| display_format | dict[str, str \| int \| float \| bool] | Additional display formatting options. |
| tokenizer_config | TokenizerConfig \| None | Display tokenizer configuration for span annotation. When set, controls how text is tokenized for span indexing and display. |
Examples:
>>> # Static presentation (default)
>>> PresentationSpec()
>>> # Self-paced word-by-word reading
>>> PresentationSpec(
... mode="self_paced",
... chunking=ChunkingSpec(unit="word")
... )
>>> # Self-paced by noun phrases
>>> PresentationSpec(
... mode="self_paced",
... chunking=ChunkingSpec(
... unit="constituent",
... parse_type="constituency",
... constituent_labels=["NP"],
... parser="stanza",
... parse_language="en"
... )
... )
>>> # RSVP (timed sequence)
>>> PresentationSpec(
... mode="timed_sequence",
... chunking=ChunkingSpec(unit="word"),
... timing=TimingParams(duration_ms=250, isi_ms=50, cumulative=False)
... )
ItemElement
¶
Bases: BeadBaseModel
A structured element within an item template.
ItemElements represent distinct parts of a complex item, such as context, target sentence, question, or response options. Elements can be static text or references to filled templates.
Attributes:

| Name | Type | Description |
|---|---|---|
| element_type | ElementRefType | Type of element ("text" or "filled_template_ref"). |
| element_name | str | Unique name for this element within the item. |
| content | str \| None | Static text content (for text elements). |
| filled_template_ref_id | UUID \| None | UUID of filled template (for reference elements). |
| element_metadata | dict[str, MetadataValue] | Additional element-specific metadata. |
| order | int \| None | Display order for this element (optional). |
Examples:
>>> # Text element
>>> context = ItemElement(
... element_type="text",
... element_name="context",
... content="Mary loves books.",
... order=1
... )
>>> # Template reference element
>>> target = ItemElement(
... element_type="filled_template_ref",
... element_name="target",
... filled_template_ref_id=UUID("..."),
... order=2
... )
is_text: bool
property
¶
Check if this is a text element.
Returns:

| Type | Description |
|---|---|
| bool | True if element_type is "text". |
is_template_ref: bool
property
¶
Check if this references a filled template.
Returns:

| Type | Description |
|---|---|
| bool | True if element_type is "filled_template_ref". |
validate_element_name(v: str) -> str
classmethod
¶
Validate element name is not empty.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| v | str | Element name to validate. | required |

Returns:

| Type | Description |
|---|---|
| str | Validated element name. |

Raises:

| Type | Description |
|---|---|
| ValueError | If name is empty or contains only whitespace. |
ItemTemplate
¶
Bases: BeadBaseModel
Template specification for constructing experimental items.
ItemTemplate defines how to construct an experimental item with three orthogonal dimensions: what semantic property to measure (judgment_type), how to collect the response (task_type), and how to present the stimulus (presentation_spec).
This is distinct from Template (in bead.resources.structures), which defines linguistic structure. ItemTemplate defines experimental structure.
Attributes:

| Name | Type | Description |
|---|---|---|
| name | str | Template name (e.g., "acceptability_rating"). |
| description | str \| None | Human-readable description of this item template. |
| judgment_type | JudgmentType | Semantic property being measured (acceptability, inference, etc.). |
| task_type | TaskType | Response collection method (forced_choice, ordinal_scale, etc.). |
| elements | list[ItemElement] | Elements that compose this item. |
| constraints | list[UUID] | UUIDs of constraints on items (typically model-based). |
| task_spec | TaskSpec | Task-specific parameters (prompt, options, scale bounds, etc.). |
| presentation_spec | PresentationSpec | Specification of how to present stimuli. |
| presentation_order | list[str] \| None | Order to present elements (by element_name). |
| template_metadata | dict[str, MetadataValue] | Additional template metadata. |
Examples:
>>> # Acceptability judgment with ordinal scale task
>>> template = ItemTemplate(
... name="acceptability_rating",
... judgment_type="acceptability",
... task_type="ordinal_scale",
... task_spec=TaskSpec(
... prompt="How natural is this sentence?",
... scale_bounds=(1, 7),
... scale_labels={1: "Very unnatural", 7: "Very natural"}
... ),
... presentation_spec=PresentationSpec(mode="static"),
... elements=[
... ItemElement(
... element_type="filled_template_ref",
... element_name="sentence",
... filled_template_ref_id=UUID("...")
... )
... ]
... )
>>> # Minimal pair: acceptability judgment with forced choice task
>>> minimal_pair = ItemTemplate(
... name="minimal_pair",
... judgment_type="acceptability",
... task_type="forced_choice",
... elements=[
... ItemElement(
... element_type="text", element_name="sent_a", content="Who..."
... ),
... ItemElement(
... element_type="text", element_name="sent_b", content="Whom..."
... )
... ],
... task_spec=TaskSpec(
... prompt="Which sounds more natural?",
... options=["sent_a", "sent_b"]
... ),
... presentation_spec=PresentationSpec(mode="static")
... )
>>> # Odd-man-out: similarity judgment with forced choice task
>>> odd_man_out = ItemTemplate(
... name="odd_man_out",
... judgment_type="similarity",
... task_type="forced_choice",
... elements=[...], # 4 elements
... task_spec=TaskSpec(
... prompt="Which is most different?",
... options=["opt_a", "opt_b", "opt_c", "opt_d"]
... ),
... presentation_spec=PresentationSpec(mode="static")
... )
validate_name(v: str) -> str
classmethod
¶
Validate template name is not empty.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| v | str | Template name to validate. | required |

Returns:

| Type | Description |
|---|---|
| str | Validated template name. |

Raises:

| Type | Description |
|---|---|
| ValueError | If name is empty or contains only whitespace. |
validate_unique_element_names(v: list[ItemElement]) -> list[ItemElement]
classmethod
¶
Validate all element names are unique within template.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| v | list[ItemElement] | List of elements to validate. | required |

Returns:

| Type | Description |
|---|---|
| list[ItemElement] | Validated elements. |

Raises:

| Type | Description |
|---|---|
| ValueError | If duplicate element names found. |
validate_presentation_order(v: list[str] | None, info: ValidationInfo) -> list[str] | None
classmethod
¶
Validate presentation_order matches element names.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| v | list[str] \| None | Presentation order list to validate. | required |
| info | ValidationInfo | Pydantic validation info containing other field values. | required |

Returns:

| Type | Description |
|---|---|
| list[str] \| None | Validated presentation order. |

Raises:

| Type | Description |
|---|---|
| ValueError | If presentation_order contains names not in elements, or is missing names from elements. |
get_element_by_name(name: str) -> ItemElement | None
¶
Get an element by its name.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Element name to search for. | required |

Returns:

| Type | Description |
|---|---|
| ItemElement \| None | Element with matching name, or None if not found. |
Examples:
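The lookup is a first-match linear search by element name. A runnable sketch with plain dicts standing in for ItemElement (the stand-alone `get_element_by_name` function here is illustrative, not the method itself):

```python
# Sketch of the name lookup over a template's element list, using plain
# dicts as hypothetical stand-ins for ItemElement.
def get_element_by_name(elements, name):
    return next((e for e in elements if e["element_name"] == name), None)

elements = [
    {"element_name": "context", "element_type": "text"},
    {"element_name": "target", "element_type": "filled_template_ref"},
]
target = get_element_by_name(elements, "target")
missing = get_element_by_name(elements, "question")
```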
get_template_ref_elements() -> list[ItemElement]
¶
Get all elements that reference filled templates.
Returns:

| Type | Description |
|---|---|
| list[ItemElement] | Elements with element_type="filled_template_ref". |
Examples:
ItemTemplateCollection
¶
Bases: BeadBaseModel
A collection of item templates.
Attributes:

| Name | Type | Description |
|---|---|---|
| name | str | Name of this collection. |
| description | str \| None | Description of this collection. |
| templates | list[ItemTemplate] | Item templates in this collection. |
Examples:
>>> collection = ItemTemplateCollection(
... name="acceptability_study",
... description="Templates for acceptability judgments"
... )
>>> collection.add_template(template)
validate_name(v: str) -> str
classmethod
¶
Validate collection name is not empty.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| v | str | Collection name to validate. | required |

Returns:

| Type | Description |
|---|---|
| str | Validated collection name. |

Raises:

| Type | Description |
|---|---|
| ValueError | If name is empty or contains only whitespace. |
add_template(template: ItemTemplate) -> None
¶
Add a template to the collection.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| template | ItemTemplate | Template to add. | required |
Examples:
Task-Type Utilities¶
forced_choice
¶
Utilities for creating N-AFC (forced-choice) experimental items.
This module provides language-agnostic utilities for creating forced-choice items where participants select from N alternatives (2AFC, 3AFC, 4AFC, etc.).
create_forced_choice_item(*options: str, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create an N-AFC (forced-choice) item from N text options.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| *options | str | Text for each option (2 or more required). | () |
| item_template_id | UUID \| None | Template ID for the item. If None, generates new UUID. | None |
| metadata | dict[str, MetadataValue] \| None | Additional metadata for item_metadata field. | None |

Returns:

| Type | Description |
|---|---|
| Item | Forced-choice item with options stored in the options field. |

Raises:

| Type | Description |
|---|---|
| ValueError | If fewer than 2 options provided. |
Examples:
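The documented contract (at least two options, order preserved) can be sketched without the library; `build_forced_choice` is a hypothetical stand-in returning a plain dict rather than an Item:

```python
# Sketch of create_forced_choice_item's contract: require >= 2 options
# and preserve their order. Hypothetical stand-in, not the library API.
def build_forced_choice(*options):
    if len(options) < 2:
        raise ValueError("forced choice requires at least 2 options")
    return {"options": list(options)}

item = build_forced_choice("She walks.", "She walk.")
try:
    build_forced_choice("only one")
    raised = False
except ValueError:
    raised = True
```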
create_forced_choice_items_from_groups(items: list[Item], group_by: Callable[[Item], Any], n_alternatives: int = 2, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]
¶
Create forced-choice items by grouping source items.
Groups items by a property, then creates all N-way combinations within each group as forced-choice items.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| items | list[Item] | Source items to group and combine. | required |
| group_by | Callable[[Item], Any] | Function to extract grouping key from items. | required |
| n_alternatives | int | Number of alternatives per forced-choice item (default: 2 for 2AFC). | 2 |
| extract_text | Callable[[Item], str] \| None | Function to extract text from item. If None, tries common keys ("text", "sentence", "content") from rendered_elements. | None |
| include_group_metadata | bool | Whether to include group key in item metadata. | True |
| item_template_id | UUID \| None | Template ID for all created items. If None, generates one per item. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Forced-choice items created from groupings. |
Examples:
Create 2AFC items with same verb (same-verb minimal pairs):
>>> items = [
... Item(
... item_template_id=uuid4(),
... rendered_elements={"text": "She walks."},
... item_metadata={"verb": "walk", "frame": "intransitive"}
... ),
... Item(
... item_template_id=uuid4(),
... rendered_elements={"text": "She walks the dog."},
... item_metadata={"verb": "walk", "frame": "transitive"}
... )
... ]
>>> fc_items = create_forced_choice_items_from_groups(
... items,
... group_by=lambda item: item.item_metadata["verb"],
... n_alternatives=2
... )
>>> len(fc_items)
1
>>> fc_items[0].rendered_elements["option_a"]
'She walks.'
Create 3AFC items grouped by template:
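The bucketing-plus-combinations step that underlies these examples (group by a key, then take all N-way combinations within each bucket) can be sketched with the stdlib; plain dicts stand in for Item, and `group_combinations` is a hypothetical helper, not the library function:

```python
from collections import defaultdict
from itertools import combinations

def group_combinations(items, group_by, n):
    # Bucket items by key, then emit every n-way combination per bucket.
    groups = defaultdict(list)
    for item in items:
        groups[group_by(item)].append(item)
    return [combo for members in groups.values()
            for combo in combinations(members, n)]

items = [
    {"text": "She walks.", "template": "intransitive"},
    {"text": "She sleeps.", "template": "intransitive"},
    {"text": "She runs.", "template": "intransitive"},
    {"text": "She sees him.", "template": "transitive"},
]
# Only the "intransitive" bucket has >= 3 members, so one 3-way combo.
triples = group_combinations(items, lambda i: i["template"], 3)
```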
create_forced_choice_items_cross_product(group1_items: list[Item], group2_items: list[Item], n_from_group1: int = 1, n_from_group2: int = 1, *, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None, metadata_fn: Callable[[list[Item], list[Item]], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create forced-choice items from cross-product of two groups.
Combines n_from_group1 items from group1 with n_from_group2 items from group2 to create (n_from_group1 + n_from_group2)-AFC items.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| group1_items | list[Item] | Items in first group. | required |
| group2_items | list[Item] | Items in second group. | required |
| n_from_group1 | int | Number of items to select from group1 per combination (default: 1). | 1 |
| n_from_group2 | int | Number of items to select from group2 per combination (default: 1). | 1 |
| extract_text | Callable[[Item], str] \| None | Function to extract text from items. | None |
| item_template_id | UUID \| None | Template ID for all created items. | None |
| metadata_fn | Callable[[list[Item], list[Item]], dict[str, MetadataValue]] \| None | Function to generate metadata from (group1_items_used, group2_items_used). | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Forced-choice items from cross-product. |
Examples:
Create 2AFC items pairing grammatical with ungrammatical:
>>> grammatical = [
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "She walks."},
...         item_metadata={"grammatical": True}
...     )
... ]
>>> ungrammatical = [
...     Item(
...         item_template_id=uuid4(),
...         rendered_elements={"text": "She walk."},
...         item_metadata={"grammatical": False}
...     )
... ]
>>> fc_items = create_forced_choice_items_cross_product(
... grammatical,
... ungrammatical,
... n_from_group1=1,
... n_from_group2=1
... )
>>> len(fc_items)
1
create_filtered_forced_choice_items(items: list[Item], group_by: Callable[[Item], Any], n_alternatives: int = 2, *, item_filter: Callable[[Item], bool] | None = None, group_filter: Callable[[Any, list[Item]], bool] | None = None, combination_filter: Callable[[tuple[Item, ...]], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]
¶
Create forced-choice items with multi-level filtering.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| items | list[Item] | Source items. | required |
| group_by | Callable[[Item], Any] | Grouping function. | required |
| n_alternatives | int | Number of alternatives per item. | 2 |
| item_filter | Callable[[Item], bool] \| None | Filter individual items before grouping. | None |
| group_filter | Callable[[Any, list[Item]], bool] \| None | Filter groups (receives group_key and group_items). | None |
| combination_filter | Callable[[tuple[Item, ...]], bool] \| None | Filter specific combinations. | None |
| extract_text | Callable[[Item], str] \| None | Text extraction function. | None |
| item_template_id | UUID \| None | Template ID for created items. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Filtered forced-choice items. |
Examples:
>>> fc_items = create_filtered_forced_choice_items(
... items,
... group_by=lambda i: i.item_metadata["verb"],
... n_alternatives=2,
... item_filter=lambda i: i.item_metadata.get("valid", True),
... group_filter=lambda key, items: len(items) >= 2,
... combination_filter=lambda combo: combo[0].id != combo[1].id
... )
ordinal_scale
¶
Utilities for creating ordinal scale experimental items.
This module provides language-agnostic utilities for creating ordinal scale items where participants rate a single stimulus on an ordered discrete scale (e.g., 1-7 Likert scale, acceptability ratings).
Integration Points
- Active Learning: bead/active_learning/models/ordinal_scale.py
- Simulation: bead/simulation/strategies/ordinal_scale.py
- Deployment: bead/deployment/jspsych/ (slider or radio buttons)
create_ordinal_scale_item(text: str, scale_bounds: tuple[int, int] = (1, 7), prompt: str | None = None, scale_labels: dict[int, str] | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create an ordinal scale rating item.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The stimulus text to rate. | required |
| scale_bounds | tuple[int, int] | Tuple of (min, max) for the scale. Both must be integers with min < max. Default: (1, 7) for a 7-point scale. | (1, 7) |
| prompt | str \| None | Optional question/prompt for the rating. If None, uses "Rate this item:". | None |
| scale_labels | dict[int, str] \| None | Optional labels for specific scale values (e.g., {1: "Bad", 7: "Good"}). All keys must be within [scale_min, scale_max]. | None |
| item_template_id | UUID \| None | Template ID for the item. If None, generates new UUID. | None |
| metadata | dict[str, MetadataValue] \| None | Additional metadata for item_metadata field. | None |

Returns:

| Type | Description |
|---|---|
| Item | Ordinal scale item with text and prompt in rendered_elements. |

Raises:

| Type | Description |
|---|---|
| ValueError | If text is empty, if scale_bounds are invalid, or if scale_labels contain values outside scale bounds. |
Examples:
>>> item = create_ordinal_scale_item(
... text="The cat sat on the mat.",
... scale_bounds=(1, 7),
... prompt="How natural is this sentence?",
... metadata={"task": "acceptability"}
... )
>>> item.rendered_elements["text"]
'The cat sat on the mat.'
>>> item.item_metadata["scale_min"]
1
>>> item.item_metadata["scale_max"]
7
create_ordinal_scale_items_from_texts(texts: list[str], scale_bounds: tuple[int, int] = (1, 7), prompt: str | None = None, scale_labels: dict[int, str] | None = None, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create ordinal scale items from a list of texts.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| texts | list[str] | List of stimulus texts. | required |
| scale_bounds | tuple[int, int] | Scale bounds (min, max) for all items. | (1, 7) |
| prompt | str \| None | The question/prompt for all items. | None |
| scale_labels | dict[int, str] \| None | Optional scale labels for all items. | None |
| item_template_id | UUID \| None | Template ID for all created items. If None, generates one per item. | None |
| metadata_fn | Callable[[str], dict[str, MetadataValue]] \| None | Function to generate metadata from each text. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Ordinal scale items for each text. |
Examples:
>>> texts = ["She walks.", "She walk.", "They walk."]
>>> items = create_ordinal_scale_items_from_texts(
... texts,
... scale_bounds=(1, 5),
... prompt="How acceptable is this sentence?",
... metadata_fn=lambda t: {"text_length": len(t)}
... )
>>> len(items)
3
>>> items[0].item_metadata["scale_min"]
1
create_ordinal_scale_items_from_groups(items: list[Item], group_by: Callable[[Item], Hashable], scale_bounds: tuple[int, int] = (1, 7), prompt: str | None = None, scale_labels: dict[int, str] | None = None, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]
¶
Create ordinal scale items from grouped source items.
Groups items and creates one ordinal scale item per source item, preserving group information in metadata.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| items | list[Item] | Source items to process. | required |
| group_by | Callable[[Item], Hashable] | Function to extract grouping key from items. | required |
| scale_bounds | tuple[int, int] | Scale bounds (min, max) for all items. | (1, 7) |
| prompt | str \| None | The question/prompt for all items. | None |
| scale_labels | dict[int, str] \| None | Optional scale labels for all items. | None |
| extract_text | Callable[[Item], str] \| None | Function to extract text from item. If None, tries common keys. | None |
| include_group_metadata | bool | Whether to include group key in item metadata. | True |
| item_template_id | UUID \| None | Template ID for all created items. If None, generates one per item. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Ordinal scale items from source items. |
Examples:
>>> source_items = [
... Item(
... uuid4(),
... rendered_elements={"text": "She walks."},
... item_metadata={"verb": "walk"}
... )
... ]
>>> ordinal_items = create_ordinal_scale_items_from_groups(
... source_items,
... group_by=lambda i: i.item_metadata["verb"],
... scale_bounds=(1, 7),
... prompt="Rate the acceptability:"
... )
>>> len(ordinal_items)
1
create_ordinal_scale_items_cross_product(texts: list[str], prompts: list[str], scale_bounds: tuple[int, int] = (1, 7), scale_labels: dict[int, str] | None = None, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create ordinal scale items from cross-product of texts and prompts.
Useful when you want to apply multiple prompts to each text.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| texts | list[str] | List of stimulus texts. | required |
| prompts | list[str] | List of prompts to apply. | required |
| scale_bounds | tuple[int, int] | Scale bounds (min, max) for all items. | (1, 7) |
| scale_labels | dict[int, str] \| None | Optional scale labels for all items. | None |
| item_template_id | UUID \| None | Template ID for all created items. | None |
| metadata_fn | Callable[[str, str], dict[str, MetadataValue]] \| None | Function to generate metadata from (text, prompt). | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Ordinal scale items from cross-product. |
Examples:
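A minimal sketch of the cross-product expansion, using plain Python stand-ins rather than the bead API (the tuples here are illustrative, not the real Item model): every text is paired with every prompt, so the result has len(texts) * len(prompts) items.

```python
from itertools import product

texts = ["She walks.", "She walk."]
prompts = ["How natural is this?", "How formal is this?"]

# Cross-product: each (text, prompt) pair yields one item.
pairs = list(product(texts, prompts))
print(len(pairs))  # 4
print(pairs[0])    # ('She walks.', 'How natural is this?')
```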
create_filtered_ordinal_scale_items(items: list[Item], scale_bounds: tuple[int, int] = (1, 7), prompt: str | None = None, scale_labels: dict[int, str] | None = None, *, item_filter: Callable[[Item], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]
¶
Create ordinal scale items with filtering.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| items | list[Item] | Source items. | required |
| scale_bounds | tuple[int, int] | Scale bounds (min, max) for all items. | (1, 7) |
| prompt | str \| None | The question/prompt for all items. | None |
| scale_labels | dict[int, str] \| None | Optional scale labels for all items. | None |
| item_filter | Callable[[Item], bool] \| None | Filter individual items. | None |
| extract_text | Callable[[Item], str] \| None | Text extraction function. | None |
| item_template_id | UUID \| None | Template ID for created items. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Filtered ordinal scale items. |
Examples:
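A sketch of the item_filter idea using dicts as stand-ins for bead Item objects (the "length" field is illustrative): the predicate decides which source items survive before ordinal scale items are built from them.

```python
# Stand-in items: plain dicts in place of bead Item objects.
items = [
    {"text": "She walks.", "length": 10},
    {"text": "She walks the dog to school every day.", "length": 38},
]

# item_filter analogue: keep only short stimuli.
item_filter = lambda item: item["length"] < 20
kept = [item for item in items if item_filter(item)]
print(len(kept))  # 1
print(kept[0]["text"])  # She walks.
```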
create_likert_5_item(text: str, prompt: str | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a 5-point Likert scale item.
Convenience function for standard 5-point Likert scale with "Strongly Disagree" to "Strongly Agree" labels.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The stimulus text (statement) to rate. | required |
| prompt | str \| None | Optional prompt. If None, uses "Rate your agreement:". | None |
| item_template_id | UUID \| None | Template ID for the item. If None, generates new UUID. | None |
| metadata | dict[str, MetadataValue] \| None | Additional metadata for item_metadata field. | None |

Returns:

| Type | Description |
|---|---|
| Item | 5-point Likert scale item. |
Examples:
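A sketch of the scale this convenience function sets up. Only the endpoint labels ("Strongly Disagree", "Strongly Agree") are fixed by the description above; the intermediate labels below are illustrative assumptions, not the helper's actual wording.

```python
# 5-point Likert scale: bounds (1, 5) with labels at the endpoints.
# Middle labels are illustrative guesses.
likert_5_labels = {
    1: "Strongly Disagree",
    2: "Disagree",
    3: "Neutral",
    4: "Agree",
    5: "Strongly Agree",
}
print(len(likert_5_labels))   # 5
print(likert_5_labels[5])     # Strongly Agree
```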
create_likert_7_item(text: str, prompt: str | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a 7-point Likert scale item.
Convenience function for standard 7-point Likert scale with "Strongly Disagree" to "Strongly Agree" labels.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The stimulus text (statement) to rate. | required |
| prompt | str \| None | Optional prompt. If None, uses "Rate your agreement:". | None |
| item_template_id | UUID \| None | Template ID for the item. If None, generates new UUID. | None |
| metadata | dict[str, MetadataValue] \| None | Additional metadata for item_metadata field. | None |

Returns:

| Type | Description |
|---|---|
| Item | 7-point Likert scale item. |
Examples:
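A quick sketch of the response scale implied by the 7-point variant (plain Python, not the bead API): scale bounds (1, 7) give seven selectable points.

```python
# 7-point Likert scale: bounds (1, 7).
scale_min, scale_max = 1, 7
points = list(range(scale_min, scale_max + 1))
print(len(points))            # 7
print(points[0], points[-1])  # 1 7
```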
binary
¶
Utilities for creating binary experimental items.
This module provides language-agnostic utilities for creating binary items where participants make yes/no or true/false judgments about a single stimulus.
IMPORTANT: Binary tasks are semantically distinct from 2AFC tasks:
- Binary: absolute judgment about a single stimulus ("Is this grammatical?")
- 2AFC: relative choice between two stimuli ("Which is more natural?")
Integration Points
- Active Learning: bead/active_learning/models/binary.py
- Simulation: bead/simulation/strategies/binary.py
- Deployment: bead/deployment/jspsych/ (binary button plugin)
create_binary_item(text: str, prompt: str = 'Yes/No?', binary_options: tuple[str, str] = ('yes', 'no'), item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a binary judgment item for a single stimulus.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The stimulus text to judge. | required |
| prompt | str | The question/prompt for the judgment (default: "Yes/No?"). | 'Yes/No?' |
| binary_options | tuple[str, str] | The two response options (default: ("yes", "no")). Can also be ("true", "false"), ("acceptable", "unacceptable"), etc. | ('yes', 'no') |
| item_template_id | UUID \| None | Template ID for the item. If None, generates new UUID. | None |
| metadata | dict[str, MetadataValue] \| None | Additional metadata for item_metadata field. | None |

Returns:

| Type | Description |
|---|---|
| Item | Binary item with text and prompt in rendered_elements. |

Raises:

| Type | Description |
|---|---|
| ValueError | If text is empty or if binary_options doesn't have exactly 2 values. |
Examples:
>>> item = create_binary_item(
... "The cat sat on the mat.",
... prompt="Is this sentence grammatical?",
... metadata={"judgment": "grammaticality"}
... )
>>> item.rendered_elements["text"]
'The cat sat on the mat.'
>>> item.rendered_elements["prompt"]
'Is this sentence grammatical?'
>>> item.item_metadata["binary_options"]
['yes', 'no']
create_binary_items_from_texts(texts: list[str], prompt: str, binary_options: tuple[str, str] = ('yes', 'no'), *, item_template_id: UUID | None = None, metadata_fn: Callable[[str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create binary items from a list of texts with the same prompt.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| texts | list[str] | List of stimulus texts. | required |
| prompt | str | The question/prompt for all items. | required |
| binary_options | tuple[str, str] | The two response options (default: ("yes", "no")). | ('yes', 'no') |
| item_template_id | UUID \| None | Template ID for all created items. If None, generates one per item. | None |
| metadata_fn | Callable[[str], dict[str, MetadataValue]] \| None | Function to generate metadata from each text. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Binary items for each text. |
Examples:
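A sketch of the one-item-per-text mapping, using plain dicts in place of bead Item objects (the dict keys mirror the rendered_elements/metadata fields described above but are illustrative):

```python
texts = ["She walks.", "She walk.", "They walk."]
prompt = "Is this sentence grammatical?"

# One item per text; all items share the same prompt and options.
items = [
    {"text": t, "prompt": prompt, "binary_options": ["yes", "no"]}
    for t in texts
]
print(len(items))         # 3
print(items[1]["text"])   # She walk.
```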
create_binary_items_with_context(contexts: list[str], targets: list[str], prompt: str, binary_options: tuple[str, str] = ('yes', 'no'), *, context_label: str = 'Context', target_label: str = 'Statement', item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create binary items with context + target structure.
Useful for judgments like "Given context X, is statement Y true?".
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| contexts | list[str] | Context texts (same length as targets). | required |
| targets | list[str] | Target texts to judge given context. | required |
| prompt | str | The question/prompt for the judgment. | required |
| binary_options | tuple[str, str] | The two response options (default: ("yes", "no")). | ('yes', 'no') |
| context_label | str | Label for context in rendered text (default: "Context"). | 'Context' |
| target_label | str | Label for target in rendered text (default: "Statement"). | 'Statement' |
| item_template_id | UUID \| None | Template ID for all created items. If None, generates one per item. | None |
| metadata_fn | Callable[[str, str], dict[str, MetadataValue]] \| None | Function to generate metadata from (context, target). | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Binary items with context + target structure. |

Raises:

| Type | Description |
|---|---|
| ValueError | If contexts and targets have different lengths. |
Examples:
>>> contexts = ["The dog barked loudly."]
>>> targets = ["The dog made a sound."]
>>> items = create_binary_items_with_context(
... contexts,
... targets,
... prompt="Is the statement true given the context?",
... binary_options=("true", "false")
... )
>>> len(items)
1
>>> "Context:" in items[0].rendered_elements["text"]
True
create_binary_items_from_groups(items: list[Item], group_by: Callable[[Item], Hashable], prompt: str, binary_options: tuple[str, str] = ('yes', 'no'), *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]
¶
Create binary items from grouped source items.
Groups items and creates one binary item per source item, preserving group information in metadata.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| items | list[Item] | Source items to process. | required |
| group_by | Callable[[Item], Hashable] | Function to extract grouping key from items. | required |
| prompt | str | The question/prompt for all items. | required |
| binary_options | tuple[str, str] | The two response options (default: ("yes", "no")). | ('yes', 'no') |
| extract_text | Callable[[Item], str] \| None | Function to extract text from item. If None, tries common keys. | None |
| include_group_metadata | bool | Whether to include group key in item metadata. | True |
| item_template_id | UUID \| None | Template ID for all created items. If None, generates one per item. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Binary items from source items. |
Examples:
>>> source_items = [
... Item(
... uuid4(),
... rendered_elements={"text": "She walks."},
... item_metadata={"verb": "walk"}
... ),
... Item(
... uuid4(),
... rendered_elements={"text": "She runs."},
... item_metadata={"verb": "run"}
... )
... ]
>>> binary_items = create_binary_items_from_groups(
... source_items,
... group_by=lambda i: i.item_metadata["verb"],
... prompt="Is this sentence grammatical?"
... )
>>> len(binary_items)
2
create_binary_items_cross_product(texts: list[str], prompts: list[str], binary_options: tuple[str, str] = ('yes', 'no'), *, item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create binary items from cross-product of texts and prompts.
Useful when you want to apply multiple prompts to each text.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| texts | list[str] | List of stimulus texts. | required |
| prompts | list[str] | List of prompts to apply. | required |
| binary_options | tuple[str, str] | The two response options (default: ("yes", "no")). | ('yes', 'no') |
| item_template_id | UUID \| None | Template ID for all created items. | None |
| metadata_fn | Callable[[str, str], dict[str, MetadataValue]] \| None | Function to generate metadata from (text, prompt). | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Binary items from cross-product. |
Examples:
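A sketch of the cross-product with a metadata_fn analogue, using plain dicts rather than the bead API (field names are illustrative): each (text, prompt) pair becomes one record, and metadata_fn derives extra metadata from the pair.

```python
from itertools import product

texts = ["She walks.", "She walk."]
prompts = ["Is this grammatical?", "Is this formal?"]

# metadata_fn analogue: derive metadata from each (text, prompt) pair.
metadata_fn = lambda text, prompt: {"text_length": len(text)}
records = [
    {"text": t, "prompt": p, **metadata_fn(t, p)}
    for t, p in product(texts, prompts)
]
print(len(records))                # 4
print(records[0]["text_length"])   # 10
```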
create_filtered_binary_items(items: list[Item], prompt: str, binary_options: tuple[str, str] = ('yes', 'no'), *, item_filter: Callable[[Item], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]
¶
Create binary items with filtering.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| items | list[Item] | Source items. | required |
| prompt | str | The question/prompt for all items. | required |
| binary_options | tuple[str, str] | The two response options (default: ("yes", "no")). | ('yes', 'no') |
| item_filter | Callable[[Item], bool] \| None | Filter individual items. | None |
| extract_text | Callable[[Item], str] \| None | Text extraction function. | None |
| item_template_id | UUID \| None | Template ID for created items. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Filtered binary items. |
Examples:
categorical
¶
Utilities for creating categorical experimental items.
This module provides language-agnostic utilities for creating categorical items where participants select from N unordered categories (e.g., NLI labels, POS tags, semantic relations).
Integration Points
- Active Learning: bead/active_learning/models/categorical.py
- Simulation: bead/simulation/strategies/categorical.py
- Deployment: bead/deployment/jspsych/ (dropdown or radio buttons)
create_categorical_item(text: str, categories: list[str], prompt: str | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a categorical classification item.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The stimulus text to classify. | required |
| categories | list[str] | List of category labels (unordered). Must have at least 2 categories. | required |
| prompt | str \| None | Optional question/prompt for the classification. If None, uses "Select a category:". | None |
| item_template_id | UUID \| None | Template ID for the item. If None, generates new UUID. | None |
| metadata | dict[str, MetadataValue] \| None | Additional metadata for item_metadata field. | None |

Returns:

| Type | Description |
|---|---|
| Item | Categorical item with text and prompt in rendered_elements. |

Raises:

| Type | Description |
|---|---|
| ValueError | If text is empty or if fewer than 2 categories provided. |
Examples:
>>> item = create_categorical_item(
... text="Premise: All dogs bark. Hypothesis: Some dogs bark.",
... categories=["entailment", "neutral", "contradiction"],
... prompt="What is the relationship?",
... metadata={"task": "nli"}
... )
>>> item.rendered_elements["text"]
'Premise: All dogs bark. Hypothesis: Some dogs bark.'
>>> item.rendered_elements["prompt"]
'What is the relationship?'
>>> item.item_metadata["categories"]
['entailment', 'neutral', 'contradiction']
create_nli_item(premise: str, hypothesis: str, categories: list[str] | None = None, prompt: str | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a Natural Language Inference (NLI) item.
Specialized helper for NLI tasks with automatic formatting and default categories.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| premise | str | The premise text. | required |
| hypothesis | str | The hypothesis text. | required |
| categories | list[str] \| None | Category labels. If None, uses ["entailment", "neutral", "contradiction"]. | None |
| prompt | str \| None | Question/prompt. If None, uses "What is the relationship?". | None |
| item_template_id | UUID \| None | Template ID for the item. If None, generates new UUID. | None |
| metadata | dict[str, MetadataValue] \| None | Additional metadata for item_metadata field. | None |

Returns:

| Type | Description |
|---|---|
| Item | NLI categorical item. |
Examples:
>>> item = create_nli_item(
... premise="All dogs bark.",
... hypothesis="Some dogs bark."
... )
>>> "Premise:" in item.rendered_elements["text"]
True
>>> "Hypothesis:" in item.rendered_elements["text"]
True
>>> item.item_metadata["categories"]
['entailment', 'neutral', 'contradiction']
>>> item.item_metadata["premise"]
'All dogs bark.'
create_categorical_items_from_texts(texts: list[str], categories: list[str], prompt: str | None = None, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create categorical items from a list of texts with the same categories.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| texts | list[str] | List of stimulus texts. | required |
| categories | list[str] | Category labels for all items. | required |
| prompt | str \| None | The question/prompt for all items. | None |
| item_template_id | UUID \| None | Template ID for all created items. If None, generates one per item. | None |
| metadata_fn | Callable[[str], dict[str, MetadataValue]] \| None | Function to generate metadata from each text. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Categorical items for each text. |
Examples:
>>> texts = ["The cat sat.", "The dog ran.", "The bird flew."]
>>> categories = ["past", "present", "future"]
>>> items = create_categorical_items_from_texts(
... texts,
... categories=categories,
... prompt="What is the tense?"
... )
>>> len(items)
3
>>> items[0].item_metadata["categories"]
['past', 'present', 'future']
create_categorical_items_from_pairs(pairs: list[tuple[str, str]], categories: list[str], prompt: str | None = None, *, pair_label1: str = 'Text 1', pair_label2: str = 'Text 2', item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create categorical items from pairs of texts.
Useful for NLI, paraphrase detection, semantic similarity, etc.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| pairs | list[tuple[str, str]] | List of (text1, text2) pairs. | required |
| categories | list[str] | Category labels for all items. | required |
| prompt | str \| None | The question/prompt for all items. | None |
| pair_label1 | str | Label for first text in pair (default: "Text 1"). | 'Text 1' |
| pair_label2 | str | Label for second text in pair (default: "Text 2"). | 'Text 2' |
| item_template_id | UUID \| None | Template ID for all created items. If None, generates one per item. | None |
| metadata_fn | Callable[[str, str], dict[str, MetadataValue]] \| None | Function to generate metadata from (text1, text2). | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Categorical items from pairs. |
Examples:
>>> pairs = [
... ("All dogs bark.", "Some dogs bark."),
... ("The sky is blue.", "The sky is not blue.")
... ]
>>> items = create_categorical_items_from_pairs(
... pairs,
... categories=["entailment", "neutral", "contradiction"],
... prompt="What is the relationship?",
... pair_label1="Premise",
... pair_label2="Hypothesis"
... )
>>> len(items)
2
>>> "Premise:" in items[0].rendered_elements["text"]
True
create_categorical_items_from_groups(items: list[Item], group_by: Callable[[Item], Hashable], categories: list[str], prompt: str | None = None, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]
¶
Create categorical items from grouped source items.
Groups items and creates one categorical item per source item, preserving group information in metadata.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| items | list[Item] | Source items to process. | required |
| group_by | Callable[[Item], Hashable] | Function to extract grouping key from items. | required |
| categories | list[str] | Category labels for all items. | required |
| prompt | str \| None | The question/prompt for all items. | None |
| extract_text | Callable[[Item], str] \| None | Function to extract text from item. If None, tries common keys. | None |
| include_group_metadata | bool | Whether to include group key in item metadata. | True |
| item_template_id | UUID \| None | Template ID for all created items. If None, generates one per item. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Categorical items from source items. |
Examples:
>>> source_items = [
... Item(
... uuid4(),
... rendered_elements={"text": "The cat sat."},
... item_metadata={"tense": "past"}
... ),
... Item(
... uuid4(),
... rendered_elements={"text": "The dog runs."},
... item_metadata={"tense": "present"}
... )
... ]
>>> categorical_items = create_categorical_items_from_groups(
... source_items,
... group_by=lambda i: i.item_metadata["tense"],
... categories=["past", "present", "future"],
... prompt="What is the tense?"
... )
>>> len(categorical_items)
2
create_categorical_items_cross_product(texts: list[str], prompts: list[str], categories: list[str], *, item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create categorical items from cross-product of texts and prompts.
Useful when you want to apply multiple prompts to each text.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| texts | list[str] | List of stimulus texts. | required |
| prompts | list[str] | List of prompts to apply. | required |
| categories | list[str] | Category labels for all items. | required |
| item_template_id | UUID \| None | Template ID for all created items. | None |
| metadata_fn | Callable[[str, str], dict[str, MetadataValue]] \| None | Function to generate metadata from (text, prompt). | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Categorical items from cross-product. |
Examples:
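A sketch of the expansion with shared categories, using plain dicts as stand-ins for bead Item objects (field names are illustrative): every text is paired with every prompt, and each resulting item carries the same category list.

```python
from itertools import product

texts = ["The cat sat.", "The dog runs."]
prompts = ["What tense is the verb?", "What tense does the event describe?"]
categories = ["past", "present", "future"]

# Every (text, prompt) pair becomes one item; categories are shared.
items = [
    {"text": t, "prompt": p, "categories": categories}
    for t, p in product(texts, prompts)
]
print(len(items))                  # 4
print(items[0]["categories"][0])   # past
```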
create_filtered_categorical_items(items: list[Item], categories: list[str], prompt: str | None = None, *, item_filter: Callable[[Item], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]
¶
Create categorical items with filtering.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| items | list[Item] | Source items. | required |
| categories | list[str] | Category labels for all items. | required |
| prompt | str \| None | The question/prompt for all items. | None |
| item_filter | Callable[[Item], bool] \| None | Filter individual items. | None |
| extract_text | Callable[[Item], str] \| None | Text extraction function. | None |
| item_template_id | UUID \| None | Template ID for created items. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Filtered categorical items. |
Examples:
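A sketch of filtering source items by a metadata property before classification items are built (plain dicts stand in for bead Item objects; the "source" field is illustrative):

```python
# Stand-in source items with a metadata field to filter on.
items = [
    {"text": "The cat sat.", "source": "corpus"},
    {"text": "Colorless green ideas sleep.", "source": "constructed"},
]
categories = ["past", "present", "future"]

# item_filter analogue: keep only corpus-attested stimuli.
item_filter = lambda item: item["source"] == "corpus"
kept = [{**i, "categories": categories} for i in items if item_filter(i)]
print(len(kept))            # 1
print(kept[0]["text"])      # The cat sat.
```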
multi_select
¶
Utilities for creating multi-select experimental items.
This module provides language-agnostic utilities for creating multi-select items where participants select one or more options from a set (checkboxes).
Integration Points
- Active Learning: bead/active_learning/models/multi_select.py
- Simulation: bead/simulation/strategies/multi_select.py
- Deployment: bead/deployment/jspsych/ (checkbox plugin)
create_multi_select_item(*options: str, min_selections: int = 1, max_selections: int | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a multi-select item from N text options.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| *options | str | Text for each option (2 or more required). | () |
| min_selections | int | Minimum number of options that must be selected (default: 1). | 1 |
| max_selections | int \| None | Maximum number of options that can be selected. If None, defaults to number of options (no upper limit). | None |
| item_template_id | UUID \| None | Template ID for the item. If None, generates new UUID. | None |
| metadata | dict[str, MetadataValue] \| None | Additional metadata for item_metadata field. | None |

Returns:

| Type | Description |
|---|---|
| Item | Multi-select item with options stored in the options field. |

Raises:

| Type | Description |
|---|---|
| ValueError | If fewer than 2 options provided, or if min_selections > max_selections, or if min_selections < 1, or if max_selections > number of options. |
Examples:
>>> item = create_multi_select_item(
... "She walks.",
... "She walk.",
... "They walks.",
... "They walk.",
... min_selections=1,
... max_selections=4,
... metadata={"task": "select_grammatical"}
... )
>>> item.options[0]
'She walks.'
>>> item.item_metadata["min_selections"]
1
>>> item.item_metadata["max_selections"]
4
create_multi_select_items_from_groups(items: list[Item], group_by: Callable[[Item], Any], n_options: int | None = None, min_selections: int = 1, max_selections: int | None = None, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]
¶
Create multi-select items by grouping source items.
Groups items by a property, then creates multi-select items from each group's items as options.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| items | list[Item] | Source items to group and combine. | required |
| group_by | Callable[[Item], Any] | Function to extract grouping key from items. | required |
| n_options | int \| None | Number of options per multi-select item. If None, uses all items in each group. | None |
| min_selections | int | Minimum number of selections required (default: 1). | 1 |
| max_selections | int \| None | Maximum number of selections allowed. If None, defaults to n_options. | None |
| extract_text | Callable[[Item], str] \| None | Function to extract text from item. If None, tries common keys ("text", "sentence", "content") from rendered_elements. | None |
| include_group_metadata | bool | Whether to include group key in item metadata. | True |
| item_template_id | UUID \| None | Template ID for all created items. If None, generates one per item. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Multi-select items created from groupings. |
Examples:
Create multi-select items grouped by verb (select all acceptable frames):
>>> items = [
... Item(
... item_template_id=uuid4(),
... rendered_elements={"text": "She walks."},
... item_metadata={"verb": "walk", "frame": "intransitive"}
... ),
... Item(
... item_template_id=uuid4(),
... rendered_elements={"text": "She walks the dog."},
... item_metadata={"verb": "walk", "frame": "transitive"}
... ),
... Item(
... item_template_id=uuid4(),
... rendered_elements={"text": "She walks to school."},
... item_metadata={"verb": "walk", "frame": "intransitive_pp"}
... )
... ]
>>> ms_items = create_multi_select_items_from_groups(
... items,
... group_by=lambda item: item.item_metadata["verb"],
... min_selections=1,
... max_selections=3
... )
>>> len(ms_items)
1
>>> len(ms_items[0].rendered_elements)
3
create_multi_select_items_with_foils(correct_items: list[Item], foil_items: list[Item], n_correct: int = 2, n_foils: int = 2, *, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None, metadata_fn: Callable[[list[Item], list[Item]], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create multi-select items by combining correct items with foils.
Useful for tasks like "Select all grammatical sentences" where some options are correct and others are foils (distractors).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| correct_items | list[Item] | Items that are correct (should be selected). | required |
| foil_items | list[Item] | Items that are foils/distractors (should not be selected). | required |
| n_correct | int | Number of correct items to include per multi-select item (default: 2). | 2 |
| n_foils | int | Number of foil items to include per multi-select item (default: 2). | 2 |
| extract_text | Callable[[Item], str] \| None | Function to extract text from items. | None |
| item_template_id | UUID \| None | Template ID for all created items. | None |
| metadata_fn | Callable[[list[Item], list[Item]], dict[str, MetadataValue]] \| None | Function to generate metadata from (correct_items_used, foil_items_used). | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Multi-select items with correct items and foils. |
Examples:
>>> grammatical = [
... Item(uuid4(), rendered_elements={"text": "She walks."},
... item_metadata={"grammatical": True}),
... Item(uuid4(), rendered_elements={"text": "They walk."},
... item_metadata={"grammatical": True})
... ]
>>> ungrammatical = [
... Item(uuid4(), rendered_elements={"text": "She walk."},
... item_metadata={"grammatical": False}),
... Item(uuid4(), rendered_elements={"text": "They walks."},
... item_metadata={"grammatical": False})
... ]
>>> ms_items = create_multi_select_items_with_foils(
... grammatical,
... ungrammatical,
... n_correct=2,
... n_foils=2
... )
>>> len(ms_items)
1
>>> ms_items[0].item_metadata["min_selections"]
1
>>> ms_items[0].item_metadata["max_selections"]
4
create_multi_select_items_cross_product(group1_items: list[Item], group2_items: list[Item], n_from_group1: int = 1, n_from_group2: int = 1, min_selections: int = 1, max_selections: int | None = None, *, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None, metadata_fn: Callable[[list[Item], list[Item]], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create multi-select items from cross-product of two groups.
Combines n items from group1 with n items from group2 to create multi-select items with (n_from_group1 + n_from_group2) options.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| group1_items | list[Item] | Items in first group. | required |
| group2_items | list[Item] | Items in second group. | required |
| n_from_group1 | int | Number of items to select from group1 per combination (default: 1). | 1 |
| n_from_group2 | int | Number of items to select from group2 per combination (default: 1). | 1 |
| min_selections | int | Minimum number of selections required (default: 1). | 1 |
| max_selections | int \| None | Maximum number of selections allowed. If None, defaults to total options. | None |
| extract_text | Callable[[Item], str] \| None | Function to extract text from items. | None |
| item_template_id | UUID \| None | Template ID for all created items. | None |
| metadata_fn | Callable[[list[Item], list[Item]], dict[str, MetadataValue]] \| None | Function to generate metadata from (group1_items_used, group2_items_used). | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Multi-select items from cross-product. |
Examples:
>>> active = [Item(uuid4(), rendered_elements={"text": "She walks."})]
>>> passive = [Item(uuid4(), rendered_elements={"text": "She is walked."})]
>>> ms_items = create_multi_select_items_cross_product(
... active, passive,
... n_from_group1=1,
... n_from_group2=1,
... min_selections=1,
... max_selections=2
... )
>>> len(ms_items)
1
create_filtered_multi_select_items(items: list[Item], group_by: Callable[[Item], Any], n_options: int | None = None, min_selections: int = 1, max_selections: int | None = None, *, item_filter: Callable[[Item], bool] | None = None, group_filter: Callable[[Any, list[Item]], bool] | None = None, combination_filter: Callable[[tuple[Item, ...]], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]
¶
Create multi-select items with multi-level filtering.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| items | list[Item] | Source items. | required |
| group_by | Callable[[Item], Any] | Grouping function. | required |
| n_options | int \| None | Number of options per item. If None, uses all items in each group. | None |
| min_selections | int | Minimum number of selections required. | 1 |
| max_selections | int \| None | Maximum number of selections allowed. | None |
| item_filter | Callable[[Item], bool] \| None | Filter individual items before grouping. | None |
| group_filter | Callable[[Any, list[Item]], bool] \| None | Filter groups (receives group_key and group_items). | None |
| combination_filter | Callable[[tuple[Item, ...]], bool] \| None | Filter specific combinations. | None |
| extract_text | Callable[[Item], str] \| None | Text extraction function. | None |
| item_template_id | UUID \| None | Template ID for created items. | None |

Returns:

| Type | Description |
|---|---|
| list[Item] | Filtered multi-select items. |
Examples:
magnitude
¶
Utilities for creating magnitude experimental items.
This module provides language-agnostic utilities for creating magnitude items where participants enter numeric values (bounded or unbounded), such as reading times, confidence ratings, or counts.
Integration Points
- Active Learning: bead/active_learning/models/magnitude.py
- Simulation: bead/simulation/strategies/magnitude.py
- Deployment: bead/deployment/jspsych/ (numeric input)
create_magnitude_item(text: str, unit: str | None = None, bounds: tuple[int | float | None, int | float | None] = (None, None), prompt: str | None = None, step: int | float | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a magnitude (numeric input) item.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
The stimulus text or question. |
required |
unit
|
str | None
|
Optional unit for the value (e.g., "ms", "%", "count"). |
None
|
bounds
|
tuple[int | float | None, int | float | None]
|
Tuple of (min, max) bounds. None means unbounded in that direction. Default: (None, None) for fully unbounded. |
(None, None)
|
prompt
|
str | None
|
Optional prompt for the numeric input. If None, uses "Enter a value:". |
None
|
step
|
int | float | None
|
Optional step size for input validation (e.g., 1 for integers, 0.01 for hundredths). |
None
|
item_template_id
|
UUID | None
|
Template ID for the item. If None, generates new UUID. |
None
|
metadata
|
dict[str, MetadataValue] | None
|
Additional metadata for item_metadata field. |
None
|
Returns:
| Type | Description |
|---|---|
Item
|
Magnitude item with text and prompt in rendered_elements. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If text is empty or if both bounds are provided and min >= max. |
Examples:
>>> item = create_magnitude_item(
... text="How long did it take to read this sentence?",
... unit="ms",
... bounds=(0, None),
... prompt="Enter time in milliseconds:"
... )
>>> item.rendered_elements["text"]
'How long did it take to read this sentence?'
>>> item.item_metadata["unit"]
'ms'
>>> item.item_metadata["min_value"]
0
>>> item.item_metadata["max_value"] is None
True
create_magnitude_items_from_texts(texts: list[str], unit: str | None = None, bounds: tuple[int | float | None, int | float | None] = (None, None), prompt: str | None = None, step: int | float | None = None, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create magnitude items from a list of texts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
texts
|
list[str]
|
List of stimulus texts. |
required |
unit
|
str | None
|
Optional unit for all items. |
None
|
bounds
|
tuple[int | float | None, int | float | None]
|
Bounds (min, max) for all items. |
(None, None)
|
prompt
|
str | None
|
The question/prompt for all items. |
None
|
step
|
int | float | None
|
Step size for all items. |
None
|
item_template_id
|
UUID | None
|
Template ID for all created items. If None, generates one per item. |
None
|
metadata_fn
|
Callable[[str], dict[str, MetadataValue]] | None
|
Function to generate metadata from each text. |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Magnitude items for each text. |
Examples:
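How `metadata_fn` is applied can be sketched without the bead imports: the documented behavior is one call per input text, producing that item's metadata. The `n_words` key here is a hypothetical example, not a bead convention.

```python
# Per-text metadata generation, as metadata_fn is applied (plain Python).
def metadata_fn(text: str) -> dict:
    return {"n_words": len(text.split())}

texts = ["The cat sat.", "The dog chased the cat."]
per_item_metadata = [metadata_fn(t) for t in texts]
print(per_item_metadata)  # [{'n_words': 3}, {'n_words': 5}]
```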
create_magnitude_items_from_groups(items: list[Item], group_by: Callable[[Item], Any], unit: str | None = None, bounds: tuple[int | float | None, int | float | None] = (None, None), prompt: str | None = None, step: int | float | None = None, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]
¶
Create magnitude items from grouped source items.
Groups items and creates one magnitude item per source item, preserving group information in metadata.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
items
|
list[Item]
|
Source items to process. |
required |
group_by
|
Callable[[Item], Any]
|
Function to extract grouping key from items. |
required |
unit
|
str | None
|
Optional unit for all items. |
None
|
bounds
|
tuple[int | float | None, int | float | None]
|
Bounds (min, max) for all items. |
(None, None)
|
prompt
|
str | None
|
The question/prompt for all items. |
None
|
step
|
int | float | None
|
Step size for all items. |
None
|
extract_text
|
Callable[[Item], str] | None
|
Function to extract text from item. If None, tries common keys. |
None
|
include_group_metadata
|
bool
|
Whether to include group key in item metadata. |
True
|
item_template_id
|
UUID | None
|
Template ID for all created items. If None, generates one per item. |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Magnitude items from source items. |
Examples:
>>> source_items = [
... Item(
... uuid4(),
... rendered_elements={"text": "The cat sat."},
... item_metadata={"category": "simple"}
... )
... ]
>>> magnitude_items = create_magnitude_items_from_groups(
... source_items,
... group_by=lambda i: i.item_metadata["category"],
... unit="ms",
... bounds=(0, None),
... prompt="Reading time:"
... )
>>> len(magnitude_items)
1
create_magnitude_items_cross_product(texts: list[str], prompts: list[str], unit: str | None = None, bounds: tuple[int | float | None, int | float | None] = (None, None), step: int | float | None = None, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create magnitude items from cross-product of texts and prompts.
Useful when you want to apply multiple prompts to each text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
texts
|
list[str]
|
List of stimulus texts. |
required |
prompts
|
list[str]
|
List of prompts to apply. |
required |
unit
|
str | None
|
Optional unit for all items. |
None
|
bounds
|
tuple[int | float | None, int | float | None]
|
Bounds (min, max) for all items. |
(None, None)
|
step
|
int | float | None
|
Step size for all items. |
None
|
item_template_id
|
UUID | None
|
Template ID for all created items. |
None
|
metadata_fn
|
Callable[[str, str], dict[str, MetadataValue]] | None
|
Function to generate metadata from (text, prompt). |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Magnitude items from cross-product. |
Examples:
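The cross-product expansion itself is plain combinatorics: one item per (text, prompt) pair, so `len(texts) * len(prompts)` items in total. A minimal sketch with `itertools.product` (not calling the bead API):

```python
# Cross-product of texts and prompts, as the item count is determined.
from itertools import product

texts = ["The cat sat.", "The dog ran."]
prompts = ["How natural is this sentence?", "How common is this sentence?"]

pairs = list(product(texts, prompts))
print(len(pairs))  # 2 texts x 2 prompts = 4 items
```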
create_filtered_magnitude_items(items: list[Item], unit: str | None = None, bounds: tuple[int | float | None, int | float | None] = (None, None), prompt: str | None = None, step: int | float | None = None, *, item_filter: Callable[[Item], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]
¶
Create magnitude items with filtering.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
items
|
list[Item]
|
Source items. |
required |
unit
|
str | None
|
Optional unit for all items. |
None
|
bounds
|
tuple[int | float | None, int | float | None]
|
Bounds (min, max) for all items. |
(None, None)
|
prompt
|
str | None
|
The question/prompt for all items. |
None
|
step
|
int | float | None
|
Step size for all items. |
None
|
item_filter
|
Callable[[Item], bool] | None
|
Filter individual items. |
None
|
extract_text
|
Callable[[Item], str] | None
|
Text extraction function. |
None
|
item_template_id
|
UUID | None
|
Template ID for created items. |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Filtered magnitude items. |
Examples:
create_reading_time_item(text: str, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a reading time measurement item.
Convenience function for reading time in milliseconds with a lower bound of 0 (no upper bound).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
The text to measure reading time for. |
required |
item_template_id
|
UUID | None
|
Template ID for the item. If None, generates new UUID. |
None
|
metadata
|
dict[str, MetadataValue] | None
|
Additional metadata for item_metadata field. |
None
|
Returns:
| Type | Description |
|---|---|
Item
|
Reading time magnitude item. |
Examples:
create_confidence_item(text: str, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a confidence rating item (0-100%).
Convenience function for confidence percentage with bounds (0, 100).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
The text or question to rate confidence for. |
required |
item_template_id
|
UUID | None
|
Template ID for the item. If None, generates new UUID. |
None
|
metadata
|
dict[str, MetadataValue] | None
|
Additional metadata for item_metadata field. |
None
|
Returns:
| Type | Description |
|---|---|
Item
|
Confidence magnitude item. |
Examples:
free_text
¶
Utilities for creating free text experimental items.
This module provides language-agnostic utilities for creating free text items where participants provide open-ended text responses (e.g., paraphrasing, question answering, cloze completion).
Integration Points
- Active Learning: bead/active_learning/models/free_text.py
- Simulation: bead/simulation/strategies/free_text.py
- Deployment: bead/deployment/jspsych/ (text input or textarea)
create_free_text_item(text: str, prompt: str, max_length: int | None = None, validation_pattern: str | None = None, min_length: int | None = None, multiline: bool = False, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a free text (open-ended) item.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
The stimulus text or context. |
required |
prompt
|
str
|
The question/instruction for what to enter (required). |
required |
max_length
|
int | None
|
Maximum character limit. None means unlimited. |
None
|
validation_pattern
|
str | None
|
Optional regex pattern for validation (validated at deployment). |
None
|
min_length
|
int | None
|
Minimum characters required. None means no minimum. |
None
|
multiline
|
bool
|
True for textarea (multiline), False for single-line input (default). |
False
|
item_template_id
|
UUID | None
|
Template ID for the item. If None, generates new UUID. |
None
|
metadata
|
dict[str, MetadataValue] | None
|
Additional metadata for item_metadata field. |
None
|
Returns:
| Type | Description |
|---|---|
Item
|
Free text item with text and prompt in rendered_elements. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If text or prompt is empty, or if min_length > max_length. |
Examples:
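Since `validation_pattern` is documented as being enforced at deployment time, the following is only a hedged sketch of what a deployment-side check might look like, combining the documented constraints (a regex pattern plus `min_length`/`max_length` character limits). It is not the bead API.

```python
# Hypothetical deployment-side validation of a free text response.
import re

validation_pattern = r"[A-Za-z ,.]+"  # letters, spaces, basic punctuation
min_length, max_length = 5, 200

def response_is_valid(response: str) -> bool:
    if not (min_length <= len(response) <= max_length):
        return False
    return re.fullmatch(validation_pattern, response) is not None

print(response_is_valid("The cat sat on the mat."))  # True
print(response_is_valid("42!"))                      # False: too short, digits
```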
create_free_text_items_from_texts(texts: list[str], prompt: str, max_length: int | None = None, validation_pattern: str | None = None, min_length: int | None = None, multiline: bool = False, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create free text items from a list of texts with the same prompt.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
texts
|
list[str]
|
List of stimulus texts. |
required |
prompt
|
str
|
The question/instruction for all items (required). |
required |
max_length
|
int | None
|
Maximum character limit for all items. |
None
|
validation_pattern
|
str | None
|
Optional regex pattern for validation. |
None
|
min_length
|
int | None
|
Minimum characters required. |
None
|
multiline
|
bool
|
True for textarea, False for single-line input. |
False
|
item_template_id
|
UUID | None
|
Template ID for all created items. If None, generates one per item. |
None
|
metadata_fn
|
Callable[[str], dict[str, MetadataValue]] | None
|
Function to generate metadata from each text. |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Free text items for each text. |
Examples:
>>> texts = ["Sentence 1", "Sentence 2", "Sentence 3"]
>>> items = create_free_text_items_from_texts(
... texts,
... prompt="Paraphrase this:",
... multiline=True,
... max_length=200,
... metadata_fn=lambda t: {"original_length": len(t)}
... )
>>> len(items)
3
>>> items[0].item_metadata["original_length"]
10
create_free_text_items_with_context(contexts: list[str], prompts: list[str], max_length: int | None = None, validation_pattern: str | None = None, min_length: int | None = None, multiline: bool = False, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create free text items with context + prompt pairs.
Useful for reading comprehension or question answering, where each context has a specific question.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
contexts
|
list[str]
|
Context texts (same length as prompts). |
required |
prompts
|
list[str]
|
Prompts/questions for each context. |
required |
max_length
|
int | None
|
Maximum character limit for all items. |
None
|
validation_pattern
|
str | None
|
Optional regex pattern for validation. |
None
|
min_length
|
int | None
|
Minimum characters required. |
None
|
multiline
|
bool
|
True for textarea, False for single-line input. |
False
|
item_template_id
|
UUID | None
|
Template ID for all created items. If None, generates one per item. |
None
|
metadata_fn
|
Callable[[str, str], dict[str, MetadataValue]] | None
|
Function to generate metadata from (context, prompt). |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Free text items with context + prompt structure. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If contexts and prompts have different lengths. |
Examples:
>>> contexts = ["The cat sat on the mat."]
>>> prompts = ["What sat on the mat?"]
>>> items = create_free_text_items_with_context(
... contexts,
... prompts,
... max_length=50
... )
>>> len(items)
1
>>> items[0].rendered_elements["text"]
'The cat sat on the mat.'
>>> items[0].rendered_elements["prompt"]
'What sat on the mat?'
create_free_text_items_from_groups(items: list[Item], group_by: Callable[[Item], Any], prompt: str, max_length: int | None = None, validation_pattern: str | None = None, min_length: int | None = None, multiline: bool = False, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]
¶
Create free text items from grouped source items.
Groups items and creates one free text item per source item, preserving group information in metadata.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
items
|
list[Item]
|
Source items to process. |
required |
group_by
|
Callable[[Item], Any]
|
Function to extract grouping key from items. |
required |
prompt
|
str
|
The question/instruction for all items (required). |
required |
max_length
|
int | None
|
Maximum character limit. |
None
|
validation_pattern
|
str | None
|
Optional regex pattern for validation. |
None
|
min_length
|
int | None
|
Minimum characters required. |
None
|
multiline
|
bool
|
True for textarea, False for single-line input. |
False
|
extract_text
|
Callable[[Item], str] | None
|
Function to extract text from item. If None, tries common keys. |
None
|
include_group_metadata
|
bool
|
Whether to include group key in item metadata. |
True
|
item_template_id
|
UUID | None
|
Template ID for all created items. If None, generates one per item. |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Free text items from source items. |
Examples:
>>> source_items = [
... Item(
... uuid4(),
... rendered_elements={"text": "Sentence 1"},
... item_metadata={"type": "simple"}
... )
... ]
>>> free_text_items = create_free_text_items_from_groups(
... source_items,
... group_by=lambda i: i.item_metadata["type"],
... prompt="Paraphrase this:",
... multiline=True
... )
>>> len(free_text_items)
1
create_free_text_items_cross_product(texts: list[str], prompts: list[str], max_length: int | None = None, validation_pattern: str | None = None, min_length: int | None = None, multiline: bool = False, *, item_template_id: UUID | None = None, metadata_fn: Callable[[str, str], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create free text items from cross-product of texts and prompts.
Useful when you want to apply multiple prompts to each text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
texts
|
list[str]
|
List of stimulus texts. |
required |
prompts
|
list[str]
|
List of prompts to apply. |
required |
max_length
|
int | None
|
Maximum character limit for all items. |
None
|
validation_pattern
|
str | None
|
Optional regex pattern for validation. |
None
|
min_length
|
int | None
|
Minimum characters required. |
None
|
multiline
|
bool
|
True for textarea, False for single-line input. |
False
|
item_template_id
|
UUID | None
|
Template ID for all created items. |
None
|
metadata_fn
|
Callable[[str, str], dict[str, MetadataValue]] | None
|
Function to generate metadata from (text, prompt). |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Free text items from cross-product. |
Examples:
create_filtered_free_text_items(items: list[Item], prompt: str, max_length: int | None = None, validation_pattern: str | None = None, min_length: int | None = None, multiline: bool = False, *, item_filter: Callable[[Item], bool] | None = None, extract_text: Callable[[Item], str] | None = None, item_template_id: UUID | None = None) -> list[Item]
¶
Create free text items with filtering.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
items
|
list[Item]
|
Source items. |
required |
prompt
|
str
|
The question/instruction for all items (required). |
required |
max_length
|
int | None
|
Maximum character limit. |
None
|
validation_pattern
|
str | None
|
Optional regex pattern for validation. |
None
|
min_length
|
int | None
|
Minimum characters required. |
None
|
multiline
|
bool
|
True for textarea, False for single-line input. |
False
|
item_filter
|
Callable[[Item], bool] | None
|
Filter individual items. |
None
|
extract_text
|
Callable[[Item], str] | None
|
Text extraction function. |
None
|
item_template_id
|
UUID | None
|
Template ID for created items. |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Filtered free text items. |
Examples:
create_paraphrase_item(text: str, instruction: str = 'Rewrite in your own words:', item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a paraphrase generation item.
Convenience function for paraphrase tasks with multiline input.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
The text to paraphrase. |
required |
instruction
|
str
|
The instruction for paraphrasing (default: "Rewrite in your own words:"). |
'Rewrite in your own words:'
|
item_template_id
|
UUID | None
|
Template ID for the item. If None, generates new UUID. |
None
|
metadata
|
dict[str, MetadataValue] | None
|
Additional metadata for item_metadata field. |
None
|
Returns:
| Type | Description |
|---|---|
Item
|
Paraphrase free text item. |
Examples:
create_wh_question_item(text: str, question_word: str = 'Who', item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a WH-question answering item.
Convenience function for WH-question answering with short text input.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
The context/passage for the question. |
required |
question_word
|
str
|
The question word to use (default: "Who"). |
'Who'
|
item_template_id
|
UUID | None
|
Template ID for the item. If None, generates new UUID. |
None
|
metadata
|
dict[str, MetadataValue] | None
|
Additional metadata for item_metadata field. |
None
|
Returns:
| Type | Description |
|---|---|
Item
|
WH-question free text item. |
Examples:
cloze
¶
Utilities for creating cloze experimental items.
This module provides language-agnostic utilities for creating cloze items where participants fill in missing words/phrases in partially-filled templates.
SPECIAL: This is the ONLY task type that uses the Item.unfilled_slots field.
Cloze items are unique in that:
- They use partially-filled templates with specific slots left blank
- UI widgets are inferred from slot constraints at deployment time:
  - Extensional constraint (finite set) → dropdown
  - Intensional constraint (rules) → text input with validation
  - No constraint → free text input
- Multiple slots can be unfilled in a single item
Integration Points
- Active Learning: bead/active_learning/models/cloze.py
- Simulation: bead/simulation/strategies/cloze.py
- Deployment: bead/deployment/jspsych/ (dynamic widget generation)
- Resources: bead/resources/template.py (Template and Slot models)
create_cloze_item(template: Any, unfilled_slot_names: list[str], filled_slots: dict[str, str] | None = None, instructions: str | None = None, *, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a cloze item from a template with specific slots unfilled.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
template
|
Template
|
Source template with slots. |
required |
unfilled_slot_names
|
list[str]
|
Names of slots to leave unfilled (must exist in template.slots). |
required |
filled_slots
|
dict[str, str] | None
|
Pre-filled slots (keys must be valid slot names, disjoint from unfilled). |
None
|
instructions
|
str | None
|
Optional instructions for filling (e.g., "Fill in the verb"). |
None
|
item_template_id
|
UUID | None
|
Template ID for the item. If None, generates new UUID. |
None
|
metadata
|
dict[str, MetadataValue] | None
|
Additional metadata for item_metadata field. |
None
|
Returns:
| Type | Description |
|---|---|
Item
|
Cloze item with unfilled_slots populated. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If unfilled_slot_names not in template, if filled_slots not in template, if unfilled and filled overlap, if no unfilled slots, or if validation fails. |
Examples:
>>> from bead.resources.template import Template, Slot
>>> template = Template(
... name="simple",
... template_string="{det} {noun} {verb}.",
... slots={
... "det": Slot(name="det"),
... "noun": Slot(name="noun"),
... "verb": Slot(name="verb")
... }
... )
>>> item = create_cloze_item(
... template,
... unfilled_slot_names=["verb"],
... filled_slots={"det": "The", "noun": "cat"}
... )
>>> item.rendered_elements["text"]
'The cat ___.'
>>> len(item.unfilled_slots)
1
>>> item.unfilled_slots[0].slot_name
'verb'
>>> item.unfilled_slots[0].position
2
create_cloze_items_from_template(template: Any, n_unfilled: int = 1, strategy: str = 'all_combinations', unfilled_combinations: list[list[str]] | None = None, instructions: str | None = None, *, item_template_id: UUID | None = None, metadata_fn: Callable[[list[str]], dict[str, MetadataValue]] | None = None) -> list[Item]
¶
Create multiple cloze items from a template, varying unfilled slots.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
template
|
Template
|
Source template. |
required |
n_unfilled
|
int
|
Number of slots to leave unfilled per item (default: 1). |
1
|
strategy
|
str
|
How to choose unfilled slots:
- 'random': Randomly sample combinations
- 'all_combinations': Generate all C(n_slots, n_unfilled) combinations
- 'specified': Use provided list |
'all_combinations'
|
unfilled_combinations
|
list[list[str]] | None
|
For strategy='specified', list of slot name combinations to unfill. |
None
|
instructions
|
str | None
|
Instructions for all items. |
None
|
item_template_id
|
UUID | None
|
Template ID for all items. |
None
|
metadata_fn
|
Callable[[list[str]], dict[str, MetadataValue]] | None
|
Generate metadata from unfilled slot names. |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Cloze items with varying unfilled slots. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If n_unfilled invalid, if strategy='specified' without unfilled_combinations, or if any combination contains invalid slots. |
Examples:
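The `'all_combinations'` strategy enumerates all C(n_slots, n_unfilled) slot combinations; that enumeration can be sketched with `itertools.combinations` over the slot names (plain Python, not the bead API):

```python
# Enumerating unfilled-slot combinations as the 'all_combinations' strategy does.
from itertools import combinations
from math import comb

slot_names = ["det", "noun", "verb"]
n_unfilled = 1

combos = list(combinations(slot_names, n_unfilled))
print(combos)      # [('det',), ('noun',), ('verb',)]
print(comb(3, 1))  # 3 cloze items would be created
```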
create_simple_cloze_item(text: str, blank_positions: list[int], blank_labels: list[str] | None = None, instructions: str | None = None, *, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a cloze item from plain text (no template).
Replaces words at specified positions with blanks. This is a simplified helper for creating cloze items without the template infrastructure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Full text with no blanks. |
required |
blank_positions
|
list[int]
|
Word positions to blank (0-indexed). |
required |
blank_labels
|
list[str] | None
|
Optional labels for blanks (for slot_name field). If None, uses generic labels like "blank_0", "blank_1". |
None
|
instructions
|
str | None
|
Optional instructions. |
None
|
item_template_id
|
UUID | None
|
Template ID for the item. |
None
|
metadata
|
dict[str, MetadataValue] | None
|
Additional metadata. |
None
|
Returns:
| Type | Description |
|---|---|
Item
|
Cloze item with text-based blanks. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If blank_positions out of range or if blank_labels length mismatch. |
Examples:
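The word-blanking step behind this helper can be sketched in a few lines: split the text into whitespace-separated words, then replace the 0-indexed positions with a blank marker, raising on out-of-range positions as documented. A plain-Python sketch, not the bead implementation:

```python
# Replacing 0-indexed word positions with blanks, with range checking.
def blank_words(text: str, blank_positions: list[int]) -> str:
    words = text.split()
    for pos in blank_positions:
        if not 0 <= pos < len(words):
            raise ValueError(f"blank position {pos} out of range")
        words[pos] = "___"
    return " ".join(words)

print(blank_words("The cat sat on the mat.", [1, 5]))
```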
create_cloze_items_from_groups(items: list[Item], group_by: Callable[[Item], Any], n_slots_to_unfill: int = 1, *, extract_text: Callable[[Item], str] | None = None, include_group_metadata: bool = True, item_template_id: UUID | None = None) -> list[Item]
¶
Create cloze items from grouped source items.
Groups items and creates cloze items from them. If source items have template metadata, uses template-based cloze. Otherwise, falls back to simple text-based cloze.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
items
|
list[Item]
|
Source items to group. |
required |
group_by
|
Callable[[Item], Any]
|
Grouping function. |
required |
n_slots_to_unfill
|
int
|
Number of slots/words to unfill. |
1
|
extract_text
|
Callable[[Item], str] | None
|
Text extraction function. If None, tries common keys. |
None
|
include_group_metadata
|
bool
|
Whether to include group_key in metadata. |
True
|
item_template_id
|
UUID | None
|
Template ID for created items. |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Cloze items from grouped source items. |
Examples:
create_filtered_cloze_items(templates: list[Any], n_slots_to_unfill: int = 1, *, template_filter: Callable[[Any], bool] | None = None, slot_filter: Callable[[str, Any], bool] | None = None, item_template_id: UUID | None = None) -> list[Item]
¶
Create cloze items with multi-level filtering.
Filters templates and/or slots before creating cloze items.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
templates
|
list[Template]
|
Source templates. |
required |
n_slots_to_unfill
|
int
|
Number of slots to unfill. |
1
|
template_filter
|
Callable[[Template], bool] | None
|
Filter templates. |
None
|
slot_filter
|
Callable[[str, Slot], bool] | None
|
Filter which slots can be unfilled (receives slot_name and Slot object). |
None
|
item_template_id
|
UUID | None
|
Template ID for created items. |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Filtered cloze items. |
Examples:
Span Annotation Models¶
spans
¶
Core span annotation models.
Provides data models for labeled spans, span segments, span labels, span relations, and span specifications. Supports discontiguous spans, overlapping spans (nested and intersecting), static and interactive modes, and two label sources (fixed sets and Wikidata entity search).
SpanSegment
¶
Bases: BeadBaseModel
Contiguous or discontiguous indices within a single element.
Attributes:
| Name | Type | Description |
|---|---|---|
element_name |
str
|
Which rendered element this segment belongs to. |
indices |
list[int]
|
Token or character indices within the element. |
validate_element_name(v: str) -> str
classmethod
¶
Validate element name is not empty.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
v
|
str
|
Element name to validate. |
required |
Returns:
| Type | Description |
|---|---|
str
|
Validated element name. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If element name is empty. |
validate_indices(v: list[int]) -> list[int]
classmethod
¶
Validate indices are not empty and non-negative.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
v
|
list[int]
|
Indices to validate. |
required |
Returns:
| Type | Description |
|---|---|
list[int]
|
Validated indices. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If indices are empty or contain negative values. |
SpanLabel
¶
Bases: BeadBaseModel
Label applied to a span or relation.
Attributes:
| Name | Type | Description |
|---|---|---|
label |
str
|
Human-readable label text. |
label_id |
str | None
|
External identifier (e.g. Wikidata QID "Q5"). |
confidence |
float | None
|
Confidence score for model-assigned labels. |
validate_label(v: str) -> str
classmethod
¶
Validate label is not empty.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
v
|
str
|
Label to validate. |
required |
Returns:
| Type | Description |
|---|---|
str
|
Validated label. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If label is empty. |
Span
¶
Bases: BeadBaseModel
Labeled span across one or more elements.
Supports discontiguous, overlapping, and nested spans.
Attributes:
| Name | Type | Description |
|---|---|---|
span_id |
str
|
Unique identifier within the item. |
segments |
list[SpanSegment]
|
Index segments composing this span. |
head_index |
int | None
|
Syntactic head token index. |
label |
SpanLabel | None
|
Label applied to this span (None = to-be-labeled). |
span_type |
str | None
|
Semantic category (e.g. "entity", "event", "role"). |
span_metadata |
dict[str, MetadataValue]
|
Additional span-specific metadata. |
validate_span_id(v: str) -> str
classmethod
¶
Validate span_id is not empty.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
v
|
str
|
Span ID to validate. |
required |
Returns:
| Type | Description |
|---|---|
str
|
Validated span ID. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If span_id is empty. |
SpanRelation
¶
Bases: BeadBaseModel
A typed, directed relation between two spans.
Used for semantic role labeling, relation extraction, entity linking, coreference, and similar tasks.
Attributes:
| Name | Type | Description |
|---|---|---|
relation_id |
str
|
Unique identifier within the item. |
source_span_id |
str
|
ID of the source span. |
target_span_id |
str
|
ID of the target span. |
label |
SpanLabel | None
|
Relation label (reuses SpanLabel for consistency). |
directed |
bool
|
Whether the relation is directed (A->B) or undirected (A--B). |
relation_metadata |
dict[str, MetadataValue]
|
Additional relation-specific metadata. |
validate_relation_id(v: str) -> str
classmethod
¶
Validate relation_id is not empty.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
v
|
str
|
Relation ID to validate. |
required |
Returns:
| Type | Description |
|---|---|
str
|
Validated relation ID. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If relation_id is empty. |
validate_span_ids(v: str) -> str
classmethod
¶
Validate span IDs are not empty.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
v
|
str
|
Span ID to validate. |
required |
Returns:
| Type | Description |
|---|---|
str
|
Validated span ID. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If span ID is empty. |
SpanSpec
¶
Bases: BeadBaseModel
Specification for span labeling behavior.
Configures how spans are displayed, created, and labeled in an experiment. Supports both fixed label sets and Wikidata entity search for both span labels and relation labels.
Attributes:
| Name | Type | Description |
|---|---|---|
index_mode |
SpanIndexMode
|
Whether spans index by token or character position. |
interaction_mode |
SpanInteractionMode
|
"static" for read-only highlights, "interactive" for participant annotation. |
label_source |
LabelSourceType
|
Source of span labels ("fixed" or "wikidata"). |
labels |
list[str] | None
|
Fixed span label set (when label_source is "fixed"). |
label_colors |
dict[str, str] | None
|
CSS colors keyed by label name. |
allow_overlapping |
bool
|
Whether overlapping spans are permitted. |
min_spans |
int | None
|
Minimum number of spans required (interactive mode). |
max_spans |
int | None
|
Maximum number of spans allowed (interactive mode). |
enable_relations |
bool
|
Whether relation annotation is enabled. |
relation_label_source |
LabelSourceType
|
Source of relation labels. |
relation_labels |
list[str] | None
|
Fixed relation label set. |
relation_label_colors |
dict[str, str] | None
|
CSS colors keyed by relation label name. |
relation_directed |
bool
|
Default directionality for new relations. |
min_relations |
int | None
|
Minimum number of relations required (interactive mode). |
max_relations |
int | None
|
Maximum number of relations allowed (interactive mode). |
wikidata_language |
str
|
Language for Wikidata entity search. |
wikidata_entity_types |
list[str] | None
|
Restrict Wikidata search to these entity types. |
wikidata_result_limit |
int
|
Maximum number of Wikidata search results. |
Span Labeling Utilities¶
span_labeling
¶
Utilities for creating span labeling experimental items.
This module provides language-agnostic utilities for creating items with span annotations. Spans can be added to any existing item type (composability) or used as standalone span labeling tasks.
Integration Points
- Active Learning: bead/active_learning/ (via alignment module)
- Deployment: bead/deployment/jspsych/ (span-label plugin)
- Tokenization: bead/tokenization/ (display-level tokens)
tokenize_item(item: Item, tokenizer_config: TokenizerConfig | None = None) -> Item
¶
Tokenize an item's rendered_elements.
Populates tokenized_elements and token_space_after using the
configured tokenizer. Returns a new Item (does not mutate).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
Item
|
Item to tokenize. |
required |
tokenizer_config
|
TokenizerConfig | None
|
Tokenizer configuration. If None, uses default (spaCy English). |
None
|
Returns:
| Type | Description |
|---|---|
Item
|
New item with populated tokenized_elements and token_space_after. |
create_span_item(text: str, spans: list[Span], prompt: str, tokenizer_config: TokenizerConfig | None = None, tokens: list[str] | None = None, labels: list[str] | None = None, span_spec: SpanSpec | None = None, item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create a standalone span labeling item.
Tokenizes text using config, validates span indices against tokens.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
The stimulus text. |
required |
spans
|
list[Span]
|
Pre-defined span annotations. |
required |
prompt
|
str
|
Question or instruction for the participant. |
required |
tokenizer_config
|
TokenizerConfig | None
|
Tokenizer configuration. Ignored if tokens is provided. |
None
|
tokens
|
list[str] | None
|
Pre-tokenized text (overrides tokenizer). |
None
|
labels
|
list[str] | None
|
Fixed label set for span labeling. |
None
|
span_spec
|
SpanSpec | None
|
Span specification. If None, creates a default static spec. |
None
|
item_template_id
|
UUID | None
|
Template ID. If None, generates a new UUID. |
None
|
metadata
|
dict[str, MetadataValue] | None
|
Additional item metadata. |
None
|
Returns:
| Type | Description |
|---|---|
Item
|
Span labeling item. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If text is empty or span indices are out of bounds. |
create_interactive_span_item(text: str, prompt: str, tokenizer_config: TokenizerConfig | None = None, tokens: list[str] | None = None, label_set: list[str] | None = None, label_source: LabelSourceType = 'fixed', item_template_id: UUID | None = None, metadata: dict[str, MetadataValue] | None = None) -> Item
¶
Create an item for interactive span selection by participants.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
The stimulus text. |
required |
prompt
|
str
|
Instruction for the participant. |
required |
tokenizer_config
|
TokenizerConfig | None
|
Tokenizer configuration. |
None
|
tokens
|
list[str] | None
|
Pre-tokenized text (overrides tokenizer). |
None
|
label_set
|
list[str] | None
|
Fixed label set (when label_source is "fixed"). |
None
|
label_source
|
LabelSourceType
|
Label source type ("fixed" or "wikidata"). |
'fixed'
|
item_template_id
|
UUID | None
|
Template ID. If None, generates a new UUID. |
None
|
metadata
|
dict[str, MetadataValue] | None
|
Additional item metadata. |
None
|
Returns:
| Type | Description |
|---|---|
Item
|
Interactive span labeling item (no pre-defined spans). |
add_spans_to_item(item: Item, spans: list[Span], tokenizer_config: TokenizerConfig | None = None, span_spec: SpanSpec | None = None) -> Item
¶
Add span annotations to any existing item.
This is the key composability function: any item (rating, forced choice, binary, etc.) can have spans added as an overlay. Tokenizes rendered_elements if not already tokenized. Returns a new Item.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
Item
|
Existing item to add spans to. |
required |
spans
|
list[Span]
|
Span annotations to add. |
required |
tokenizer_config
|
TokenizerConfig | None
|
Tokenizer configuration (used only if item lacks tokenization). |
None
|
span_spec
|
SpanSpec | None
|
Span specification. |
None
|
Returns:
| Type | Description |
|---|---|
Item
|
New item with spans added. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If span indices are out of bounds. |
create_span_items_from_texts(texts: list[str], span_extractor: Callable[[str, list[str]], list[Span]], prompt: str, tokenizer_config: TokenizerConfig | None = None, labels: list[str] | None = None, item_template_id: UUID | None = None) -> list[Item]
¶
Batch create span items with automatic tokenization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
texts
|
list[str]
|
List of stimulus texts. |
required |
span_extractor
|
Callable[[str, list[str]], list[Span]]
|
Function that takes (text, tokens) and returns spans. |
required |
prompt
|
str
|
Question or instruction for the participant. |
required |
tokenizer_config
|
TokenizerConfig | None
|
Tokenizer configuration. |
None
|
labels
|
list[str] | None
|
Fixed label set. |
None
|
item_template_id
|
UUID | None
|
Shared template ID. If None, generates one per item. |
None
|
Returns:
| Type | Description |
|---|---|
list[Item]
|
Span labeling items. |
Item Construction¶
constructor
¶
Item constructor for building experimental items from templates.
This module provides the ItemConstructor class which transforms filled templates into experimental items by applying model-based constraints and collecting model outputs for analysis.
ItemConstructor
¶
Construct experimental items from filled templates.
Transforms filled templates into items by:
1. Resolving element references to text
2. Computing required model outputs (from constraints)
3. Evaluating constraints with model outputs
4. Creating Item instances with metadata
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_registry
|
ModelAdapterRegistry
|
Registry of model adapters for constraint evaluation. |
required |
cache
|
ModelOutputCache
|
Cache for model outputs to avoid redundant computation. |
required |
constraint_resolver
|
ConstraintResolver | None
|
Resolver for evaluating non-model constraints. If None, only model-based constraints can be evaluated. |
None
|
Attributes:
| Name | Type | Description |
|---|---|---|
model_registry |
ModelAdapterRegistry
|
Registry of model adapters for constraint evaluation. |
cache |
ModelOutputCache
|
Cache for model outputs to avoid redundant computation. |
constraint_resolver |
ConstraintResolver | None
|
Resolver for evaluating non-model constraints. |
Examples:
>>> from bead.items.adapters.registry import default_registry
>>> from bead.items.cache import ModelOutputCache
>>> cache = ModelOutputCache(backend="memory")
>>> constructor = ItemConstructor(default_registry, cache)
>>> constraints = {constraint_id: constraint_obj}
>>> items = list(constructor.construct_items(
... template, filled_templates, constraints
... ))
construct_items(item_template: ItemTemplate, filled_templates: dict[UUID, FilledTemplate], constraints: dict[UUID, Constraint]) -> Iterator[Item]
¶
Construct items from template and filled templates.
For each combination of filled templates:
1. Render elements (resolve filled_template_ref → text)
2. Compute required model outputs (from constraints)
3. Check constraints using model outputs
4. Yield the item if all constraints are satisfied
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item_template
|
ItemTemplate
|
Template defining item structure and constraints. |
required |
filled_templates
|
dict[UUID, FilledTemplate]
|
Map of filled template UUIDs to FilledTemplate instances. |
required |
constraints
|
dict[UUID, Constraint]
|
Map of constraint UUIDs to Constraint objects. |
required |
Yields:
| Type | Description |
|---|---|
Item
|
Constructed items that satisfy all constraints. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If template references missing filled templates or constraints. |
RuntimeError
|
If constraint evaluation or model computation fails. |
Examples:
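A minimal sketch of the typical call, reusing the constructor from the class-level example above (the template, filled-template, and constraint objects are assumed to have been built earlier in the pipeline):
>>> constraints = {c.id: c for c in my_constraints}  # my_constraints is illustrative
>>> filled = {ft.id: ft for ft in my_filled_templates}
>>> for item in constructor.construct_items(item_template, filled, constraints):
...     print(item.id)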
generation
¶
Utilities for generating cross-product items from templates and lexicons.
This module provides language-agnostic utilities for generating items by combining templates with lexical resources in various patterns.
Relationship to ItemConstructor:
- This module (generation.py) generates cross-product combinations of templates × lexical items BEFORE template filling. It creates lightweight Item objects with just template_id, metadata, and unfilled information. Use when you want to systematically explore all combinations of a lexical property (e.g., every verb in every frame).
- ItemConstructor (constructor.py) builds Items FROM ItemTemplates + FilledTemplates, with constraint evaluation and model scoring. It combines filled templates into experimental items, checking multi-slot constraints. Use when you have filled templates and want to construct experimental items with model-based constraint checking.
These modules are complementary, not redundant. Typical pipeline:
1. generation.py: generate the cross-product → unfilled item specifications
2. Template filling: fill template slots → FilledTemplates
3. constructor.py: construct items → Items with constraints checked
create_cross_product_items(templates: list[Template], lexicons: dict[str, Lexicon], *, cross_product_slot: str = 'verb', metadata_extractor: Callable[[Template, LexicalItem], dict[str, MetadataValue]] | None = None, filter_fn: Callable[[Template, LexicalItem], bool] | None = None) -> Iterator[Item]
¶
Generate cross-product items from templates and lexicons.
Creates an item for each combination of template × lexical item from the specified slot's lexicon. This is useful for systematic exploration of a lexical property (e.g., every verb in every frame).
Items are generated lazily via iterator for memory efficiency with large cross-products.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
templates
|
list[Template]
|
Templates to use for generation. |
required |
lexicons
|
dict[str, Lexicon]
|
Lexicons keyed by slot name. |
required |
cross_product_slot
|
str
|
Slot name to vary across items (default: "verb"). This slot's lexicon will be crossed with all templates. |
'verb'
|
metadata_extractor
|
Callable[[Template, LexicalItem], dict[str, MetadataValue]] | None
|
Optional function to extract metadata from template and lexical item. Receives (template, lexical_item) and returns dict for item_metadata. |
None
|
filter_fn
|
Callable[[Template, LexicalItem], bool] | None
|
Optional filter function. Receives (template, lexical_item) and returns True to include, False to skip. |
None
|
Yields:
| Type | Description |
|---|---|
Item
|
Items representing template × lexical item combinations. |
Examples:
Basic verb × template cross-product:
>>> from uuid import uuid4
>>> templates = [
... Template(
... name="transitive",
... template_string="{subject} {verb} {object}.",
... slots={}
... )
... ]
>>> verb_lex = Lexicon(name="verbs")
>>> verb_lex.add(LexicalItem(lemma="walk"))
>>> verb_lex.add(LexicalItem(lemma="eat"))
>>> lexicons = {"verb": verb_lex}
>>> items = list(create_cross_product_items(templates, lexicons))
>>> len(items)
2
With metadata extraction:
>>> def extract_metadata(template, item):
... return {
... "verb_lemma": item.lemma,
... "template_name": template.name,
... "verb_pos": item.pos
... }
>>> items = list(create_cross_product_items(
... templates,
... lexicons,
... metadata_extractor=extract_metadata
... ))
With filtering:
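A minimal sketch (the predicate below is illustrative, not part of the API):
>>> def only_transitive_frames(template, item):
...     # Keep only combinations whose template is a transitive frame
...     return "transitive" in template.name
>>> items = list(create_cross_product_items(
...     templates,
...     lexicons,
...     filter_fn=only_transitive_frames
... ))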
create_filtered_cross_product_items(templates: list[Template], lexicons: dict[str, Lexicon], *, cross_product_slot: str = 'verb', template_filter: Callable[[Template], bool] | None = None, item_filter: Callable[[LexicalItem], bool] | None = None, combination_filter: Callable[[Template, LexicalItem], bool] | None = None, metadata_extractor: Callable[[Template, LexicalItem], dict[str, MetadataValue]] | None = None) -> Iterator[Item]
¶
Generate cross-product items with multiple filter levels.
Provides separate filters for templates, lexical items, and their combinations, offering more control than the basic cross-product function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
templates
|
list[Template]
|
Templates to use for generation. |
required |
lexicons
|
dict[str, Lexicon]
|
Lexicons keyed by slot name. |
required |
cross_product_slot
|
str
|
Slot name to vary across items. |
'verb'
|
template_filter
|
Callable[[Template], bool] | None
|
Filter for templates (applied before cross-product). |
None
|
item_filter
|
Callable[[LexicalItem], bool] | None
|
Filter for lexical items (applied before cross-product). |
None
|
combination_filter
|
Callable[[Template, LexicalItem], bool] | None
|
Filter for combinations (applied during generation). |
None
|
metadata_extractor
|
Callable[[Template, LexicalItem], dict[str, MetadataValue]] | None
|
Metadata extraction function. |
None
|
Yields:
| Type | Description |
|---|---|
Item
|
Filtered cross-product items. |
Examples:
Filter at multiple levels:
>>> def template_filter(t):
... return "transitive" in t.name
>>> def item_filter(i):
... return i.pos == "VERB"
>>> def combination_filter(t, i):
... # Only combine if verb is compatible with template
... return True
>>> items = list(create_filtered_cross_product_items(
... templates,
... lexicons,
... template_filter=template_filter,
... item_filter=item_filter,
... combination_filter=combination_filter
... ))
create_stratified_cross_product_items(templates: list[Template], lexicons: dict[str, Lexicon], *, cross_product_slot: str = 'verb', stratify_by: Callable[[LexicalItem], str], items_per_stratum: int, metadata_extractor: Callable[[Template, LexicalItem], dict[str, MetadataValue]] | None = None) -> Iterator[Item]
¶
Generate stratified sample of cross-product items.
Instead of full cross-product, samples a fixed number of lexical items from each stratum (defined by stratify_by function) and crosses them with all templates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
templates
|
list[Template]
|
Templates to use for generation. |
required |
lexicons
|
dict[str, Lexicon]
|
Lexicons keyed by slot name. |
required |
cross_product_slot
|
str
|
Slot name to vary across items. |
'verb'
|
stratify_by
|
Callable[[LexicalItem], str]
|
Function to extract stratum key from lexical items. |
required |
items_per_stratum
|
int
|
Number of items to sample from each stratum. |
required |
metadata_extractor
|
Callable[[Template, LexicalItem], dict[str, MetadataValue]] | None
|
Metadata extraction function. |
None
|
Yields:
| Type | Description |
|---|---|
Item
|
Stratified cross-product items. |
Examples:
Sample verbs stratified by frequency:
>>> def stratify_by_frequency(item):
... freq = item.attributes.get("frequency", 0)
... if freq > 1000:
... return "high"
... elif freq > 100:
... return "medium"
... else:
... return "low"
>>> items = list(create_stratified_cross_product_items(
... templates,
... lexicons,
... stratify_by=stratify_by_frequency,
... items_per_stratum=10
... ))
items_to_jsonl(items: Iterator[Item], output_path: str, progress_interval: int = 1000) -> int
¶
Write iterator of items to JSONL file with progress tracking.
Utility function for efficient streaming write of large item sets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
items
|
Iterator[Item]
|
Items to write. |
required |
output_path
|
str
|
Path to output JSONL file. |
required |
progress_interval
|
int
|
Print progress every N items (default: 1000). |
1000
|
Returns:
| Type | Description |
|---|---|
int
|
Number of items written. |
Examples:
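A sketch of streaming a generated cross-product to disk (the output path is illustrative):
>>> items = create_cross_product_items(templates, lexicons)
>>> n = items_to_jsonl(items, "items.jsonl", progress_interval=500)
>>> print(f"wrote {n} items")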
Validation and Scoring¶
validation
¶
Validation utilities for constructed items.
This module provides validation functions to ensure constructed items meet all requirements and contain complete, valid data.
validate_item(item: Item, item_template: ItemTemplate) -> list[str]
¶
Validate a constructed item against its template.
Check that the item has all required fields, references valid templates, has consistent constraint satisfaction, and contains valid model outputs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
Item
|
Item to validate. |
required |
item_template
|
ItemTemplate
|
Template the item was constructed from. |
required |
Returns:
| Type | Description |
|---|---|
list[str]
|
List of validation error messages. Empty list if valid. |
Examples:
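A sketch of the validate-and-report pattern (item and item_template assumed from the construction step):
>>> errors = validate_item(item, item_template)
>>> if errors:
...     for msg in errors:
...         print(msg)
... else:
...     print("item is valid")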
validate_model_output(output: ModelOutput) -> list[str]
¶
Validate a model output.
Check that the model output has all required fields and valid values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
output
|
ModelOutput
|
Model output to validate. |
required |
Returns:
| Type | Description |
|---|---|
list[str]
|
List of validation error messages. Empty list if valid. |
Examples:
validate_constraint_satisfaction(item: Item, item_template: ItemTemplate) -> list[str]
¶
Validate constraint satisfaction consistency.
Check that all constraints in the template have been evaluated and that the results are boolean values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
Item
|
Item to validate. |
required |
item_template
|
ItemTemplate
|
Template with constraints. |
required |
Returns:
| Type | Description |
|---|---|
list[str]
|
List of validation error messages. Empty list if valid. |
Examples:
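A sketch, assuming item and item_template from the construction step:
>>> errors = validate_constraint_satisfaction(item, item_template)
>>> assert not errors, errors  # empty list means all constraints were evaluated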
validate_metadata_completeness(item: Item) -> list[str]
¶
Validate that item metadata is complete.
Check that the item has all expected metadata fields populated. Since Item inherits from BeadBaseModel, id, created_at, and modified_at are always present. This function is kept for consistency and future extensibility.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
Item
|
Item to validate. |
required |
Returns:
| Type | Description |
|---|---|
list[str]
|
List of validation error messages. Empty list if valid. |
Examples:
item_passes_all_constraints(item: Item) -> bool
¶
Check if item satisfies all constraints.
Convenience function to check if all constraints are satisfied.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
Item
|
Item to check. |
required |
Returns:
| Type | Description |
|---|---|
bool
|
True if all constraints satisfied, False otherwise. |
Examples:
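A sketch of filtering constructed items down to those that pass (items assumed constructed earlier):
>>> passing = [it for it in items if item_passes_all_constraints(it)]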
get_task_type_requirements(task_type: TaskType) -> dict[str, list[str] | str]
¶
Get validation requirements for a task type.
Returns a dictionary describing the structural requirements for items of the specified task type. Useful for introspection, error messages, and documentation generation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
task_type
|
TaskType
|
Task type to get requirements for. |
required |
Returns:
| Type | Description |
|---|---|
dict
|
Requirements specification with keys:
- required_rendered_keys: list of required rendered_elements keys
- required_metadata_keys: list of required item_metadata keys
- optional_metadata_keys: list of optional item_metadata keys
- special_fields: list of special fields (e.g., ["unfilled_slots"])
- description: human-readable description |
Examples:
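A sketch of introspecting a task type (the FORCED_CHOICE member name is an assumption for illustration):
>>> from bead.items.item_template import TaskType
>>> reqs = get_task_type_requirements(TaskType.FORCED_CHOICE)
>>> print(reqs["description"])
>>> print(reqs["required_rendered_keys"])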
validate_item_for_task_type(item: Item, task_type: TaskType) -> bool
¶
Validate that an Item's structure matches requirements for a task type.
Checks that the item has the required rendered_elements keys, item_metadata keys, and special fields for the specified task type. Raises descriptive ValueError if validation fails.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
Item
|
Item to validate. |
required |
task_type
|
TaskType
|
Expected task type (from bead.items.item_template.TaskType). |
required |
Returns:
| Type | Description |
|---|---|
bool
|
True if valid. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If item structure doesn't match task type requirements, with detailed explanation of what's wrong. |
Examples:
infer_task_type_from_item(item: Item) -> TaskType
¶
Infer most likely task type from Item structure.
Examines the item's rendered_elements, item_metadata, and special fields to determine which task type it matches. Uses priority order to handle ambiguous cases.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
Item
|
Item to infer from. |
required |
Returns:
| Type | Description |
|---|---|
TaskType
|
Inferred task type. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If item structure doesn't match any task type or is ambiguous. |
Examples:
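A sketch of round-tripping inference through validation (item assumed to exist):
>>> task_type = infer_task_type_from_item(item)
>>> assert validate_item_for_task_type(item, task_type)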
scoring
¶
Abstract base classes for item scoring with language models.
This module provides language-agnostic base classes for scoring items using various metrics (log probability, perplexity, embeddings).
ItemScorer
¶
Bases: ABC
Abstract base class for item scoring.
ItemScorer provides a framework for assigning numeric scores to items based on various criteria (language model probability, acceptability, similarity, etc.).
Examples:
Implementing a custom scorer:
>>> class AcceptabilityScorer(ItemScorer):
... def score(self, item):
... # Score based on some acceptability metric
... text = item.rendered_elements.get("text", "")
... return self._compute_acceptability(text)
...
... def score_batch(self, items):
... return [self.score(item) for item in items]
score(item: Item) -> float
abstractmethod
¶
Compute score for a single item.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
Item
|
Item to score. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Numeric score for the item. |
score_batch(items: list[Item]) -> list[float]
¶
Compute scores for multiple items.
Default implementation calls score() for each item sequentially. Subclasses can override for batch processing optimization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
items
|
list[Item]
|
Items to score. |
required |
Returns:
| Type | Description |
|---|---|
list[float]
|
Scores for each item. |
Examples:
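A sketch using the AcceptabilityScorer from the class-level example (the default implementation scores items one at a time):
>>> scorer = AcceptabilityScorer()
>>> scores = scorer.score_batch(items)
>>> assert len(scores) == len(items)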
score_with_metadata(items: list[Item]) -> dict[UUID, dict[str, float | str]]
¶
Score items and return results with metadata.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
items
|
list[Item]
|
Items to score. |
required |
Returns:
| Type | Description |
|---|---|
dict[UUID, dict[str, float | str]]
|
Dictionary mapping item UUIDs to score dictionaries. Each score dict contains at least a "score" key. |
Examples:
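A sketch of iterating the returned mapping (scorer and items assumed from above):
>>> results = scorer.score_with_metadata(items)
>>> for item_id, metrics in results.items():
...     print(item_id, metrics["score"])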
LanguageModelScorer
¶
Bases: ItemScorer
Scorer using language model log probabilities.
Scores items based on their log probability under a language model. Uses HuggingFace adapters for model inference and supports caching.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
HuggingFace model identifier (e.g., "gpt2", "gpt2-medium"). |
required |
cache_dir
|
Path | str | None
|
Directory for caching model outputs. If None, no caching. |
None
|
device
|
str
|
Device to run model on ("cpu", "cuda", "mps"). |
'cpu'
|
text_key
|
str
|
Key in item.rendered_elements to use as text (default: "text"). |
'text'
|
model_version
|
str
|
Version string for cache tracking. |
'unknown'
|
Examples:
>>> from pathlib import Path
>>> scorer = LanguageModelScorer(
... model_name="gpt2",
... cache_dir=Path(".cache"),
... device="cpu"
... )
>>> score = scorer.score(item)
>>> score < 0 # Log probabilities are negative
True
model: HuggingFaceLanguageModel
property
¶
Get the model, loading if necessary.
Returns:
| Type | Description |
|---|---|
HuggingFaceLanguageModel
|
The language model adapter. |
score(item: Item) -> float
¶
Compute log probability score for an item.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
Item
|
Item to score. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Log probability of the item's text under the language model. |
Raises:
| Type | Description |
|---|---|
KeyError
|
If text_key not found in item.rendered_elements. |
score_batch(items: list[Item], batch_size: int | None = None) -> list[float]
¶
Compute scores for multiple items efficiently using batched inference.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
items
|
list[Item]
|
Items to score. |
required |
batch_size
|
int | None
|
Number of items to process in each batch. If None, automatically infers optimal batch size based on available resources. |
None
|
Returns:
| Type | Description |
|---|---|
list[float]
|
Log probabilities for each item. |
score_with_metadata(items: list[Item]) -> dict[UUID, dict[str, float | str]]
¶
Score items and return results with additional metrics.
Returns log probability and perplexity for each item.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
items
|
list[Item]
|
Items to score. |
required |
Returns:
| Type | Description |
|---|---|
dict[UUID, dict[str, float | str]]
|
Dictionary with "score" (log prob) and "perplexity" for each item. |
ForcedChoiceScorer
¶
Bases: ItemScorer
Scorer for N-AFC (forced-choice) items with multiple options.
Computes comparison scores for forced-choice items by scoring each option and applying a comparison function (e.g., max difference, variance, entropy).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
base_scorer
|
ItemScorer
|
Base scorer to use for individual options. |
required |
comparison_fn
|
callable | None
|
Function that takes a list of scores and returns a comparison metric. Default is the standard deviation of the scores. |
None
|
option_prefix
|
str
|
Prefix for option names in rendered_elements (default: "option"). |
'option'
|
Examples:
>>> base = LanguageModelScorer("gpt2", device="cpu")
>>> fc_scorer = ForcedChoiceScorer(
... base_scorer=base,
... comparison_fn=lambda scores: max(scores) - min(scores) # Range
... )
>>> # Item with option_a, option_b, option_c, ...
>>> score = fc_scorer.score(forced_choice_item)
score(item: Item) -> float
¶
Score a forced-choice item.
Extracts all options from item.rendered_elements (option_a, option_b, ...), scores each option, and applies comparison function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
item
|
Item
|
Forced-choice item with multiple options. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Comparison score across all options. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If item doesn't contain option elements or has precomputed scores. |
Model Output Cache¶
cache
¶
Content-addressable cache for judgment model outputs.
This module provides caching infrastructure for model outputs during item construction. It supports multiple backends (filesystem, in-memory) and various operation types including log probabilities, NLI scores, embeddings, and similarity metrics.
Note: This cache is distinct from bead.templates.adapters.cache, which handles MLM predictions for template filling. This module caches judgment model outputs used in item construction.
CacheBackend
¶
Bases: ABC
Abstract base class for cache backends.
Defines the interface that all cache backends must implement.
get(key: str) -> dict[str, object] | None
abstractmethod
¶
Retrieve cache entry by key.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
key
|
str
|
Cache key to retrieve. |
required |
Returns:
| Type | Description |
|---|---|
dict[str, object] | None
|
Cache entry data if found, None otherwise. |
set(key: str, data: dict[str, object]) -> None
abstractmethod
¶
Store cache entry with key.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
key
|
str
|
Cache key. |
required |
data
|
dict[str, object]
|
Cache entry data to store. |
required |
delete(key: str) -> None
abstractmethod
¶
Delete cache entry by key.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
key
|
str
|
Cache key to delete. |
required |
clear() -> None
abstractmethod
¶
Clear all cache entries.
keys() -> list[str]
abstractmethod
¶
Return all cache keys.
Returns:
| Type | Description |
|---|---|
list[str]
|
List of all cache keys in the backend. |
FilesystemBackend
¶
Bases: CacheBackend
Filesystem-based cache backend.
Stores each cache entry as a separate JSON file with the cache key as the filename.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cache_dir
|
Path
|
Directory for cache storage. |
required |
Attributes:
| Name | Type | Description |
|---|---|---|
cache_dir |
Path
|
Directory where cache files are stored. |
Examples:
>>> from pathlib import Path
>>> backend = FilesystemBackend(cache_dir=Path(".cache"))
>>> backend.set("abc123", {"result": 42})
>>> backend.get("abc123")
{'result': 42}
get(key: str) -> dict[str, object] | None
¶
Retrieve cache entry from filesystem.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
key
|
str
|
Cache key. |
required |
Returns:
| Type | Description |
|---|---|
dict[str, object] | None
|
Cache entry data if found, None otherwise. |
set(key: str, data: dict[str, object]) -> None
¶
Store cache entry to filesystem.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
key
|
str
|
Cache key. |
required |
data
|
dict[str, object]
|
Cache entry data. |
required |
delete(key: str) -> None
¶
Delete cache entry from filesystem.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
key
|
str
|
Cache key to delete. |
required |
clear() -> None
¶
Clear all cache entries from filesystem.
keys() -> list[str]
¶
Return all cache keys from filesystem.
Returns:
| Type | Description |
|---|---|
list[str]
|
List of cache keys (filenames without .json extension). |
InMemoryBackend
¶
Bases: CacheBackend
In-memory cache backend.
Stores cache entries in a dictionary. No persistence across program runs. Useful for testing and temporary caching scenarios.
Examples:
>>> backend = InMemoryBackend()
>>> backend.set("xyz789", {"result": 3.14})
>>> backend.get("xyz789")
{'result': 3.14}
get(key: str) -> dict[str, object] | None
¶
Retrieve cache entry from memory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
key
|
str
|
Cache key. |
required |
Returns:
| Type | Description |
|---|---|
dict[str, object] | None
|
Cache entry data if found, None otherwise. |
set(key: str, data: dict[str, object]) -> None
¶
Store cache entry in memory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
key
|
str
|
Cache key. |
required |
data
|
dict[str, object]
|
Cache entry data. |
required |
delete(key: str) -> None
¶
Delete cache entry from memory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
key
|
str
|
Cache key to delete. |
required |
clear() -> None
¶
Clear all cache entries from memory.
keys() -> list[str]
¶
Return all cache keys from memory.
Returns:
| Type | Description |
|---|---|
list[str]
|
List of cache keys. |
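Both backends implement the same abstract interface, so callers can swap them freely. A minimal sketch of that shared contract, with a dictionary-backed implementation mirroring InMemoryBackend (names here are illustrative, not the library's):

```python
from abc import ABC, abstractmethod


class CacheBackendSketch(ABC):
    """Illustrative version of the CacheBackend interface."""

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def set(self, key, data): ...

    @abstractmethod
    def delete(self, key): ...

    @abstractmethod
    def clear(self): ...

    @abstractmethod
    def keys(self): ...


class MiniInMemoryBackend(CacheBackendSketch):
    """Dictionary-backed backend: fast, but nothing survives the process."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, data):
        self._store[key] = data

    def delete(self, key):
        self._store.pop(key, None)

    def clear(self):
        self._store.clear()

    def keys(self):
        return list(self._store)


backend = MiniInMemoryBackend()
backend.set("xyz789", {"result": 3.14})
print(backend.get("xyz789"))  # {'result': 3.14}
```

Code written against the abstract interface works identically with the filesystem backend; only persistence differs.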
ModelOutputCache
¶
Content-addressable cache for judgment model outputs.
Caches results from various model operations to avoid redundant computation. Supports multiple operation types including log probabilities, perplexity, NLI scores, embeddings, and similarity metrics.
Cache keys are automatically generated using SHA-256 hashing of the model name, operation type, and all input parameters, ensuring deterministic cache hits for identical inputs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cache_dir
|
Path | None
|
Directory for cache files (filesystem backend only). Defaults to ~/.cache/bead/models if not specified. |
None
|
backend
|
('filesystem', 'memory')
|
Cache backend type. "filesystem" persists across runs, "memory" is ephemeral. |
"filesystem"
|
enabled
|
bool
|
Whether caching is enabled. |
True
|
Attributes:
| Name | Type | Description |
|---|---|---|
enabled |
bool
|
Whether caching is enabled. When False, all operations are no-ops. |
Examples:
Basic usage with filesystem backend:
>>> from pathlib import Path
>>> cache = ModelOutputCache(cache_dir=Path(".cache"))
>>> result = cache.get("gpt2", "log_probability", text="Hello world")
>>> if result is None:
... result = -2.5
... cache.set("gpt2", "log_probability", result, text="Hello world")
Caching NLI scores:
>>> nli_scores = cache.get("roberta-nli", "nli",
... premise="Mary loves books",
... hypothesis="Mary enjoys reading")
>>> if nli_scores is None:
... nli_scores = {"entailment": 0.9, "neutral": 0.08, "contradiction": 0.02}
... cache.set("roberta-nli", "nli", nli_scores,
... premise="Mary loves books", hypothesis="Mary enjoys reading")
Caching embeddings:
>>> import numpy as np
>>> embedding = cache.get("bert-base", "embedding", text="Hello")
>>> if embedding is None:
... embedding = np.random.rand(768)
... cache.set("bert-base", "embedding", embedding, text="Hello")
generate_cache_key(model_name: str, operation: str, **inputs: str | int | float | bool | None) -> str
¶
Generate deterministic cache key from inputs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
Model identifier. |
required |
operation
|
str
|
Operation type (e.g., "log_probability", "embedding"). |
required |
**inputs
|
str | int | float | bool | None
|
Input parameters for the operation (text, premise, hypothesis). |
{}
|
Returns:
| Type | Description |
|---|---|
str
|
SHA-256 hex digest as cache key. |
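The key-generation scheme can be sketched as follows. The exact serialization the library uses may differ; the essential properties shown here are that input order never affects the key and that any change to an input produces a different key.

```python
import hashlib
import json


def make_cache_key(model_name, operation, **inputs):
    # Sort all keys during serialization so keyword order never changes the hash.
    payload = json.dumps(
        {"model": model_name, "operation": operation, "inputs": inputs},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = make_cache_key("gpt2", "log_probability", text="Hello world")
key_b = make_cache_key("gpt2", "log_probability", text="Hello world")
key_c = make_cache_key("gpt2", "log_probability", text="Hello there")
print(key_a == key_b)  # True: identical inputs hash to the same key
print(key_a == key_c)  # False: any input change yields a new key
```

Deterministic serialization is what makes the cache content-addressable: the same computation always maps to the same file or dictionary entry.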
get(model_name: str, operation: str, **inputs: str | int | float | bool | None) -> Any
¶
Retrieve cached result.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
Model identifier. |
required |
operation
|
str
|
Operation type (e.g., "log_probability", "nli", "embedding"). |
required |
**inputs
|
str | int | float | bool | None
|
Input parameters for the operation (text, premise, hypothesis). |
{}
|
Returns:
| Type | Description |
|---|---|
Any
|
Cached result if found, None otherwise. |
set(model_name: str, operation: str, result: float | dict[str, float] | list[float] | np.ndarray, model_version: str | None = None, **inputs: str | int | float | bool | None) -> None
¶
Store result in cache.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
Model identifier. |
required |
operation
|
str
|
Operation type (e.g., "log_probability", "nli", "embedding"). |
required |
result
|
float | dict[str, float] | list[float] | ndarray
|
Result to cache (log probability, NLI scores, embedding, etc.). |
required |
model_version
|
str | None
|
Optional model version string for tracking. |
None
|
**inputs
|
str | int | float | bool | None
|
Input parameters for the operation (text, premise, hypothesis). |
{}
|
invalidate(model_name: str, operation: str, **inputs: str | int | float | bool | None) -> None
¶
Invalidate specific cache entry.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
Model identifier. |
required |
operation
|
str
|
Operation type. |
required |
**inputs
|
str | int | float | bool | None
|
Input parameters for the operation. |
{}
|
clear_model(model_name: str) -> None
¶
Clear all cache entries for a specific model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
Model identifier. |
required |
clear() -> None
¶
Clear all cache entries.
Model Adapters¶
base
¶
Base class for model adapters used in item construction.
This module defines the abstract ModelAdapter interface that all model adapters must implement to support judgment prediction operations during Stage 3 (Item Construction).
This is SEPARATE from template filling model adapters (bead.templates.models.adapter), which are used in Stage 2.
ModelAdapter
¶
Bases: ABC
Base class for model adapters used in item construction.
All model adapters must implement this interface to support judgment prediction operations during Stage 3 (Item Construction).
This is SEPARATE from template filling model adapters (bead.templates.models.adapter), which are used in Stage 2.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
Model identifier (e.g., "gpt2", "roberta-large-mnli"). |
required |
cache
|
ModelOutputCache
|
Cache instance for storing model outputs. |
required |
model_version
|
str
|
Version of the model for cache tracking. |
'unknown'
|
Attributes:
| Name | Type | Description |
|---|---|---|
model_name |
str
|
Model identifier (e.g., "gpt2", "roberta-large-mnli"). |
model_version |
str
|
Version of the model. |
cache |
ModelOutputCache
|
Cache for model outputs. |
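The contract that unsupported operations raise NotImplementedError can be illustrated with a hypothetical adapter that only supports embeddings. This is a sketch of the pattern, not a real adapter; the toy "embedding" stands in for a model forward pass.

```python
class EmbeddingOnlyAdapter:
    """Hypothetical adapter supporting only embeddings."""

    def __init__(self, model_name):
        self.model_name = model_name

    def compute_log_probability(self, text):
        # Unsupported operations raise rather than return a dummy value.
        raise NotImplementedError(
            f"{self.model_name} does not provide log probabilities"
        )

    def get_embedding(self, text):
        # Toy deterministic "embedding": character-code features.
        return [float(ord(c)) for c in text[:4]]


adapter = EmbeddingOnlyAdapter("toy-encoder")
print(len(adapter.get_embedding("test")))  # 4
```

Raising NotImplementedError lets constraint-evaluation code detect capability gaps early instead of silently producing meaningless scores.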
compute_log_probability(text: str) -> float
abstractmethod
¶
Compute log probability of text under language model.
Required for language model constraints. Should raise NotImplementedError if not supported by model type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to compute log probability for. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Log probability of the text. |
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
If this operation is not supported by the model type. |
compute_perplexity(text: str) -> float
abstractmethod
¶
Compute perplexity of text.
Required for complexity-based filtering. Should raise NotImplementedError if not supported by model type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to compute perplexity for. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Perplexity of the text (must be positive). |
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
If this operation is not supported by the model type. |
get_embedding(text: str) -> np.ndarray[tuple[int, ...], np.dtype[np.float64]]
abstractmethod
¶
Get embedding vector for text.
Required for similarity computations and semantic clustering. Should raise NotImplementedError if not supported by model type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to embed. |
required |
Returns:
| Type | Description |
|---|---|
ndarray
|
Embedding vector for the text. |
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
If this operation is not supported by the model type. |
compute_nli(premise: str, hypothesis: str) -> dict[str, float]
abstractmethod
¶
Compute natural language inference scores.
Must return dict with keys: "entailment", "neutral", "contradiction". Required for inference-based constraints. Should raise NotImplementedError if not supported by model type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
premise
|
str
|
Premise text. |
required |
hypothesis
|
str
|
Hypothesis text. |
required |
Returns:
| Type | Description |
|---|---|
dict[str, float]
|
Dictionary with keys "entailment", "neutral", "contradiction" mapping to probability scores that sum to ~1.0. |
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
If this operation is not supported by the model type. |
compute_similarity(text1: str, text2: str) -> float
¶
Compute similarity between two texts.
Default implementation using cosine similarity of embeddings. Can be overridden for specialized similarity computation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text1
|
str
|
First text. |
required |
text2
|
str
|
Second text. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Similarity score in [-1, 1] (cosine similarity). |
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
If embeddings are not supported by the model type. |
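The default similarity computation is plain cosine similarity over the two embedding vectors, which can be written out directly:

```python
import math


def cosine_similarity(vec1, vec2):
    # cos(theta) = (v1 . v2) / (|v1| * |v2|)
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    return dot / (norm1 * norm2)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 (same direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # -1.0 (opposite)
```

The adapter obtains the vectors from get_embedding(), so the [-1, 1] range follows directly from the cosine formula.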
get_nli_label(premise: str, hypothesis: str) -> str
¶
Get predicted NLI label (max score).
Default implementation using argmax over compute_nli() scores.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
premise
|
str
|
Premise text. |
required |
hypothesis
|
str
|
Hypothesis text. |
required |
Returns:
| Type | Description |
|---|---|
str
|
Predicted label: "entailment", "neutral", or "contradiction". |
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
If NLI is not supported by the model type. |
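The argmax step reduces to picking the dictionary key with the highest score:

```python
def nli_label(scores):
    # Pick the relation with the highest probability.
    return max(scores, key=scores.get)


scores = {"entailment": 0.9, "neutral": 0.08, "contradiction": 0.02}
print(nli_label(scores))  # entailment
```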
huggingface
¶
HuggingFace model adapters for language models and NLI.
This module provides adapters for HuggingFace Transformers models:
- HuggingFaceLanguageModel: Causal LMs (GPT-2, GPT-Neo, Llama, Mistral)
- HuggingFaceMaskedLanguageModel: Masked LMs (BERT, RoBERTa, ALBERT)
- HuggingFaceNLI: NLI models (RoBERTa-MNLI, DeBERTa-MNLI, BART-MNLI)
HuggingFaceLanguageModel
¶
Bases: HuggingFaceAdapterMixin, ModelAdapter
Adapter for HuggingFace causal language models.
Supports models like GPT-2, GPT-Neo, Llama, Mistral, and other autoregressive (left-to-right) language models.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
HuggingFace model identifier (e.g., "gpt2", "gpt2-medium"). |
required |
cache
|
ModelOutputCache
|
Cache instance for storing model outputs. |
required |
device
|
('cpu', 'cuda', 'mps')
|
Device to run model on. Falls back to CPU if device unavailable. |
"cpu"
|
model_version
|
str
|
Version string for cache tracking. |
'unknown'
|
Examples:
>>> from pathlib import Path
>>> from bead.items.cache import ModelOutputCache
>>> cache = ModelOutputCache(cache_dir=Path(".cache"))
>>> model = HuggingFaceLanguageModel("gpt2", cache, device="cpu")
>>> log_prob = model.compute_log_probability("The cat sat on the mat.")
>>> perplexity = model.compute_perplexity("The cat sat on the mat.")
>>> embedding = model.get_embedding("The cat sat on the mat.")
model: PreTrainedModel
property
¶
Get the model, loading if necessary.
tokenizer: PreTrainedTokenizerBase
property
¶
Get the tokenizer, loading if necessary.
compute_log_probability(text: str) -> float
¶
Compute log probability of text under language model.
Uses the model's loss with labels=input_ids to compute the negative log-likelihood of the text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to compute log probability for. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Log probability of the text. |
compute_log_probability_batch(texts: list[str], batch_size: int | None = None) -> list[float]
¶
Compute log probabilities for multiple texts efficiently.
Uses batched tokenization and inference for significant speedup. Checks the cache before computing and processes only uncached texts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
texts
|
list[str]
|
Texts to compute log probabilities for. |
required |
batch_size
|
int | None
|
Number of texts to process in each batch. If None, automatically infers optimal batch size based on available device memory and model size. |
None
|
Returns:
| Type | Description |
|---|---|
list[float]
|
Log probabilities for each text, in the same order as input. |
Examples:
>>> texts = ["The cat sat on the mat.", "The dog ran."]
>>> log_probs = model.compute_log_probability_batch(texts, batch_size=2)
>>> len(log_probs) == len(texts)
True
compute_perplexity(text: str) -> float
¶
Compute perplexity of text.
Perplexity is exp(average negative log-likelihood per token).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to compute perplexity for. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Perplexity of the text (positive value). |
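Since the model's loss is the mean negative log-likelihood per token, log probability and perplexity both follow from it by simple arithmetic. The loss value below is hypothetical; the relations are the documented ones.

```python
import math


def log_prob_from_loss(mean_loss, num_tokens):
    # Total log probability is -(mean NLL per token) * number of tokens.
    return -mean_loss * num_tokens


def perplexity_from_loss(mean_loss):
    # Perplexity is exp of the mean negative log-likelihood per token.
    return math.exp(mean_loss)


mean_loss = 2.5  # hypothetical per-token loss for an 8-token sentence
print(log_prob_from_loss(mean_loss, 8))  # -20.0
print(perplexity_from_loss(mean_loss))   # ~12.18
```

Lower loss means higher log probability and lower perplexity; perplexity is always positive because exp() is.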
get_embedding(text: str) -> np.ndarray
¶
Get embedding vector for text.
Uses mean pooling of last hidden states as the text embedding.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to embed. |
required |
Returns:
| Type | Description |
|---|---|
ndarray
|
Embedding vector for the text. |
compute_nli(premise: str, hypothesis: str) -> dict[str, float]
¶
Compute natural language inference scores.
Not supported for causal language models.
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
Always raised, as causal LMs don't support NLI directly. |
HuggingFaceMaskedLanguageModel
¶
Bases: HuggingFaceAdapterMixin, ModelAdapter
Adapter for HuggingFace masked language models.
Supports models like BERT, RoBERTa, ALBERT, and other masked language models (MLMs).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
HuggingFace model identifier (e.g., "bert-base-uncased"). |
required |
cache
|
ModelOutputCache
|
Cache instance for storing model outputs. |
required |
device
|
('cpu', 'cuda', 'mps')
|
Device to run model on. Falls back to CPU if device unavailable. |
"cpu"
|
model_version
|
str
|
Version string for cache tracking. |
'unknown'
|
Examples:
>>> from pathlib import Path
>>> from bead.items.cache import ModelOutputCache
>>> cache = ModelOutputCache(cache_dir=Path(".cache"))
>>> model = HuggingFaceMaskedLanguageModel("bert-base-uncased", cache)
>>> log_prob = model.compute_log_probability("The cat sat on the mat.")
>>> embedding = model.get_embedding("The cat sat on the mat.")
model: PreTrainedModel
property
¶
Get the model, loading if necessary.
tokenizer: PreTrainedTokenizerBase
property
¶
Get the tokenizer, loading if necessary.
compute_log_probability(text: str) -> float
¶
Compute log probability of text using pseudo-log-likelihood.
For MLMs, we use pseudo-log-likelihood: mask each token in turn and sum the log probability of predicting each masked token from its context.
This requires one forward pass per token, so it is computationally expensive and caching is critical.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to compute log probability for. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Pseudo-log-probability of the text. |
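The masking loop can be sketched independently of any real model. Here `score_masked_token` is a hypothetical stand-in for an MLM forward pass; the toy scorer exists only to make the loop runnable.

```python
def pseudo_log_likelihood(tokens, score_masked_token):
    """Sum log P(token_i | all other tokens), masking one position at a time.

    score_masked_token is a hypothetical stand-in for an MLM forward pass:
    it receives the sequence with one position masked and returns the log
    probability of the original token at that position.
    """
    total = 0.0
    for i, original in enumerate(tokens):
        masked = tokens[:i] + ["[MASK]"] + tokens[i + 1:]
        total += score_masked_token(masked, i, original)
    return total


def toy_scorer(masked, position, original):
    # Pretend every masked token gets log probability -1.0.
    return -1.0


print(pseudo_log_likelihood(["the", "cat", "sat"], toy_scorer))  # -3.0
```

The loop structure makes the cost obvious: a sentence of n tokens requires n forward passes, which is why cached results matter so much here.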
compute_perplexity(text: str) -> float
¶
Compute perplexity based on pseudo-log-likelihood.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to compute perplexity for. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Perplexity of the text (positive value). |
get_embedding(text: str) -> np.ndarray
¶
Get embedding vector for text.
Uses the [CLS] token embedding from the last layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to embed. |
required |
Returns:
| Type | Description |
|---|---|
ndarray
|
Embedding vector for the text. |
compute_nli(premise: str, hypothesis: str) -> dict[str, float]
¶
Compute natural language inference scores.
Not supported for masked language models.
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
Always raised, as MLMs don't support NLI directly. |
HuggingFaceNLI
¶
Bases: HuggingFaceAdapterMixin, ModelAdapter
Adapter for HuggingFace NLI models.
Supports NLI models trained on MNLI and similar datasets (e.g., "roberta-large-mnli", "microsoft/deberta-base-mnli").
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
HuggingFace model identifier for NLI model. |
required |
cache
|
ModelOutputCache
|
Cache instance for storing model outputs. |
required |
device
|
('cpu', 'cuda', 'mps')
|
Device to run model on. Falls back to CPU if device unavailable. |
"cpu"
|
model_version
|
str
|
Version string for cache tracking. |
'unknown'
|
Examples:
>>> from pathlib import Path
>>> from bead.items.cache import ModelOutputCache
>>> cache = ModelOutputCache(cache_dir=Path(".cache"))
>>> nli = HuggingFaceNLI("roberta-large-mnli", cache, device="cpu")
>>> scores = nli.compute_nli(
... premise="Mary loves reading books.",
... hypothesis="Mary enjoys literature."
... )
>>> label = nli.get_nli_label(
... premise="Mary loves reading books.",
... hypothesis="Mary enjoys literature."
... )
model: PreTrainedModel
property
¶
Get the model, loading if necessary.
tokenizer: PreTrainedTokenizerBase
property
¶
Get the tokenizer, loading if necessary.
compute_log_probability(text: str) -> float
¶
Compute log probability of text.
Not supported for NLI models.
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
Always raised, as NLI models don't provide log probabilities. |
compute_perplexity(text: str) -> float
¶
Compute perplexity of text.
Not supported for NLI models.
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
Always raised, as NLI models don't provide perplexity. |
get_embedding(text: str) -> np.ndarray
¶
Get embedding vector for text.
Uses the model's encoder to get embeddings. Note that NLI models are typically fine-tuned for classification, so embeddings may not be optimal for general similarity tasks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to embed. |
required |
Returns:
| Type | Description |
|---|---|
ndarray
|
Embedding vector for the text. |
compute_nli(premise: str, hypothesis: str) -> dict[str, float]
¶
Compute natural language inference scores.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
premise
|
str
|
Premise text. |
required |
hypothesis
|
str
|
Hypothesis text. |
required |
Returns:
| Type | Description |
|---|---|
dict[str, float]
|
Dictionary with keys "entailment", "neutral", "contradiction" mapping to probability scores that sum to ~1.0. |
openai
¶
OpenAI API adapter for item construction.
This module provides a ModelAdapter implementation for OpenAI's API, supporting GPT models for various NLP tasks including log probability computation, embeddings, and natural language inference via prompting.
OpenAIAdapter
¶
Bases: ModelAdapter
Adapter for OpenAI API models.
Provides access to OpenAI's GPT models for language model operations, embeddings, and prompted natural language inference.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
OpenAI model identifier (default: "gpt-3.5-turbo"). |
'gpt-3.5-turbo'
|
api_key
|
str | None
|
OpenAI API key. If None, uses OPENAI_API_KEY environment variable. |
None
|
cache
|
ModelOutputCache | None
|
Cache for model outputs. If None, creates in-memory cache. |
None
|
model_version
|
str
|
Model version for cache tracking (default: "latest"). |
'latest'
|
embedding_model
|
str
|
Model to use for embeddings (default: "text-embedding-ada-002"). |
'text-embedding-ada-002'
|
Attributes:
| Name | Type | Description |
|---|---|---|
model_name |
str
|
OpenAI model identifier (e.g., "gpt-3.5-turbo", "gpt-4"). |
client |
OpenAI
|
OpenAI API client. |
embedding_model |
str
|
Model to use for embeddings (default: "text-embedding-ada-002"). |
Raises:
| Type | Description |
|---|---|
ValueError
|
If no API key is provided and OPENAI_API_KEY is not set. |
compute_log_probability(text: str) -> float
¶
Compute log probability of text using OpenAI completions API.
Uses the completions API with logprobs to get token-level log probabilities and sums them to get the total log probability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to compute log probability for. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Log probability of the text (sum of token log probabilities). |
compute_perplexity(text: str) -> float
¶
Compute perplexity of text.
Perplexity is computed as exp(-log_prob / num_tokens).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to compute perplexity for. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Perplexity of the text (must be positive). |
get_embedding(text: str) -> np.ndarray
¶
Get embedding vector for text using OpenAI embeddings API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to embed. |
required |
Returns:
| Type | Description |
|---|---|
ndarray
|
Embedding vector for the text. |
compute_nli(premise: str, hypothesis: str) -> dict[str, float]
¶
Compute natural language inference scores via prompting.
Uses chat completions API with a prompt to classify the relationship between premise and hypothesis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
premise
|
str
|
Premise text. |
required |
hypothesis
|
str
|
Hypothesis text. |
required |
Returns:
| Type | Description |
|---|---|
dict[str, float]
|
Dictionary with keys "entailment", "neutral", "contradiction" mapping to probability scores. |
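The prompt-based NLI flow shared by the API adapters can be sketched as prompt construction plus answer parsing. The prompt wording, fallback behavior, and score values below are illustrative assumptions, not the adapter's actual implementation.

```python
def build_nli_prompt(premise, hypothesis):
    # Illustrative prompt; the adapter's actual wording may differ.
    return (
        "Classify the relationship between the premise and hypothesis as "
        "entailment, neutral, or contradiction. Answer with one word.\n"
        f"Premise: {premise}\nHypothesis: {hypothesis}"
    )


def parse_nli_answer(answer):
    # Map a one-word answer onto a peaked score distribution.
    labels = ("entailment", "neutral", "contradiction")
    answer = answer.strip().lower()
    if answer not in labels:
        answer = "neutral"  # conservative fallback for unparseable replies
    return {label: (0.9 if label == answer else 0.05) for label in labels}


prompt = build_nli_prompt("Mary loves books", "Mary enjoys reading")
scores = parse_nli_answer("Entailment")
print(scores["entailment"])  # 0.9
```

Unlike a dedicated NLI model, a prompted chat model returns a single label rather than a softmax distribution, so the scores are reconstructed from the label and only approximately sum to 1.0.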
anthropic
¶
Anthropic API adapter for item construction.
This module provides a ModelAdapter implementation for Anthropic's Claude API, supporting natural language inference via prompting. Note that Claude API does not provide direct access to log probabilities or embeddings.
AnthropicAdapter
¶
Bases: ModelAdapter
Adapter for Anthropic Claude API models.
Provides access to Claude models for prompted natural language inference. Note that Claude API does not support log probability computation or embeddings, so those methods will raise NotImplementedError.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
Claude model identifier (default: "claude-3-5-sonnet-20241022"). |
'claude-3-5-sonnet-20241022'
|
api_key
|
str | None
|
Anthropic API key. If None, uses ANTHROPIC_API_KEY environment variable. |
None
|
cache
|
ModelOutputCache | None
|
Cache for model outputs. If None, creates in-memory cache. |
None
|
model_version
|
str
|
Model version for cache tracking (default: "latest"). |
'latest'
|
Attributes:
| Name | Type | Description |
|---|---|---|
model_name |
str
|
Claude model identifier (e.g., "claude-3-5-sonnet-20241022"). |
client |
Anthropic
|
Anthropic API client. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If no API key is provided and ANTHROPIC_API_KEY is not set. |
compute_log_probability(text: str) -> float
¶
Compute log probability of text.
Not supported by Anthropic API.
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
Always raised - Claude API does not provide log probabilities. |
compute_perplexity(text: str) -> float
¶
Compute perplexity of text.
Not supported by Anthropic API (requires log probabilities).
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
Always raised - requires log probability support. |
get_embedding(text: str) -> np.ndarray
¶
Get embedding vector for text.
Not supported by Anthropic API.
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
Always raised - Claude API does not provide embeddings. |
compute_nli(premise: str, hypothesis: str) -> dict[str, float]
¶
Compute natural language inference scores via prompting.
Uses Claude's messages API with a prompt to classify the relationship between premise and hypothesis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
premise
|
str
|
Premise text. |
required |
hypothesis
|
str
|
Hypothesis text. |
required |
Returns:
| Type | Description |
|---|---|
dict[str, float]
|
Dictionary with keys "entailment", "neutral", "contradiction" mapping to probability scores. |
google
¶
Google Generative AI adapter for item construction.
This module provides a ModelAdapter implementation for Google's Generative AI models (Gemini), supporting natural language inference via prompting and embeddings. Note that Gemini API does not provide direct access to log probabilities.
GoogleAdapter
¶
Bases: ModelAdapter
Adapter for Google Generative AI models (Gemini).
Provides access to Gemini models for natural language inference and embeddings. Note that Gemini API does not support log probability computation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
Gemini model identifier (default: "gemini-pro"). |
'gemini-pro'
|
api_key
|
str | None
|
Google API key. If None, uses GOOGLE_API_KEY environment variable. |
None
|
cache
|
ModelOutputCache | None
|
Cache for model outputs. If None, creates in-memory cache. |
None
|
model_version
|
str
|
Model version for cache tracking (default: "latest"). |
'latest'
|
embedding_model
|
str
|
Model to use for embeddings (default: "models/embedding-001"). |
'models/embedding-001'
|
Attributes:
| Name | Type | Description |
|---|---|---|
model_name |
str
|
Gemini model identifier (e.g., "gemini-pro"). |
model |
GenerativeModel
|
Google Generative AI model instance. |
embedding_model |
str
|
Model to use for embeddings (default: "models/embedding-001"). |
Raises:
| Type | Description |
|---|---|
ValueError
|
If no API key is provided and GOOGLE_API_KEY is not set. |
compute_log_probability(text: str) -> float
¶
Compute log probability of text.
Not supported by Google Generative AI API.
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
Always raised - Gemini API does not provide log probabilities. |
compute_perplexity(text: str) -> float
¶
Compute perplexity of text.
Not supported by Google Generative AI API (requires log probabilities).
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
Always raised - requires log probability support. |
get_embedding(text: str) -> np.ndarray
¶
Get embedding vector for text using Google's embedding model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to embed. |
required |
Returns:
| Type | Description |
|---|---|
ndarray
|
Embedding vector for the text. |
compute_nli(premise: str, hypothesis: str) -> dict[str, float]
¶
Compute natural language inference scores via prompting.
Uses Gemini's generation API with a prompt to classify the relationship between premise and hypothesis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
premise
|
str
|
Premise text. |
required |
hypothesis
|
str
|
Hypothesis text. |
required |
Returns:
| Type | Description |
|---|---|
dict[str, float]
|
Dictionary with keys "entailment", "neutral", "contradiction" mapping to probability scores. |
togetherai
¶
Together AI adapter for item construction.
This module provides a ModelAdapter implementation for Together AI's API, which provides access to various open-source models. Together AI uses an OpenAI-compatible API, so we use the OpenAI client with a custom base URL.
TogetherAIAdapter
¶
Bases: ModelAdapter
Adapter for Together AI models.
Together AI provides access to various open-source models through an OpenAI-compatible API. This adapter uses the OpenAI client with a custom base URL.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
Together AI model identifier (default: "meta-llama/Llama-3-70b-chat-hf"). |
'meta-llama/Llama-3-70b-chat-hf'
|
api_key
|
str | None
|
Together AI API key. If None, uses TOGETHER_API_KEY environment variable. |
None
|
cache
|
ModelOutputCache | None
|
Cache for model outputs. If None, creates in-memory cache. |
None
|
model_version
|
str
|
Model version for cache tracking (default: "latest"). |
'latest'
|
Attributes:
| Name | Type | Description |
|---|---|---|
model_name |
str
|
Together AI model identifier (e.g., "meta-llama/Llama-3-70b-chat-hf"). |
client |
OpenAI
|
OpenAI-compatible client configured for Together AI. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If no API key is provided and TOGETHER_API_KEY is not set. |
compute_log_probability(text: str) -> float
¶
Compute log probability of text using Together AI API.
Uses the completions API with logprobs to get token-level log probabilities and sums them to get the total log probability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to compute log probability for. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Log probability of the text (sum of token log probabilities). |
compute_perplexity(text: str) -> float
¶
Compute perplexity of text.
Perplexity is computed as exp(-log_prob / num_tokens).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
text
|
str
|
Text to compute perplexity for. |
required |
Returns:
| Type | Description |
|---|---|
float
|
Perplexity of the text (must be positive). |
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
If log probability computation is not supported. |
get_embedding(text: str) -> np.ndarray
¶
Get embedding vector for text.
Not supported by Together AI (no embedding-specific models).
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
Always raised - Together AI does not provide embeddings. |
compute_nli(premise: str, hypothesis: str) -> dict[str, float]
¶
Compute natural language inference scores via prompting.
Uses chat completions API with a prompt to classify the relationship between premise and hypothesis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
premise
|
str
|
Premise text. |
required |
hypothesis
|
str
|
Hypothesis text. |
required |
Returns:
| Type | Description |
|---|---|
dict[str, float]
|
Dictionary with keys "entailment", "neutral", "contradiction" mapping to probability scores. |
sentence_transformers
¶
Sentence transformer adapter for semantic embeddings.
This module provides an adapter for sentence-transformers models, which are optimized for generating sentence embeddings for semantic similarity tasks.
HuggingFaceSentenceTransformer
¶
Bases: ModelAdapter
Adapter for sentence-transformers models.
Supports sentence-transformers models like "all-MiniLM-L6-v2", "all-mpnet-base-v2", etc. These models are optimized for generating sentence embeddings for semantic similarity tasks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
model_name
|
str
|
Sentence transformer model identifier. |
required |
cache
|
ModelOutputCache
|
Cache instance for storing model outputs. |
required |
device
|
str | None
|
Device to run model on. If None, uses sentence-transformers default. |
None
|
model_version
|
str
|
Version string for cache tracking. |
'unknown'
|
normalize_embeddings
|
bool
|
Whether to normalize embeddings to unit length. |
True
|
Examples:
>>> from pathlib import Path
>>> from bead.items.cache import ModelOutputCache
>>> cache = ModelOutputCache(cache_dir=Path(".cache"))
>>> model = HuggingFaceSentenceTransformer("all-MiniLM-L6-v2", cache)
>>> embedding = model.get_embedding("The cat sat on the mat.")
>>> similarity = model.compute_similarity("The cat sat.", "The dog stood.")
model: SentenceTransformer
property
¶
Get the model, loading if necessary.
compute_log_probability(text: str) -> float
¶
Compute log probability of text.
Not supported for sentence transformer models.
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
Always raised, as sentence transformers don't provide log probabilities. |
compute_perplexity(text: str) -> float
¶
Compute perplexity of text.
Not supported for sentence transformer models.
Raises:
| Type | Description |
|---|---|
| NotImplementedError | Always raised, as sentence transformers don't provide perplexity. |
get_embedding(text: str) -> np.ndarray
¶
Get embedding vector for text.
Uses sentence-transformers encode() method to generate optimized sentence embeddings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Text to embed. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | Embedding vector for the text. |
compute_nli(premise: str, hypothesis: str) -> dict[str, float]
¶
Compute natural language inference scores.
Not supported for sentence transformer models.
Raises:
| Type | Description |
|---|---|
| NotImplementedError | Always raised, as sentence transformers don't support NLI directly. |
compute_similarity(text1: str, text2: str) -> float
¶
Compute similarity between two texts.
Uses cosine similarity of the two texts' embeddings. For sentence transformers this is efficient because the embeddings are already unit-normalized (when normalize_embeddings=True), so cosine similarity reduces to a plain dot product.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text1 | str | First text. | required |
| text2 | str | Second text. | required |
Returns:
| Type | Description |
|---|---|
| float | Similarity score in [-1, 1] (cosine similarity). |
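The cosine computation itself is simple; a minimal stdlib sketch of it (illustrative, not the bead implementation):

```python
import math

# Sketch of the cosine-similarity computation described above
# (illustrative, not the bead implementation).
def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# For unit-normalized embeddings (normalize_embeddings=True), the
# denominator is 1, so this reduces to a plain dot product.
a = [1.0, 0.0]
b = [1.0 / math.sqrt(2), 1.0 / math.sqrt(2)]
print(round(cosine_similarity(a, b), 4))  # -> 0.7071
```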
registry
¶
Model adapter registry for centralized adapter management.
This module provides a registry for managing all model adapters, both local (HuggingFace) and API-based (OpenAI, Anthropic, etc.).
AdapterKwargs
¶
Bases: TypedDict
Keyword arguments for adapter initialization.
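Its exact fields are not reproduced here; a plausible shape, inferred from the kwargs documented on get_adapter (api_key, device, model_version, embedding_model), might look like the following. The field names and `total=False` are assumptions, not bead's actual definition:

```python
from typing import TypedDict

# Hypothetical sketch of an AdapterKwargs-style TypedDict, inferred
# from the kwargs documented on get_adapter. Not bead's definition.
class AdapterKwargsSketch(TypedDict, total=False):
    api_key: str
    device: str
    model_version: str
    embedding_model: str

# All keys optional (total=False), so partial dicts type-check.
kwargs: AdapterKwargsSketch = {"device": "cpu", "model_version": "1.0"}
print(sorted(kwargs))  # -> ['device', 'model_version']
```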
ModelAdapterRegistry
¶
Registry for all model adapters (local and API-based).
Provides centralized management of adapter types and instances, with automatic instance caching to avoid redundant initialization.
Attributes:
| Name | Type | Description |
|---|---|---|
| adapters | dict[str, type[ModelAdapter]] | Registered adapter classes keyed by adapter type name. |
| instances | dict[str, ModelAdapter] | Cached adapter instances keyed by unique identifier. |
register(name: str, adapter_class: type[ModelAdapter]) -> None
¶
Register an adapter class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Unique name for the adapter type (e.g., "openai", "huggingface_lm"). | required |
| adapter_class | type[ModelAdapter] | Adapter class to register (must inherit from ModelAdapter). | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If adapter class does not inherit from ModelAdapter. |
get_adapter(adapter_type: str, model_name: str, **kwargs: Unpack[AdapterKwargs]) -> ModelAdapter
¶
Get or create adapter instance (with caching).
Creates a new adapter instance if not cached, otherwise returns the cached instance. Instances are cached by adapter type and model name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| adapter_type | str | Type of adapter (must be registered). | required |
| model_name | str | Model identifier for the adapter. | required |
| **kwargs | Unpack[AdapterKwargs] | Additional keyword arguments to pass to adapter constructor (api_key, device, model_version, embedding_model, etc.). | {} |
Returns:
| Type | Description |
|---|---|
| ModelAdapter | Adapter instance (cached or newly created). |
Raises:
| Type | Description |
|---|---|
| ValueError | If adapter type is not registered. |
Examples:
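The original example is not preserved here. As a self-contained sketch of the type-and-name keyed instance caching that get_adapter performs (minimal hypothetical classes, not the bead implementation):

```python
# Minimal sketch of get_adapter-style instance caching
# (hypothetical classes, not the bead implementation).
class FakeAdapter:
    def __init__(self, model_name: str) -> None:
        self.model_name = model_name

class MiniRegistry:
    def __init__(self) -> None:
        self.adapters: dict[str, type] = {}
        self.instances: dict[str, object] = {}

    def register(self, name: str, adapter_class: type) -> None:
        self.adapters[name] = adapter_class

    def get_adapter(self, adapter_type: str, model_name: str):
        if adapter_type not in self.adapters:
            raise ValueError(f"unknown adapter type: {adapter_type}")
        key = f"{adapter_type}:{model_name}"  # cache by type and model name
        if key not in self.instances:
            self.instances[key] = self.adapters[adapter_type](model_name)
        return self.instances[key]

registry = MiniRegistry()
registry.register("fake", FakeAdapter)
a = registry.get_adapter("fake", "model-x")
b = registry.get_adapter("fake", "model-x")
print(a is b)  # -> True (cached instance reused)
```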
clear_cache() -> None
¶
Clear all cached adapter instances.
Useful for testing or when you want to force recreation of adapters with different parameters.
list_adapters() -> list[str]
¶
List all registered adapter types.
Returns:
| Type | Description |
|---|---|
list[str]
|
List of registered adapter type names. |
api_utils
¶
Utilities for API-based model adapters.
This module provides shared utilities for API-based model adapters, including retry logic with exponential backoff and rate limiting.
RateLimiter
¶
Rate limiter for API calls.
Tracks call timestamps and enforces a maximum rate of calls per minute. Uses a sliding window algorithm to ensure the rate limit is respected.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| calls_per_minute | int | Maximum number of calls allowed per minute (default: 60). | 60 |
Attributes:
| Name | Type | Description |
|---|---|---|
| calls_per_minute | int | Maximum number of calls allowed per minute. |
| call_times | list[float] | Timestamps of recent API calls. |
wait_if_needed() -> None
¶
Wait if rate limit would be exceeded.
Checks if making a call now would exceed the rate limit. If so, sleeps until enough time has passed.
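The sliding-window check described above can be sketched as follows (a minimal illustration, not the bead implementation):

```python
import time

# Sketch of a sliding-window rate limiter: keep timestamps of calls
# in the last 60 seconds and sleep when the window is full.
# Illustrative only, not the bead implementation.
class SlidingWindowLimiter:
    def __init__(self, calls_per_minute: int = 60) -> None:
        self.calls_per_minute = calls_per_minute
        self.call_times: list[float] = []

    def wait_if_needed(self) -> None:
        now = time.monotonic()
        # Drop timestamps older than the 60-second window.
        self.call_times = [t for t in self.call_times if now - t < 60.0]
        if len(self.call_times) >= self.calls_per_minute:
            # Sleep until the oldest call ages out of the window.
            time.sleep(60.0 - (now - self.call_times[0]))
        self.call_times.append(time.monotonic())

limiter = SlidingWindowLimiter(calls_per_minute=1000)
for _ in range(5):
    limiter.wait_if_needed()  # well under the limit, so no sleeping
print(len(limiter.call_times))  # -> 5
```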
retry_with_backoff(max_retries: int = 3, initial_delay: float = 1.0, backoff_factor: float = 2.0, exceptions: tuple[type[Exception], ...] = (Exception,)) -> Callable[[Callable[..., T]], Callable[..., T]]
¶
Decorate function with retry logic and exponential backoff.
Retries a function call on specified exceptions with exponential backoff between attempts. The delay between retries grows exponentially: delay = initial_delay * (backoff_factor ** attempt).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| max_retries | int | Maximum number of retry attempts (default: 3). | 3 |
| initial_delay | float | Initial delay in seconds before first retry (default: 1.0). | 1.0 |
| backoff_factor | float | Multiplicative factor for delay between retries (default: 2.0). | 2.0 |
| exceptions | tuple[type[Exception], ...] | Tuple of exception types to catch and retry on (default: (Exception,)). | (Exception,) |
Returns:
| Type | Description |
|---|---|
| Callable | Decorated function with retry logic. |
Examples:
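The original example is not preserved here; a self-contained sketch of the exponential-backoff pattern the decorator implements (illustrative, with delays shrunk for the demo):

```python
import functools
import time

# Sketch of retry-with-backoff: delay = initial_delay * (backoff_factor ** attempt).
# Illustrative only, not the bead implementation.
def retry_with_backoff(max_retries=3, initial_delay=1.0,
                       backoff_factor=2.0, exceptions=(Exception,)):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_retries:
                        raise  # retries exhausted; re-raise last error
                    time.sleep(initial_delay * (backoff_factor ** attempt))
        return wrapper
    return decorator

attempts = []

@retry_with_backoff(max_retries=3, initial_delay=0.01)  # tiny delay for demo
def flaky() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(flaky(), len(attempts))  # -> ok 3
```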
rate_limit(calls_per_minute: int = 60) -> Callable[[Callable[P, T]], Callable[P, T]]
¶
Decorate function with rate limiting for API calls.
Enforces a maximum rate of API calls per minute using a shared RateLimiter instance. Calls that would exceed the rate limit will block until the limit resets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| calls_per_minute | int | Maximum number of calls allowed per minute (default: 60). | 60 |
|
Returns:
| Type | Description |
|---|---|
| Callable | Decorated function with rate limiting. |
Examples:
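The original example is not preserved here; a self-contained sketch of how such a decorator could wrap a shared sliding-window limiter (illustrative, not the bead implementation):

```python
import functools
import time

# Sketch of a rate_limit decorator built on a sliding 60-second window,
# shared across all calls to the decorated function.
# Illustrative only, not the bead implementation.
def rate_limit(calls_per_minute: int = 60):
    call_times: list[float] = []  # shared state for the decorated function

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            recent = [t for t in call_times if now - t < 60.0]
            if len(recent) >= calls_per_minute:
                # Block until the oldest call ages out of the window.
                time.sleep(60.0 - (now - recent[0]))
            call_times[:] = recent + [time.monotonic()]
            return func(*args, **kwargs)
        return wrapper
    return decorator

@rate_limit(calls_per_minute=1000)  # high limit so the demo never blocks
def call_api(payload: str) -> str:
    return f"sent {payload}"

print(call_api("ping"))  # -> sent ping
```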