bead.templates¶
Stage 2 of the bead pipeline: template filling strategies and constraint resolution.
Template Filling¶
filler
¶
Template filling with backtracking search and constraint propagation.
This module implements a CSP (Constraint Satisfaction Problem) solver for template filling. It uses backtracking search with forward checking to efficiently find valid slot fillings that satisfy all constraints.
TemplateFiller
¶
Bases: ABC
Abstract base class for template filling.
Subclasses implement different approaches to filling template slots with lexical items from a lexicon, including constraint satisfaction (CSP) solving and enumeration-based strategies.
Examples:
>>> from bead.templates.filler import CSPFiller
>>> filler = CSPFiller(lexicon)
>>> filled = list(filler.fill(template))
fill(template: Template, language_code: LanguageCode | None = None) -> Iterable[FilledTemplate]
abstractmethod
¶
Fill template slots with lexical items.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| template | Template | Template to fill. | required |
| language_code | LanguageCode \| None | Optional language code to filter items. | None |

Returns:

| Type | Description |
|---|---|
| Iterable[FilledTemplate] | Filled template instances (iterator or list). |

Raises:

| Type | Description |
|---|---|
| ValueError | If template cannot be filled. |
ConstraintUnsatisfiableError
¶
Bases: Exception
Raised when template constraints cannot be satisfied.
This error indicates that the backtracking search exhausted all possibilities without finding a valid assignment.
Attributes:

| Name | Type | Description |
|---|---|---|
| template_name | str | Name of the template that could not be filled. |
| slot_name | str \| None | Name of the slot where filling failed (if known). |
| attempted_combinations | int | Number of partial assignments tried before failure. |
| message | str | Diagnostic message explaining the failure. |
Examples:
>>> raise ConstraintUnsatisfiableError(
... template_name="transitive",
... slot_name="verb",
... attempted_combinations=1523,
... message="No VERB items satisfy agreement constraints"
... )
FilledTemplate
¶
Bases: BeadBaseModel
A template populated with lexical items.
Represents a single instance of a template with specific items filling each slot.
Attributes:

| Name | Type | Description |
|---|---|---|
| template_id | str | ID of the source template. |
| template_name | str | Name of the source template. |
| slot_fillers | dict[str, LexicalItem] | Mapping of slot names to items that fill them. |
| rendered_text | str | Template string with slots replaced by item lemmas. |
| strategy_name | str | Name of strategy used to generate this filled template. |
| template_slots | dict[str, bool] | Mapping of all template slot names to whether they are required. Used to determine unfilled slots. |
Examples:
>>> filled = FilledTemplate(
... template_id="t1",
... template_name="transitive",
... slot_fillers={"subject": noun_item, "verb": verb_item},
... rendered_text="cat broke the object",
... strategy_name="exhaustive",
... template_slots={"subject": True, "verb": True, "object": True}
... )
unfilled_slots: set[str]
property
¶
Get names of slots that were not filled.
Returns:

| Type | Description |
|---|---|
| set[str] | Set of slot names present in template but not in slot_fillers. |
unfilled_required_slots: set[str]
property
¶
Get names of required slots that were not filled.
Returns:

| Type | Description |
|---|---|
| set[str] | Set of required slot names that are unfilled. |
is_complete: bool
property
¶
Check if all required slots are filled.
Returns:

| Type | Description |
|---|---|
| bool | True if all required slots have fillers. |
CSPFiller
¶
Bases: TemplateFiller
Fill templates using backtracking search with forward checking.
Implements a CSP (Constraint Satisfaction Problem) solver with these guarantees:

1. Completeness: will find a solution if one exists
2. Correctness: all returned assignments satisfy all constraints
3. Termination: will halt, either with a solution or an error

The algorithm uses:

- Backtracking search to explore the assignment space
- Forward checking to prune the search space early
- A most-constrained-first slot ordering heuristic
- Constraint propagation for multi-slot constraints
Use this filler when templates have multi-slot constraints (Template.constraints) that require agreement or relational checking. For simple templates with only single-slot constraints, StrategyFiller is 10-100x faster.
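The core loop can be sketched as follows. This is an illustrative, self-contained version of backtracking with forward checking and most-constrained-first ordering over plain dict domains; the `agree` predicate and item dicts are hypothetical stand-ins, not bead's internal API:

```python
def backtrack(domains, constraints, assignment=None):
    """Yield complete assignments satisfying all pairwise constraints.

    domains: dict slot -> list of candidate values
    constraints: list of (slot_a, slot_b, predicate) triples
    """
    if assignment is None:
        assignment = {}
    if len(assignment) == len(domains):
        yield dict(assignment)
        return
    # Most-constrained-first: expand the unassigned slot with the fewest candidates.
    slot = min((s for s in domains if s not in assignment),
               key=lambda s: len(domains[s]))
    for value in domains[slot]:
        assignment[slot] = value
        pruned, ok = {}, True
        for a, b, pred in constraints:
            if a == slot and b not in assignment:
                # Forward checking: drop values inconsistent with this choice.
                kept = [v for v in domains[b] if pred(value, v)]
                if not kept:
                    ok = False
                    break
                if b not in pruned:
                    pruned[b] = domains[b]
                domains[b] = kept
            elif a in assignment and b in assignment and not pred(assignment[a], assignment[b]):
                ok = False
                break
        if ok:
            yield from backtrack(domains, constraints, assignment)
        for s, dom in pruned.items():  # undo forward-checking prunes
            domains[s] = dom
        del assignment[slot]


# Hypothetical two-slot agreement problem (dicts stand in for LexicalItem).
agree = lambda s, v: s["num"] == v["num"]
domains = {
    "subject": [{"lemma": "dog", "num": "sg"}, {"lemma": "dogs", "num": "pl"}],
    "verb": [{"lemma": "runs", "num": "sg"}, {"lemma": "run", "num": "pl"}],
}
solutions = list(backtrack(domains, [("subject", "verb", agree)]))
```

Only the two number-agreeing pairings survive; the mismatched combinations are pruned before the verb slot is ever expanded.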
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| lexicon | Lexicon | Lexicon containing candidate items. | required |
| max_attempts | int | Maximum number of partial assignments to try. | 10000 |
| renderer | TemplateRenderer \| None | Template renderer to use for generating rendered_text. If None, uses DefaultRenderer(), which does simple slot substitution. | None |
Examples:
>>> from bead.resources.lexicon import Lexicon
>>> from bead.templates.filler import CSPFiller
>>> lexicon = Lexicon(items=[...])
>>> filler = CSPFiller(lexicon)
>>> try:
... filled = next(filler.fill(template))
... except ConstraintUnsatisfiableError as e:
... print(f"Could not fill: {e}")
fill(template: Template, language_code: LanguageCode | None = None, count: int = 1) -> Iterator[FilledTemplate]
¶
Fill template with lexical items using backtracking search.
Yields filled templates one at a time as they are found.
Stops after yielding count templates or exhausting possibilities.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| template | Template | Template to fill. | required |
| language_code | LanguageCode \| None | Optional language code to filter lexicon items. | None |
| count | int | Maximum number of filled templates to generate. | 1 |

Yields:

| Type | Description |
|---|---|
| FilledTemplate | Filled template instance satisfying all constraints. |

Raises:

| Type | Description |
|---|---|
| ConstraintUnsatisfiableError | If no valid assignment exists after exhaustive search. |
| ValueError | If template has no slots or invalid structure. |
Examples:
strategies
¶
Filling strategies for template population.
FillingStrategy
¶
Bases: ABC
Abstract base class for template filling strategies.
A filling strategy determines how to combine lexical items to fill template slots. Strategies differ in:

- Selection criteria (all vs. sample)
- Ordering (deterministic vs. random)
- Grouping (balanced vs. unbalanced)
Examples:
>>> strategy = ExhaustiveStrategy()
>>> combinations = strategy.generate_combinations(slot_items)
>>> len(list(combinations))
12
name: str
abstractmethod
property
¶
Get strategy name for metadata.
Returns:

| Type | Description |
|---|---|
| str | Strategy name. |
generate_combinations(slot_items: dict[str, list[LexicalItem]]) -> list[dict[str, LexicalItem]]
abstractmethod
¶
Generate combinations of items for template slots.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| slot_items | dict[str, list[LexicalItem]] | Mapping of slot names to lists of valid items. | required |

Returns:

| Type | Description |
|---|---|
| list[dict[str, LexicalItem]] | List of slot-to-item mappings representing filled templates. |
Examples:
ExhaustiveStrategy
¶
Bases: FillingStrategy
Generate all possible combinations of slot fillers.
This strategy produces the complete Cartesian product of all valid items for each slot. Use for small combinatorial spaces.
Warning: Combinatorial explosion! With N slots and M items per slot, generates M^N combinations.
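Before committing to this strategy, it can help to estimate the size of the product space. A minimal sketch (the slot sizes and the 10,000 budget are hypothetical stand-ins for `len(valid_items)` per slot and a project-specific threshold):

```python
import math

# Estimate the Cartesian product size before choosing a strategy.
slot_sizes = {"subject": 120, "verb": 85, "object": 120}
total = math.prod(slot_sizes.values())  # 120 * 85 * 120 = 1,224,000
use_exhaustive = total <= 10_000  # above this budget, prefer sampling
```

With three slots of roughly a hundred items each, the space already exceeds a million combinations, so a sampling strategy is the safer default.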
Examples:
>>> strategy = ExhaustiveStrategy()
>>> slot_items = {"a": [1, 2], "b": [3, 4]}
>>> combinations = strategy.generate_combinations(slot_items)
>>> len(combinations)
4
>>> combinations[0]
{"a": 1, "b": 3}
name: str
property
¶
Get strategy name.
generate_combinations(slot_items: dict[str, list[LexicalItem]]) -> list[dict[str, LexicalItem]]
¶
Generate all combinations.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| slot_items | dict[str, list[LexicalItem]] | Mapping of slot names to valid items. | required |

Returns:

| Type | Description |
|---|---|
| list[dict[str, LexicalItem]] | All possible slot-to-item combinations. |
RandomStrategy
¶
Bases: FillingStrategy
Generate random sample of combinations.
Sample combinations randomly with optional seeding for reproducibility. Use for large combinatorial spaces.
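The sampling idea can be sketched with the stdlib (plain strings stand in for LexicalItem objects; this is an illustration, not the class's implementation):

```python
import itertools
import random

# Build the full combination space, then draw a seeded sample from it.
slot_items = {"det": ["the", "a"], "noun": ["cat", "dog", "fox"]}
space = [dict(zip(slot_items, combo))
         for combo in itertools.product(*slot_items.values())]

rng = random.Random(42)           # seeding makes the draw reproducible
sample = rng.sample(space, k=4)   # 4 distinct combinations
```

Re-running with the same seed yields the same sample, which is what makes seeded strategies reproducible across experiment runs.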
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| n_samples | int | Number of combinations to generate. | required |
| seed | int \| None | Random seed for reproducibility. | None |
Examples:
>>> strategy = RandomStrategy(n_samples=10, seed=42)
>>> combinations = strategy.generate_combinations(slot_items)
>>> len(combinations)
10
name: str
property
¶
Get strategy name.
__init__(n_samples: int, seed: int | None = None) -> None
¶
Initialize random strategy.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| n_samples | int | Number of combinations to generate. | required |
| seed | int \| None | Random seed for reproducibility. | None |
generate_combinations(slot_items: dict[str, list[LexicalItem]]) -> list[dict[str, LexicalItem]]
¶
Generate random combinations.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| slot_items | dict[str, list[LexicalItem]] | Mapping of slot names to valid items. | required |

Returns:

| Type | Description |
|---|---|
| list[dict[str, LexicalItem]] | Randomly sampled combinations. |
StratifiedStrategy
¶
Bases: FillingStrategy
Generate balanced sample across item groups.
Ensure each group of items (e.g., by POS, features) is represented proportionally in the sample.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| n_samples | int | Total number of combinations to generate. | required |
| grouping_property | str | Property to group items by (e.g., "pos", "features.transitivity"). | required |
| seed | int \| None | Random seed for reproducibility. | None |
Examples:
>>> strategy = StratifiedStrategy(
... n_samples=20,
... grouping_property="pos",
... seed=42
... )
>>> combinations = strategy.generate_combinations(slot_items)
>>> # Ensures balanced representation of different POS values
name: str
property
¶
Get strategy name.
__init__(n_samples: int, grouping_property: str, seed: int | None = None) -> None
¶
Initialize stratified strategy.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| n_samples | int | Total number of combinations to generate. | required |
| grouping_property | str | Property to group items by. | required |
| seed | int \| None | Random seed for reproducibility. | None |
|
generate_combinations(slot_items: dict[str, list[LexicalItem]]) -> list[dict[str, LexicalItem]]
¶
Generate stratified combinations.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| slot_items | dict[str, list[LexicalItem]] | Mapping of slot names to valid items. | required |

Returns:

| Type | Description |
|---|---|
| list[dict[str, LexicalItem]] | Balanced combinations across groups. |
MLMFillingStrategy
¶
Bases: FillingStrategy
Fill templates using masked language models with beam search.
Uses pre-trained MLMs (BERT, RoBERTa, etc.) to propose linguistically plausible slot fillers. Supports beam search for multiple slots with configurable fill directions.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| resolver | ConstraintResolver | Constraint resolver for filtering candidates | required |
| model_adapter | HuggingFaceMLMAdapter | Loaded MLM adapter | required |
| beam_size | int | Beam search width (K best hypotheses) | 5 |
| fill_direction | Literal | Direction for filling slots. One of: "left_to_right", "right_to_left", "inside_out", "outside_in", "custom" | 'left_to_right' |
| custom_order | list[int] \| None | Custom slot fill order (slot indices) | None |
| top_k | int | Top-K candidates per slot from MLM | 20 |
| cache | ModelOutputCache \| None | Cache for model predictions | None |
| budget | int \| None | Maximum combinations to generate | None |
Examples:
>>> from bead.templates.adapters import HuggingFaceMLMAdapter, ModelOutputCache
>>> adapter = HuggingFaceMLMAdapter("bert-base-uncased")
>>> adapter.load_model()
>>> cache = ModelOutputCache(Path("/tmp/cache"))
>>> strategy = MLMFillingStrategy(
... resolver=resolver,
... model_adapter=adapter,
... beam_size=5,
... fill_direction="left_to_right",
... cache=cache
... )
>>> combinations = strategy.generate_combinations(slot_items)
name: str
property
¶
Get strategy name.
__init__(resolver: ConstraintResolver, model_adapter: HuggingFaceMLMAdapter, beam_size: int = 5, fill_direction: Literal['left_to_right', 'right_to_left', 'inside_out', 'outside_in', 'custom'] = 'left_to_right', custom_order: list[int] | None = None, top_k: int = 20, cache: ModelOutputCache | None = None, budget: int | None = None, per_slot_max_fills: dict[str, int] | None = None, per_slot_enforce_unique: dict[str, bool] | None = None) -> None
¶
Initialize MLM strategy.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| resolver | ConstraintResolver | Constraint resolver | required |
| model_adapter | HuggingFaceMLMAdapter | MLM adapter (must be loaded) | required |
| beam_size | int | Beam width | 5 |
| fill_direction | str | Fill direction | 'left_to_right' |
| custom_order | list[int] \| None | Custom fill order | None |
| top_k | int | Top-K from MLM | 20 |
| cache | ModelOutputCache \| None | Prediction cache | None |
| budget | int \| None | Max combinations | None |
| per_slot_max_fills | dict[str, int] \| None | Maximum number of unique fills per slot (after constraint filtering) | None |
| per_slot_enforce_unique | dict[str, bool] \| None | Whether to enforce uniqueness for each slot across beam hypotheses | None |
generate_combinations(slot_items: dict[str, list[LexicalItem]]) -> list[dict[str, LexicalItem]]
¶
Generate combinations using MLM beam search.
Note: This method signature is required by FillingStrategy, but MLM beam search needs template context. The actual beam search is implemented in generate_from_template; use that method instead.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| slot_items | dict[str, list[LexicalItem]] | Mapping of slot names to valid items (for constraint filtering) | required |

Returns:

| Type | Description |
|---|---|
| list[dict[str, LexicalItem]] | Combinations generated via beam search |

Raises:

| Type | Description |
|---|---|
| NotImplementedError | This method requires template context. Use generate_from_template instead. |
generate_from_template(template: Template, lexicons: list[Lexicon], language_code: LanguageCode | None = None) -> Iterator[dict[str, LexicalItem]]
¶
Generate combinations from template using beam search.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| template | Template | Template to fill | required |
| lexicons | list[Lexicon] | Lexicons for constraint resolution | required |
| language_code | LanguageCode \| None | Language filter | None |

Yields:

| Type | Description |
|---|---|
| dict[str, LexicalItem] | Slot-to-item mappings |
StrategyFiller
¶
Bases: TemplateFiller
Strategy-based template filling for simple templates.
Fast filling using enumeration strategies (exhaustive, random, stratified). Does NOT handle template-level multi-slot constraints (Template.constraints).
For templates with multi-slot constraints requiring agreement or relational checks, use CSPFiller instead.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| lexicon | Lexicon | Lexicon containing candidate items. | required |
| strategy | FillingStrategy | Strategy for generating combinations. | required |
Examples:
>>> from bead.templates.strategies import StrategyFiller, ExhaustiveStrategy
>>> filler = StrategyFiller(lexicon, ExhaustiveStrategy())
>>> filled = filler.fill(template)
>>> len(filled)
12
fill(template: Template, language_code: LanguageCode | None = None) -> list[FilledTemplate]
¶
Fill template with lexical items using strategy.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| template | Template | Template to fill. | required |
| language_code | LanguageCode \| None | Optional language code to filter items. | None |

Returns:

| Type | Description |
|---|---|
| list[FilledTemplate] | List of all filled template instances. |

Raises:

| Type | Description |
|---|---|
| ValueError | If any slot has no valid items. |
MixedFillingStrategy
¶
Bases: FillingStrategy
Fill different template slots using different strategies.
Allows per-slot strategy specification, enabling workflows like:

- Fill nouns/verbs exhaustively
- Fill adjectives via MLM based on noun context

This strategy operates in two steps:

1. First pass: fill slots assigned to non-MLM strategies (exhaustive, random, etc.)
2. Second pass: for each first-pass combination, fill remaining slots via MLM
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| slot_strategies | dict[str, tuple[FillingStrategy, dict]] | Mapping of slot names to (strategy, config) tuples. Config is strategy-specific kwargs. | required |
| default_strategy | FillingStrategy \| None | Default strategy for slots not explicitly specified. | None |
Examples:
>>> exhaustive = ExhaustiveStrategy()
>>> mlm_config = {
... "resolver": resolver,
... "model_adapter": mlm_adapter,
... "top_k": 5
... }
>>> strategy = MixedFillingStrategy(
... slot_strategies={
... "noun": (exhaustive, {}),
... "verb": (exhaustive, {}),
... "adjective": ("mlm", mlm_config)
... }
... )
name: str
property
¶
Get strategy name.
__init__(slot_strategies: dict[str, tuple[str | FillingStrategy, StrategyConfig]], default_strategy: FillingStrategy | None = None) -> None
¶
Initialize mixed strategy.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| slot_strategies | dict[str, tuple[str \| FillingStrategy, StrategyConfig]] | Mapping slot names to (strategy_name, config) or (strategy_instance, config). strategy_name can be: "exhaustive", "random", "stratified", "mlm" | required |
| default_strategy | FillingStrategy \| None | Default strategy for unspecified slots. | None |
generate_combinations(slot_items: dict[str, list[LexicalItem]]) -> list[dict[str, LexicalItem]]
¶
Generate combinations using mixed strategies.
Note: This method signature is required by FillingStrategy, but MixedFillingStrategy with MLM requires template context. Use generate_from_template instead.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| slot_items | dict[str, list[LexicalItem]] | Mapping of slot names to valid items | required |

Returns:

| Type | Description |
|---|---|
| list[dict[str, LexicalItem]] | Generated combinations |

Raises:

| Type | Description |
|---|---|
| NotImplementedError | If any slot uses MLM strategy (requires template context) |
generate_from_template(template: Template, lexicons: list[Lexicon], language_code: LanguageCode | None = None) -> Iterator[dict[str, LexicalItem]]
¶
Generate combinations from template using mixed strategies.
1. First pass: fill non-MLM slots using their assigned strategies
2. Second pass: for each first-pass combination, fill MLM slots using beam search
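The two-pass control flow can be sketched without any model (the `propose_adjective` function is a hypothetical stand-in for the MLM beam search, and plain strings stand in for LexicalItem objects):

```python
import itertools

# First pass: enumerate the slots handled by non-MLM strategies.
exhaustive_slots = {"noun": ["cat", "dog"], "verb": ["ran", "sat"]}

def propose_adjective(context):
    # Stand-in for MLM beam search: propose fillers given the partial fill.
    return ["quick"] if context["noun"] == "dog" else ["lazy"]

# Second pass: extend each first-pass combination with the remaining slot.
results = []
for combo in itertools.product(*exhaustive_slots.values()):
    partial = dict(zip(exhaustive_slots, combo))
    for adj in propose_adjective(partial):
        results.append({**partial, "adjective": adj})
```

Each first-pass combination becomes the context for the second pass, which is what lets the MLM condition its proposals on the already-chosen fillers.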
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| template | Template | Template to fill | required |
| lexicons | list[Lexicon] | Lexicons for constraint resolution | required |
| language_code | LanguageCode \| None | Language filter | None |

Yields:

| Type | Description |
|---|---|
| dict[str, LexicalItem] | Complete slot-to-item mappings |
Constraint Resolution¶
resolver
¶
Constraint resolution for template slot filling.
This module provides the ConstraintResolver class for evaluating constraints against lexical items to determine which items satisfy template slot requirements. The resolver is a thin wrapper around DSLEvaluator.
ConstraintResolver
¶
Simplified resolver that wraps DSLEvaluator.
The ConstraintResolver evaluates DSL constraint expressions against lexical items and filled slots. It provides two main methods:

- evaluate_slot_constraints: for single-slot constraints
- evaluate_template_constraints: for multi-slot constraints
All constraint logic is delegated to the DSL evaluator.
Examples:
>>> from bead.resources.models import LexicalItem
>>> from bead.resources.constraints import Constraint
>>> resolver = ConstraintResolver()
>>> item = LexicalItem(lemma="walk", pos="VERB")
>>> constraints = [
... Constraint(expression="self.pos == 'VERB'")
... ]
>>> resolver.evaluate_slot_constraints(item, constraints)
True
__init__() -> None
¶
Initialize resolver with DSL evaluator.
evaluate_slot_constraints(item: LexicalItem, constraints: list[Constraint], context: dict[str, ContextValue] | None = None) -> bool
¶
Evaluate single-slot constraints.
Single-slot constraints are evaluated with the lexical item being checked bound as 'self' in the DSL expression context.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| item | LexicalItem | Lexical item to evaluate constraints against. | required |
| constraints | list[Constraint] | Single-slot constraints from Slot.constraints. | required |
| context | dict[str, ContextValue] \| None | Additional context variables (optional). | None |

Returns:

| Type | Description |
|---|---|
| bool | True if all constraints are satisfied, False otherwise. |
Examples:
>>> from bead.resources.models import LexicalItem
>>> from bead.resources.constraints import Constraint
>>> resolver = ConstraintResolver()
>>> item = LexicalItem(lemma="walk", pos="VERB")
>>> constraints = [
... Constraint(
... expression="self.lemma in motion_verbs",
... context={"motion_verbs": {"walk", "run", "jump"}}
... )
... ]
>>> resolver.evaluate_slot_constraints(item, constraints)
True
evaluate_template_constraints(filled_slots: dict[str, LexicalItem], constraints: list[Constraint], context: dict[str, ContextValue] | None = None) -> bool
¶
Evaluate multi-slot constraints.
Multi-slot constraints are evaluated with slot names as variables. For example, a constraint like "subject.features.number == verb.features.number" would have access to both the 'subject' and 'verb' slot fillers.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| filled_slots | dict[str, LexicalItem] | Dictionary mapping slot names to their filled items. | required |
| constraints | list[Constraint] | Multi-slot constraints from Template.constraints. | required |
| context | dict[str, ContextValue] \| None | Additional context variables (optional). | None |

Returns:

| Type | Description |
|---|---|
| bool | True if all constraints are satisfied, False otherwise. |
Examples:
>>> from bead.resources.models import LexicalItem
>>> from bead.resources.constraints import Constraint
>>> resolver = ConstraintResolver()
>>> subject = LexicalItem(
... lemma="dog",
... pos="NOUN",
... features={"number": "singular"}
... )
>>> verb = LexicalItem(
... lemma="runs",
... pos="VERB",
... features={"number": "singular"}
... )
>>> filled = {"subject": subject, "verb": verb}
>>> constraints = [
... Constraint(
... expression="subject.features.number == verb.features.number"
... )
... ]
>>> resolver.evaluate_template_constraints(filled, constraints)
True
clear_cache() -> None
¶
Clear cached constraint evaluation results.
Utilities¶
combinatorics
¶
Combinatorial utilities for template filling.
cartesian_product(*iterables: list[T]) -> Iterator[tuple[T, ...]]
¶
Generate Cartesian product of iterables.
Equivalent to itertools.product but with explicit type hints and documentation for template filling use case.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| *iterables | list[T] | Variable number of iterables to combine. | () |

Yields:

| Type | Description |
|---|---|
| tuple[T, ...] | Each combination from the Cartesian product. |
Examples:
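Since the function is described as equivalent to itertools.product, its behavior can be illustrated with the stdlib directly:

```python
from itertools import product

# Two slots with two candidates each yield 2 * 2 = 4 combinations.
slots = [["the", "a"], ["cat", "dog"]]
combos = list(product(*slots))
# [('the', 'cat'), ('the', 'dog'), ('a', 'cat'), ('a', 'dog')]
```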
count_combinations(*iterables: list[T]) -> int
¶
Count total combinations without generating them.
Calculate the size of the Cartesian product space efficiently without actually generating combinations.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| *iterables | list[Any] | Variable number of iterables. | () |

Returns:

| Type | Description |
|---|---|
| int | Total number of combinations. |
Examples:
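The count is just the product of the input lengths, which the stdlib computes without materializing any combinations:

```python
import math

# Size of the Cartesian product space, computed from the lengths alone.
slots = [["the", "a"], ["cat", "dog", "fox"], ["ran", "sat"]]
n = math.prod(len(s) for s in slots)  # 2 * 3 * 2
```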
stratified_sample(groups: dict[str, list[T]], n_per_group: int, seed: int | None = None) -> list[T]
¶
Sample items from groups with balanced representation.
Ensure each group is represented approximately equally in the sample.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| groups | dict[str, list[T]] | Dictionary mapping group names to lists of items. | required |
| n_per_group | int | Number of items to sample from each group. | required |
| seed | int \| None | Random seed for reproducibility. | None |

Returns:

| Type | Description |
|---|---|
| list[T] | Sampled items, balanced across groups. |
Examples:
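A minimal sketch of the balanced-sampling idea with the stdlib (illustrative only; the groups here are hypothetical):

```python
import random

# Draw the same number of items from each group, seeded for reproducibility.
groups = {
    "NOUN": ["cat", "dog", "fox", "owl"],
    "VERB": ["run", "jump", "walk"],
}
rng = random.Random(7)
n_per_group = 2
sample = [item for name in sorted(groups)
          for item in rng.sample(groups[name], n_per_group)]
```

Each group contributes exactly `n_per_group` items, so no group dominates the sample regardless of its size.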
streaming
¶
Streaming template filling for large combinatorial spaces.
StreamingFiller
¶
Fill templates with lazy evaluation.
Generate filled templates one at a time without storing all combinations in memory. Use for very large combinatorial spaces where ExhaustiveStrategy would cause OOM.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| lexicon | Lexicon | Lexicon containing candidate items. | required |
| adapter_registry | AdapterRegistry \| None | Adapter registry for resource-based constraints. | None |
| max_combinations | int \| None | Maximum combinations to generate; None means unlimited. | None |
Examples:
>>> filler = StreamingFiller(lexicon, max_combinations=1000)
>>> for filled in filler.stream(template):
... process(filled) # process one at a time
... if some_condition:
... break # can stop early
stream(template: Template, language_code: LanguageCode | None = None) -> Iterator[FilledTemplate]
¶
Stream filled templates lazily.
Generate filled templates one at a time using lazy evaluation. Memory-efficient for large combinatorial spaces.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| template | Template | Template to fill. | required |
| language_code | LanguageCode \| None | Optional language filter. | None |

Yields:

| Type | Description |
|---|---|
| FilledTemplate | Filled template instances. |

Raises:

| Type | Description |
|---|---|
| ValueError | If any slot has no valid items. |
Examples:
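The lazy-evaluation contract can be illustrated with a plain generator (dicts stand in for FilledTemplate instances; this is a sketch of the streaming idea, not the class itself):

```python
import itertools

slot_items = {"det": ["the", "a"], "noun": ["cat", "dog"], "verb": ["ran", "sat"]}

def stream():
    # Yield one filled combination at a time; nothing is stored up front.
    for combo in itertools.product(*slot_items.values()):
        yield dict(zip(slot_items, combo))

# Only the first 3 of the 8 possible combinations are ever constructed.
first_three = list(itertools.islice(stream(), 3))
```

Because the generator is consumed lazily, stopping early (here via islice, or via break in a for loop) leaves the remaining combinations unbuilt, which is what keeps memory flat for huge spaces.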
Template Adapters¶
base
¶
Base adapter for template filling models.
This module defines the abstract interface for models used in template filling. These adapters are SEPARATE from judgment prediction model adapters (Stage 6).
TemplateFillingModelAdapter
¶
Bases: ABC
Base adapter for models used in template filling.
This is SEPARATE from judgment prediction model adapters, which are used later in the pipeline for predicting human judgments.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | Model identifier (e.g., "bert-base-uncased") | required |
| device | str | Computation device ("cpu", "cuda", "mps") | 'cpu' |
| cache_dir | Path \| None | Directory for caching model files | None |
Examples:
>>> from bead.templates.adapters import TemplateFillingModelAdapter
>>> # Implemented by HuggingFaceMLMAdapter
>>> adapter = HuggingFaceMLMAdapter("bert-base-uncased", device="cpu")
>>> adapter.load_model()
>>> predictions = adapter.predict_masked_token(
... text="The cat [MASK] on the mat",
... mask_position=2,
... top_k=5
... )
>>> adapter.unload_model()
load_model() -> None
abstractmethod
¶
Load model into memory.
Raises:

| Type | Description |
|---|---|
| RuntimeError | If model loading fails |
unload_model() -> None
abstractmethod
¶
Unload model from memory to free resources.
predict_masked_token(text: str, mask_position: int, top_k: int = 10) -> list[tuple[str, float]]
abstractmethod
¶
Predict masked token at specified position.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Text with mask token (e.g., "The cat [MASK] quickly") | required |
| mask_position | int | Token position of mask (0-indexed) | required |
| top_k | int | Number of top predictions to return | 10 |

Returns:

| Type | Description |
|---|---|
| list[tuple[str, float]] | List of (token, log_probability) tuples, sorted by probability |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If model is not loaded |
| ValueError | If mask_position is invalid |
Examples:
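A sketch of consuming the documented `list[tuple[str, float]]` return shape (the prediction scores below are hypothetical; no model is loaded here):

```python
import math

# Hypothetical output of predict_masked_token: (token, log_probability) pairs.
predictions = [("sat", -1.2), ("slept", -2.3), ("jumped", -3.0)]

# Renormalize the top-k log-probabilities into a distribution over candidates.
weights = [math.exp(lp) for _, lp in predictions]
total = sum(weights)
dist = {tok: w / total for (tok, _), w in zip(predictions, weights)}
best = max(dist, key=dist.get)
```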
is_loaded() -> bool
¶
Check if model is loaded.
Returns:

| Type | Description |
|---|---|
| bool | True if model is loaded in memory |
__enter__() -> TemplateFillingModelAdapter
¶
Context manager entry.
__exit__(*args: object) -> None
¶
Context manager exit.
huggingface
¶
HuggingFace masked language model adapter for template filling.
HuggingFaceMLMAdapter
¶
Bases: HuggingFaceAdapterMixin, TemplateFillingModelAdapter
Adapter for HuggingFace masked language models.
Supports BERT, RoBERTa, ALBERT, and other MLM architectures.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | HuggingFace model identifier (e.g., "bert-base-uncased") | required |
| device | DeviceType | Computation device ("cpu", "cuda", "mps") | 'cpu' |
| cache_dir | Path \| None | Directory for caching model files | None |
Examples:
>>> adapter = HuggingFaceMLMAdapter("bert-base-uncased", device="cpu")
>>> adapter.load_model()
>>> predictions = adapter.predict_masked_token(
... text="The cat sat on the mat",
... mask_position=2,
... top_k=5
... )
>>> for token, log_prob in predictions:
... print(f"{token}: {log_prob:.2f}")
>>> adapter.unload_model()
load_model() -> None
¶
Load model and tokenizer from HuggingFace.
Raises:

| Type | Description |
|---|---|
| RuntimeError | If model loading fails |
unload_model() -> None
¶
Unload model from memory.
predict_masked_token(text: str, mask_position: int, top_k: int = 10) -> list[tuple[str, float]]
¶
Predict masked token at specified position.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Text with mask token (e.g., "The cat [MASK] quickly") | required |
| mask_position | int | Token position of mask (0-indexed) | required |
| top_k | int | Number of top predictions to return | 10 |

Returns:

| Type | Description |
|---|---|
| list[tuple[str, float]] | List of (token, log_probability) tuples, sorted by probability |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If model is not loaded |
| ValueError | If mask_position is invalid or text has no mask token |
predict_masked_token_batch(texts: list[str], mask_position: int = 0, top_k: int = 10) -> list[list[tuple[str, float]]]
¶
Predict masked tokens for multiple texts in a single batch.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| texts | list[str] | List of texts with mask tokens | required |
| mask_position | int | Token position of mask (0-indexed, relative to mask tokens found) | 0 |
| top_k | int | Number of top predictions to return per text | 10 |

Returns:

| Type | Description |
|---|---|
| list[list[tuple[str, float]]] | List of predictions for each text. Each element is a list of (token, log_probability) tuples. |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If model is not loaded |
| ValueError | If any text has no mask token |
get_mask_token() -> str
¶
Get the mask token for this model.
Returns:

| Type | Description |
|---|---|
| str | Mask token string (e.g., "[MASK]" for BERT) |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If model is not loaded |
| ValueError | If model has no mask token |
cache
¶
Content-addressable cache for model predictions.
This module implements caching for template filling model predictions using SHA256-based content addressing.
ModelOutputCache
¶
Content-addressable cache for model predictions.
Uses SHA256 hashing to create deterministic cache keys based on:

- Model name
- Input text
- Mask position
- Top-K parameter
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| cache_dir | Path | Directory for cache storage | required |
| enabled | bool | Enable/disable caching | True |
Examples:
>>> cache = ModelOutputCache(cache_dir=Path("/tmp/cache"), enabled=True)
>>> key_args = ("bert-base-uncased", "The cat [MASK]", 2, 10)
>>> predictions = cache.get(*key_args)
>>> if predictions is None:
... predictions = model.predict(...)
... cache.set(*key_args, predictions)
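The content-addressing idea behind the key can be sketched with hashlib (the exact field layout and separator inside bead's implementation may differ; this only shows the determinism property):

```python
import hashlib

def cache_key(model_name: str, input_text: str, mask_position: int, top_k: int) -> str:
    # Join the four key fields with a separator unlikely to appear in text,
    # then hash to a fixed-length hex digest.
    payload = "\x1f".join([model_name, input_text, str(mask_position), str(top_k)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

key = cache_key("bert-base-uncased", "The cat [MASK]", 2, 10)
```

Identical inputs always produce the same 64-character digest, so cached predictions can be looked up by content rather than by an opaque counter.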
get(model_name: str, input_text: str, mask_position: int, top_k: int) -> list[tuple[str, float]] | None
¶
Get cached predictions.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | Model identifier | required |
| input_text | str | Input text | required |
| mask_position | int | Mask position | required |
| top_k | int | Number of predictions | required |

Returns:

| Type | Description |
|---|---|
| list[tuple[str, float]] \| None | Cached predictions or None if not found |
set(model_name: str, input_text: str, mask_position: int, top_k: int, predictions: list[tuple[str, float]]) -> None
¶
Store predictions in cache.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | Model identifier | required |
| input_text | str | Input text | required |
| mask_position | int | Mask position | required |
| top_k | int | Number of predictions | required |
| predictions | list[tuple[str, float]] | Predictions to cache | required |
clear() -> None
¶
Clear all cached predictions.