Chart Parsers¶

Differentiable CKY chart parsers implemented as weighted deductive systems. ChartParser extends DeductiveSystem, composing LexicalAxiom, BinarySpanDeduction, UnarySpanDeduction, SpanGoal, and CKYSchedule into a single nn.Module. Supports learnable rule weights and semiring-parameterized scoring (log-probability, Viterbi, boolean, counting).

Use ChartParser.from_schema(schema, category_system, ...) to construct a parser from composable RuleSchema objects, or ChartParser.from_category_system(...) to construct from an explicit RuleSystem.

Includes concrete CCGParser and LambekParser convenience subclasses.

parsers ¶

Chart parsers as weighted deductive systems.

A ChartParser is a DeductiveSystem that composes:

LexicalAxiom — populates length-1 spans from a learnable lexicon.
BinarySpanDeduction — applies binary structural rules.
UnarySpanDeduction — applies unary rules to convergence.
SpanGoal — extracts the start-symbol score for the full span.
CKYSchedule — bottom-up evaluation by span length.

Concrete grammar formalisms are specific choices of RuleSchema:

CCG = EVALUATION | HARMONIC_COMPOSITION | CROSSED_COMPOSITION
LAMBEK = EVALUATION | ADJUNCTION_UNITS | TENSOR_INTRODUCTION | TENSOR_PROJECTION
NL = EVALUATION

The ChartParser can also be constructed directly from a RuleSystem or from a RuleSchema + CategorySystem.

Categorical perspective

A chart parser implements a morphism

parse : FreeMonoid(T) -> 1

in the category of measurable spaces, mapping variable-length terminal strings to their log-probability under the grammar. The RuleSchema specifies the structural rules (the generating arrows of the free category), while the lexicon provides the interpretation functor mapping terminals to weighted category distributions.

The parsing algorithm is parameterized by a ChartSemiring (Goodman, 1999), making the parser agnostic to the scoring algebra.

ChartParser ¶

ChartParser(rule_system: RuleSystem, n_terminals: int, start_idx: int, category_system: CategorySystem | None = None, semiring: ChartSemiring | None = None, learnable_rule_weights: bool = True)

Bases: DeductiveSystem

Differentiable CKY chart parser as a weighted deductive system.

Composes LexicalAxiom, BinarySpanDeduction, UnarySpanDeduction, SpanGoal, and CKYSchedule into a single DeductiveSystem (nn.Module).

PARAMETER	DESCRIPTION
`rule_system`	The structural rules governing the grammar. TYPE: `RuleSystem`
`n_terminals`	Vocabulary size. TYPE: `int`
`start_idx`	Index of the start category in the rule system. TYPE: `int`
`category_system`	Optional reference to the category system (for introspection). TYPE: `CategorySystem or None` DEFAULT: `None`
`semiring`	Scoring algebra. Defaults to log-probability semiring. TYPE: `ChartSemiring or None` DEFAULT: `None`
`learnable_rule_weights`	Whether rule weights are learnable parameters. TYPE: `bool` DEFAULT: `True`

Source code in src/quivers/stochastic/parsers.py

def __init__(
    self,
    rule_system: RuleSystem,
    n_terminals: int,
    start_idx: int,
    category_system: CategorySystem | None = None,
    semiring: ChartSemiring | None = None,
    learnable_rule_weights: bool = True,
) -> None:
    axiom = LexicalAxiom(n_terminals, rule_system.n_categories)
    binary_ded = BinarySpanDeduction(
        rule_system,
        learnable=learnable_rule_weights,
    )
    unary_ded = UnarySpanDeduction(
        rule_system,
        learnable=learnable_rule_weights,
    )
    goal = SpanGoal(start_idx)
    schedule = CKYSchedule()

    super().__init__(
        axiom=axiom,
        deductions=[binary_ded, unary_ded],
        goal=goal,
        schedule=schedule,
        semiring=semiring,
    )

    self._rule_system = rule_system
    self._system = category_system
    self._n_term = n_terminals
    self._n_cat = rule_system.n_categories
    self._start_idx = start_idx

rule_system `property` ¶

rule_system: RuleSystem

The structural rule system.

category_system `property` ¶

category_system: CategorySystem | None

The category inventory (if available).

n_rules `property` ¶

n_rules: int

Number of binary combination rules.

n_unary_rules `property` ¶

n_unary_rules: int

Number of unary rules.

log_lexicon `property` ¶

log_lexicon: Tensor

Log-probabilities of lexical category assignments.

RETURNS	DESCRIPTION
`Tensor`	Shape `(n_terminals, n_categories)`.

from_schema `classmethod` ¶

from_schema(schema: RuleSchema, category_system: CategorySystem, n_terminals: int, start: str | Category = 'S', semiring: ChartSemiring | None = None, learnable_rule_weights: bool = True) -> ChartParser

Create a chart parser from a rule schema and category system.

PARAMETER	DESCRIPTION
`schema`	The composable rule schema (e.g. `CCG`, `LAMBEK`). TYPE: `RuleSchema`
`category_system`	The category inventory. TYPE: `CategorySystem`
`n_terminals`	Vocabulary size. TYPE: `int`
`start`	The start category. TYPE: `str or Category` DEFAULT: `'S'`
`semiring`	Scoring algebra. TYPE: `ChartSemiring or None` DEFAULT: `None`
`learnable_rule_weights`	Whether rule weights are learnable. TYPE: `bool` DEFAULT: `True`

RETURNS	DESCRIPTION
`ChartParser`	A new parser instance.

Source code in src/quivers/stochastic/parsers.py

@classmethod
def from_schema(
    cls,
    schema: RuleSchema,
    category_system: CategorySystem,
    n_terminals: int,
    start: str | Category = "S",
    semiring: ChartSemiring | None = None,
    learnable_rule_weights: bool = True,
) -> ChartParser:
    """Create a chart parser from a rule schema and category system.

    Parameters
    ----------
    schema : RuleSchema
        The composable rule schema (e.g. ``CCG``, ``LAMBEK``).
    category_system : CategorySystem
        The category inventory.
    n_terminals : int
        Vocabulary size.
    start : str or Category
        The start category.
    semiring : ChartSemiring or None
        Scoring algebra.
    learnable_rule_weights : bool
        Whether rule weights are learnable.

    Returns
    -------
    ChartParser
        A new parser instance.
    """
    rule_system = schema(category_system)

    return cls.from_category_system(
        category_system,
        rule_system,
        n_terminals,
        start,
        semiring,
        learnable_rule_weights,
    )

from_category_system `classmethod` ¶

from_category_system(category_system: CategorySystem, rule_system: RuleSystem, n_terminals: int, start: str | Category = 'S', semiring: ChartSemiring | None = None, learnable_rule_weights: bool = True) -> ChartParser

Create a chart parser from a category system and rules.

PARAMETER	DESCRIPTION
`category_system`	The category inventory. TYPE: `CategorySystem`
`rule_system`	The structural rules. TYPE: `RuleSystem`
`n_terminals`	Vocabulary size. TYPE: `int`
`start`	The start category. TYPE: `str or Category` DEFAULT: `'S'`
`semiring`	Scoring algebra. TYPE: `ChartSemiring or None` DEFAULT: `None`
`learnable_rule_weights`	Whether rule weights are learnable. TYPE: `bool` DEFAULT: `True`

RETURNS	DESCRIPTION
`ChartParser`	A new parser instance.

Source code in src/quivers/stochastic/parsers.py

@classmethod
def from_category_system(
    cls,
    category_system: CategorySystem,
    rule_system: RuleSystem,
    n_terminals: int,
    start: str | Category = "S",
    semiring: ChartSemiring | None = None,
    learnable_rule_weights: bool = True,
) -> ChartParser:
    """Create a chart parser from a category system and rules.

    Parameters
    ----------
    category_system : CategorySystem
        The category inventory.
    rule_system : RuleSystem
        The structural rules.
    n_terminals : int
        Vocabulary size.
    start : str or Category
        The start category.
    semiring : ChartSemiring or None
        Scoring algebra.
    learnable_rule_weights : bool
        Whether rule weights are learnable.

    Returns
    -------
    ChartParser
        A new parser instance.
    """
    if isinstance(start, str):
        start = AtomicCategory(name=start)

    if start not in category_system:
        raise ValueError(f"start category {start!r} not in category system")

    return cls(
        rule_system=rule_system,
        n_terminals=n_terminals,
        start_idx=category_system.index(start),
        category_system=category_system,
        semiring=semiring,
        learnable_rule_weights=learnable_rule_weights,
    )

forward ¶

forward(input: Tensor) -> Tensor

Compute sentence scores via CKY chart parsing.

PARAMETER	DESCRIPTION
`input`	Integer tensor of terminal indices. Shape `(batch, seq_len)` or `(seq_len,)` for a single sentence. TYPE: `Tensor`

RETURNS	DESCRIPTION
`Tensor`	Score of each sentence under the grammar.

Source code in src/quivers/stochastic/parsers.py

def forward(self, input: torch.Tensor) -> torch.Tensor:
    """Compute sentence scores via CKY chart parsing.

    Parameters
    ----------
    input : torch.Tensor
        Integer tensor of terminal indices. Shape
        ``(batch, seq_len)`` or ``(seq_len,)`` for a single
        sentence.

    Returns
    -------
    torch.Tensor
        Score of each sentence under the grammar.
    """
    squeeze = False

    if input.dim() == 1:
        input = input.unsqueeze(0)
        squeeze = True

    if input.shape[1] == 0:
        raise ValueError("cannot parse empty sentences")

    result = super().forward(input)

    if squeeze:
        return result.squeeze(0)

    return result

inside_chart ¶

inside_chart(tokens: Tensor) -> Tensor

Compute the full inside chart (for analysis/debugging).

PARAMETER	DESCRIPTION
`tokens`	Integer tensor of terminal indices. TYPE: `Tensor`

RETURNS	DESCRIPTION
`Tensor`	Full chart of shape `(batch, N, seq_len, seq_len+1)`.

Source code in src/quivers/stochastic/parsers.py

def inside_chart(self, tokens: torch.Tensor) -> torch.Tensor:
    """Compute the full inside chart (for analysis/debugging).

    Parameters
    ----------
    tokens : torch.Tensor
        Integer tensor of terminal indices.

    Returns
    -------
    torch.Tensor
        Full chart of shape ``(batch, N, seq_len, seq_len+1)``.
    """
    squeeze = False

    if tokens.dim() == 1:
        tokens = tokens.unsqueeze(0)
        squeeze = True

    if tokens.shape[1] == 0:
        raise ValueError("cannot parse empty sentences")

    # run axiom + schedule but skip goal extraction
    chart = self.axiom(tokens, self._semiring)
    chart = self._schedule.run(chart, self.deductions, self._semiring)

    if squeeze:
        return chart.squeeze(0)

    return chart

ccg ¶

Weighted Combinatory Categorial Grammar (CCG) parser.

Implements a differentiable CKY-style chart parser for weighted CCG, supporting the full set of standard combinatory rules:

Forward application (>): X/Y Y → X
Backward application (<): Y X\Y → X
Forward composition (>B): X/Y Y/Z → X/Z
Backward composition (<B): Y\Z X\Y → X\Z
Forward crossed composition (>Bx): X/Y Y\Z → X\Z
Backward crossed composition (<Bx): Y/Z X\Y → X/Z
Generalized forward composition (>B^n): X/Y Y|_1 Z_1 ... |_n Z_n → X|_1 Z_1 ... |_n Z_n
Forward type raising (>T): X → T/(T\X)
Backward type raising (<T): X → T(T/X)

All computation is in log-space for numerical stability. The lexicon (word → category distribution) is a learnable stochastic morphism, while the combinatory rules are structural (fixed by the grammar).

Categorical perspective

CCG is the internal language of a closed monoidal category. The slash categories X/Y and X\Y are left and right internal homs, and the application rules are evaluation morphisms (counits of the hom-tensor adjunction). Composition rules correspond to composition of internal hom morphisms.

Examples:

>>> from quivers.stochastic.categories import CategorySystem
>>> from quivers.stochastic.ccg import CCGParser
>>> cs = CategorySystem.from_atoms(["S", "NP", "N"])
>>> cs.add_slash(cs["S"], cs["NP"], "\\")   # S\NP
>>> cs.add_slash(cs["S\\NP"], cs["NP"], "/")  # (S\NP)/NP
>>> cs.add_slash(cs["NP"], cs["N"], "/")      # NP/N
>>> parser = CCGParser(cs, n_terminals=100, start="S")
>>> tokens = torch.randint(0, 100, (4, 5))
>>> log_probs = parser(tokens)

CCGParser ¶

CCGParser(category_system: CategorySystem, n_terminals: int, start: str | Category = 'S', enable_composition: bool = True, enable_crossed_composition: bool = True, enable_type_raising: bool = False, generalized_composition_depth: int = 1)

Bases: ChartParser

Weighted CKY chart parser for Combinatory Categorial Grammar.

A convenience subclass of ChartParser that builds CCG-specific rules from a CategorySystem and grammar options.

PARAMETER	DESCRIPTION
`category_system`	The finite inventory of grammatical categories. TYPE: `CategorySystem`
`n_terminals`	Vocabulary size. TYPE: `int`
`start`	The start (sentence) category. TYPE: `str or Category` DEFAULT: `'S'`
`enable_composition`	Enable harmonic composition (>B, <B). Default True. TYPE: `bool` DEFAULT: `True`
`enable_crossed_composition`	Enable crossed composition (>Bx, <Bx). Default True. TYPE: `bool` DEFAULT: `True`
`enable_type_raising`	Enable type raising (>T, <T). Default False. TYPE: `bool` DEFAULT: `False`
`generalized_composition_depth`	Maximum depth for generalized composition. Default 1. TYPE: `int` DEFAULT: `1`

Source code in src/quivers/stochastic/ccg.py

def __init__(
    self,
    category_system: CategorySystem,
    n_terminals: int,
    start: str | Category = "S",
    enable_composition: bool = True,
    enable_crossed_composition: bool = True,
    enable_type_raising: bool = False,
    generalized_composition_depth: int = 1,
) -> None:
    # resolve start category
    if isinstance(start, str):
        start_cat = AtomicCategory(name=start)

    else:
        start_cat = start

    if start_cat not in category_system:
        raise ValueError(f"start category {start_cat!r} not in category system")

    # build CCG rule system
    rules = ccg_rules(
        category_system,
        enable_composition=enable_composition,
        enable_crossed_composition=enable_crossed_composition,
        enable_type_raising=enable_type_raising,
        generalized_composition_depth=generalized_composition_depth,
    )

    super().__init__(
        rule_system=rules,
        n_terminals=n_terminals,
        start_idx=category_system.index(start_cat),
        category_system=category_system,
    )

lambek ¶

Weighted type-logical grammar (Lambek calculus) parser.

Implements a differentiable chart parser for the non-commutative Lambek calculus (L), supporting:

Right application (modus ponens right): A/B, B ⊢ A
Left application (modus ponens left): B, A\B ⊢ A (equivalently written B, B\A ⊢ A with the convention A\B = A seeks B on the left)
Right lifting (/L): A ⊢ B/(A\B)
Left lifting (\L): A ⊢ (B/A)\B
Product introduction (⊗I): A, B ⊢ A⊗B
Product elimination (⊗E): A⊗B ⊢ A, B (projected)

The Lambek calculus is the internal logic of a residuated monoidal (bi-closed) category:

A ⊗ B ⊢ C   iff   A ⊢ C/B   iff   B ⊢ A\\C

Unlike CCG, the Lambek calculus does NOT permit: - Composition as a primitive rule (it's derivable) - Crossed composition - Permuting rules

This makes the Lambek calculus resource-sensitive and order-preserving, modeling word order constraints naturally.

Extended calculi

This module also supports:

NL (Non-associative Lambek): If associative=False, the parser uses tree-structured proof search, disallowing reassociation of the product.
LP (Lambek with permutation): If commutative=True, the product is commutative, relaxing word order constraints.
Displacement calculus: Not yet supported; would require additional modalities.

Categorical perspective

The Lambek calculus is the internal logic of a residuated monoidal category (a biclosed monoidal category). The types are objects; the proofs are morphisms. A valid derivation A_1 ⊗ ... ⊗ A_n ⊢ B corresponds to a morphism in the category.

The parsing algorithm searches for such morphisms (proofs), computing the total weight (probability) of all derivations.

Examples:

>>> from quivers.stochastic.categories import CategorySystem
>>> from quivers.stochastic.lambek import LambekParser
>>> cs = CategorySystem.from_atoms(["S", "NP", "N"])
>>> cs.add_slash(cs["S"], cs["NP"], "\\")   # S\NP
>>> cs.add_slash(cs["S\\NP"], cs["NP"], "/")  # (S\NP)/NP
>>> cs.add_slash(cs["NP"], cs["N"], "/")      # NP/N
>>> parser = LambekParser(cs, n_terminals=100, start="S")
>>> tokens = torch.randint(0, 100, (4, 5))
>>> log_probs = parser(tokens)

LambekParser ¶

LambekParser(category_system: CategorySystem, n_terminals: int, start: str | Category = 'S', associative: bool = True, commutative: bool = False, enable_lifting: bool = True, enable_product: bool = True)

Bases: ChartParser

Weighted chart parser for type-logical grammar (Lambek calculus).

A convenience subclass of ChartParser that builds Lambek calculus rules from a CategorySystem and grammar options.

PARAMETER	DESCRIPTION
`category_system`	The finite category inventory. TYPE: `CategorySystem`
`n_terminals`	Vocabulary size. TYPE: `int`
`start`	The start category (default "S"). TYPE: `str or Category` DEFAULT: `'S'`
`associative`	Use associative Lambek calculus (default True). TYPE: `bool` DEFAULT: `True`
`commutative`	Use commutative product / LP calculus (default False). TYPE: `bool` DEFAULT: `False`
`enable_lifting`	Enable Lambek lifting rules (default True). TYPE: `bool` DEFAULT: `True`
`enable_product`	Enable product introduction (default True). TYPE: `bool` DEFAULT: `True`

Source code in src/quivers/stochastic/lambek.py

def __init__(
    self,
    category_system: CategorySystem,
    n_terminals: int,
    start: str | Category = "S",
    associative: bool = True,
    commutative: bool = False,
    enable_lifting: bool = True,
    enable_product: bool = True,
) -> None:
    # resolve start category
    if isinstance(start, str):
        start_cat = AtomicCategory(name=start)

    else:
        start_cat = start

    if start_cat not in category_system:
        raise ValueError(f"start category {start_cat!r} not in category system")

    # build Lambek rule system
    rules = lambek_rules(
        category_system,
        associative=associative,
        commutative=commutative,
        enable_lifting=enable_lifting,
        enable_product=enable_product,
    )

    super().__init__(
        rule_system=rules,
        n_terminals=n_terminals,
        start_idx=category_system.index(start_cat),
        category_system=category_system,
    )

    self._associative = associative
    self._commutative = commutative

Keys	Action
`?`	Open this help
`n`	Next page
`p`	Previous page
`s`	Search

Chart Parsers¶

parsers ¶

ChartParser ¶

rule_system property ¶

category_system property ¶

n_rules property ¶

n_unary_rules property ¶

log_lexicon property ¶

from_schema classmethod ¶

from_category_system classmethod ¶

forward ¶

inside_chart ¶

ccg ¶

CCGParser ¶

lambek ¶

LambekParser ¶

rule_system `property` ¶

category_system `property` ¶

n_rules `property` ¶

n_unary_rules `property` ¶

log_lexicon `property` ¶

from_schema `classmethod` ¶

from_category_system `classmethod` ¶