Chart Semirings¶
Semiring abstractions for parameterizing chart parsing (Goodman, 1999). Different semirings yield different parsing algorithms from the same CKY skeleton: marginal log-probability, Viterbi best-parse, boolean recognition, or derivation counting.
semiring
¶
Semiring abstractions for parameterizing chart parsing.
A semiring (S, ⊕, ⊗, 0, 1) provides the algebraic structure for chart parsing (Goodman, 1999). Different semirings yield different parsing algorithms from the same CKY skeleton:
- LogProbSemiring — marginal log-probability (logsumexp / +)
- ViterbiSemiring — best-derivation log-probability (max / +)
- BooleanSemiring — recognition (or / and)
- CountingSemiring — derivation counting (+ / ×)
Each semiring defines:
times(a, b)— multiplicative combination (⊗)plus(scores, dim)— additive aggregation (⊕) over a dimensionzero— additive identity (0): score for impossible derivationsone— multiplicative identity (1): score for trivial derivations
Categorical perspective
Semirings are the decategorification of rig categories (bimonoidal
categories). The times operation corresponds to the tensor
product, plus to the coproduct. Chart parsing is a functor
from the free monoidal category (grammar) to the semiring.
The connection to quantales: a quantale is a complete lattice with an associative binary operation distributing over arbitrary joins. The log-probability semiring is the quantale ([-∞, 0], logsumexp, +).
ChartSemiring
¶
Bases: ABC
Abstract semiring for chart parsing.
Subclasses implement the four semiring operations. All operations must be differentiable (where applicable) to support gradient-based learning.
zero
abstractmethod
property
¶
zero: float
Additive identity: the score of an impossible derivation.
one
abstractmethod
property
¶
one: float
Multiplicative identity: the score of a trivial derivation.
times
abstractmethod
¶
times(a: Tensor, b: Tensor) -> Tensor
Multiplicative combination (⊗).
Combines scores from two chart cells being joined by a rule.
| PARAMETER | DESCRIPTION |
|---|---|
a
|
Left operand scores.
TYPE:
|
b
|
Right operand scores.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Tensor
|
Combined scores. |
Source code in src/quivers/stochastic/semiring.py
46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 | |
plus
abstractmethod
¶
plus(scores: Tensor, dim: int) -> Tensor
Additive aggregation (⊕) over a dimension.
Marginalizes over multiple derivations of the same span.
| PARAMETER | DESCRIPTION |
|---|---|
scores
|
Scores to aggregate.
TYPE:
|
dim
|
Dimension to aggregate over.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Tensor
|
Aggregated scores. |
Source code in src/quivers/stochastic/semiring.py
65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 | |
plus_pair
abstractmethod
¶
plus_pair(a: Tensor, b: Tensor) -> Tensor
Binary additive combination (⊕) of two tensors.
Used for combining a cell with unary rule contributions.
| PARAMETER | DESCRIPTION |
|---|---|
a
|
First operand.
TYPE:
|
b
|
Second operand.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Tensor
|
Combined scores. |
Source code in src/quivers/stochastic/semiring.py
84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 | |
LogProbSemiring
¶
Bases: ChartSemiring
Log-probability semiring: (logsumexp, +, -∞, 0).
This is the default semiring for marginal parsing. times is
addition in log-space (= multiplication of probabilities) and
plus is logsumexp (= addition of probabilities in log-space).
ViterbiSemiring
¶
Bases: ChartSemiring
Viterbi semiring: (max, +, -∞, 0).
Finds the single best derivation. plus takes the max over
derivations rather than summing. Useful for Viterbi parsing
and best-first decoding.
BooleanSemiring
¶
Bases: ChartSemiring
Boolean semiring: (or, and, 0, 1).
For recognition only: determines whether a parse exists without computing scores. Values are 0.0 (false) and 1.0 (true).
CountingSemiring
¶
Bases: ChartSemiring
Counting semiring: (+, ×, 0, 1).
Counts the number of derivations. plus sums counts and
times multiplies them (product rule for independent choices).
Note: Not differentiable in a useful sense for gradient-based learning, but useful for analysis.