Deduction Systems¶
The user-facing surface for working with weighted chart
deductions: the model type, its abstract primitives, and three
orthogonal operations on it. All public symbols re-exported from
quivers.stochastic.deduction.
| Submodule | Job |
|---|---|
primitives |
Abstract building blocks (Axiom, Deduction, Goal, Schedule, DeductiveSystem). Most users do not touch these directly. |
fit |
Point-estimate gradient fitting (MAP / MLE) of the deduction's learnable log-weights. |
bayes |
Lift the parameters into a Bayesian MonadicProgram whose posterior NUTS / SVI can target. |
sample |
Exact length-conditional forward sampling of yields from the chart's distribution. |
deduction
¶
The deduction-system surface: model type, abstract primitives, and the three independent operations on a deduction.
This package gathers everything a user needs for working with a weighted deduction system in one importable path. The contents fall into three orthogonal groups:
- Model type (
DeductionSystem) — the concrete agenda-based chart deduction that compiles from adeductionblock in the QVR DSL. Re-exported fromquivers.stochastic.agenda. - Abstract primitives (
.primitives) —Axiom,Deduction,Goal,Schedule,DeductiveSystem: the protocol layer the agenda implementation derives from. Most users will not need these directly; they exist for custom-deduction subclasses and for the inside-algorithm framework (quivers.stochastic.inside). - Operations — three independent surfaces over a
DeductionSystemwith no overlap in purpose:.fit— point-estimate gradient fitting (MAP / MLE);.bayes— Bayesian wrapping for NUTS / SVI;.sample— exact length-conditional forward sampling of yields.
These three live in separate submodules because they answer different questions (estimate parameters at a point vs. sample a posterior vs. generate synthetic data) and so that adding more operations in the future does not bloat any one module.
DeductionSystem
dataclass
¶
DeductionSystem(rules: tuple[InferenceRule, ...], semiring: ChartSemiring, axiom_injector: Callable[[Any], list[tuple[Item, Tensor]]], goal: Callable[[Item], bool], agenda_factory: Callable[[], Agenda] = FIFOAgenda, chart_factory: Callable[[], Chart] = HashChart, max_iterations: int = 100000, tolerance: float = 0.0)
A weighted deductive system parameterized over its components.
The system is parameterized by:
- An item algebra (implicit in the patterns the rules use).
- A list of arity-n
InferenceRulehyperedges. - A
ChartSemiringfor weight aggregation. - An axiom injector
In -> [(Item, Weight)]producing the input's lexical / boundary items. - A goal predicate
Item -> boolselecting the result items. - (Optional) a chart constructor and an agenda strategy.
The same data structure subsumes CKY (FIFO agenda, span items, Boolean / inside semiring), Viterbi (priority agenda with the current weight as priority, max-times semiring), A* parsing (priority agenda with an admissible heuristic, tropical semiring), MLTT type-checking (LIFO agenda, judgment items, Boolean semiring), and weighted Datalog (FIFO, atoms, any naturally-ordered semiring).
run
¶
run(input_value: Any) -> AgendaResult
Run the deduction system on an input value.
Source code in src/quivers/stochastic/agenda.py
965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 | |
__call__
¶
__call__(input_value: Any) -> ChartView
Run the deduction and return a ChartView.
Convenience for the user-facing presheaf-evaluation API:
the chart's weights are differentiable tensors, and the
view exposes weight, enumerate, derivations,
and goal_weight methods for downstream programs.
Source code in src/quivers/stochastic/agenda.py
1040 1041 1042 1043 1044 1045 1046 1047 1048 | |
parameters
¶
parameters(recurse: bool = True) -> Iterable[Parameter]
Yield every learnable parameter owned by this system.
Walks the optional _axiom_module (lexicon log-weights)
and _rule_module (per-rule, per-binding log-weights)
submodules attached by the compiler. The recurse flag
is the standard torch.nn.Module.parameters signature
so user code can pass a DeductionSystem anywhere a
nn.Module parameter iterator is expected.
Source code in src/quivers/stochastic/agenda.py
1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 | |
named_parameters
¶
named_parameters(prefix: str = '', recurse: bool = True) -> Iterable[tuple[str, Parameter]]
Yield (name, parameter) pairs over all learnable parameters.
Source code in src/quivers/stochastic/agenda.py
1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 | |
Axiom
¶
Bases: Module
Initial items in a weighted deductive system.
An axiom creates the chart tensor and populates it with initial weights derived from the input. For span-based chart parsing, this is the lexical step.
forward
abstractmethod
¶
forward(input: Tensor, semiring: ChartSemiring) -> Tensor
Create and populate the initial chart.
| PARAMETER | DESCRIPTION |
|---|---|
input
|
Raw input (e.g. token indices).
TYPE:
|
semiring
|
The scoring semiring.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Tensor
|
The chart tensor with initial items filled in. |
Source code in src/quivers/stochastic/deduction/primitives.py
46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 | |
Deduction
¶
Bases: Module
A weighted inference step in a deductive system.
A deduction reads from the chart, applies inference rules, and writes updated weights. This is a morphism in the V-enriched category of chart states.
forward
abstractmethod
¶
forward(chart: Tensor, semiring: ChartSemiring, **context) -> Tensor
Apply this deduction step.
| PARAMETER | DESCRIPTION |
|---|---|
chart
|
Current chart state.
TYPE:
|
semiring
|
The scoring semiring.
TYPE:
|
**context
|
Schedule-provided context (e.g.
DEFAULT:
|
| RETURNS | DESCRIPTION |
|---|---|
Tensor
|
Updated chart. |
Source code in src/quivers/stochastic/deduction/primitives.py
77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 | |
DeductiveSystem
¶
DeductiveSystem(axiom: Axiom, deductions: list[Deduction], goal: Goal, schedule: Schedule, semiring: ChartSemiring | None = None)
Bases: Module
A weighted deductive system evaluated to fixpoint.
This is the abstract core of all parsing algorithms::
input -> axiom -> schedule(deductions) -> goal -> output
The system is an nn.Module whose learnable parameters come from the axiom (e.g. lexical weights) and deductions (e.g. rule weights).
| PARAMETER | DESCRIPTION |
|---|---|
axiom
|
Initial item population.
TYPE:
|
deductions
|
Inference rules.
TYPE:
|
goal
|
Result extraction.
TYPE:
|
schedule
|
Evaluation strategy.
TYPE:
|
semiring
|
Scoring algebra (default: LOG_PROB).
TYPE:
|
Source code in src/quivers/stochastic/deduction/primitives.py
188 189 190 191 192 193 194 195 196 197 198 199 200 201 | |
forward
¶
forward(input: Tensor) -> Tensor
Run the deductive system on input.
| PARAMETER | DESCRIPTION |
|---|---|
input
|
Raw input (e.g. token indices).
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Tensor
|
Goal item weights. |
Source code in src/quivers/stochastic/deduction/primitives.py
208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 | |
Goal
¶
Bases: Module
Extract the result from a completed chart.
A goal identifies which chart items constitute the answer and extracts their weights.
forward
abstractmethod
¶
forward(chart: Tensor) -> Tensor
Extract goal items from the chart.
| PARAMETER | DESCRIPTION |
|---|---|
chart
|
Completed chart.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Tensor
|
Goal item weights. |
Source code in src/quivers/stochastic/deduction/primitives.py
111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 | |
Schedule
¶
Bases: ABC
Evaluation strategy for a deductive system.
Different schedules compute the same fixpoint in different orders. CKY processes spans bottom-up; an agenda schedule uses a priority queue. The schedule is independent of the deduction rules.
run
abstractmethod
¶
run(chart: Tensor, deductions: ModuleList, semiring: ChartSemiring) -> Tensor
Execute deductions on chart to fixpoint.
| PARAMETER | DESCRIPTION |
|---|---|
chart
|
Chart with axioms filled in.
TYPE:
|
deductions
|
The deduction steps.
TYPE:
|
semiring
|
The scoring semiring.
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
Tensor
|
Completed chart. |
Source code in src/quivers/stochastic/deduction/primitives.py
137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 | |
nuts_program_from_deduction
¶
nuts_program_from_deduction(ded: DeductionSystem, corpus: Sequence[Sequence[str]], *, prior_scale: float = 1.0, site_prefix: str = 'log_w') -> tuple[MonadicProgram, Tensor, dict[str, Tensor]]
Lift a deduction system's learnable parameters to a
MonadicProgram suitable for NUTS / SVI.
The returned program has one
torch.distributions.Normal sample site per learnable
parameter (lexicon entries and rule bindings alike) plus one
score step that substitutes the sampled values into the
deduction's parameter slots and adds
:math:\sum_n \log Z(s_n; \mathbf{w}) to the joint.
| PARAMETER | DESCRIPTION |
|---|---|
ded
|
Deduction whose parameters are lifted.
TYPE:
|
corpus
|
Corpus the score step closes over.
TYPE:
|
prior_scale
|
Standard deviation of the Normal prior on each parameter.
TYPE:
|
site_prefix
|
Stem of each sample-site's name (the parameter's path is appended for round-trip mapping).
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
(model, x, observations)
|
The lifted program plus a |
Source code in src/quivers/stochastic/deduction/bayes.py
55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 | |
adam_fit_deduction
¶
adam_fit_deduction(ded: DeductionSystem, corpus: Sequence[Sequence[str]], *, steps: int = 300, lr: float = 0.05, prior_scale: float | None = None) -> list[float]
Maximise the corpus log-marginal under an optional Normal prior on the parameters.
| PARAMETER | DESCRIPTION |
|---|---|
ded
|
Deduction whose
TYPE:
|
corpus
|
Each sentence is a sequence of token strings the deduction's axiom injector accepts.
TYPE:
|
steps
|
Adam steps.
TYPE:
|
lr
|
Adam learning rate.
TYPE:
|
prior_scale
|
If supplied, adds a Gaussian regulariser
:math:
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
list[float]
|
The loss trajectory; length == |
Source code in src/quivers/stochastic/deduction/fit.py
28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 | |
sample_corpus
¶
sample_corpus(ded: DeductionSystem, *, length: int, n_samples: int, seed: int | None = None) -> list[list[str]]
Sample n_samples yields of length length from the
chart's length-conditional distribution under the deduction's
current parameters.
| PARAMETER | DESCRIPTION |
|---|---|
ded
|
The deduction with materialised parameters.
TYPE:
|
length
|
Length of yields to enumerate.
TYPE:
|
n_samples
|
Number of sentences to draw.
TYPE:
|
seed
|
Seed for the multinomial draws.
TYPE:
|
Source code in src/quivers/stochastic/deduction/sample.py
33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 | |
bayesian_lift_parameters
¶
bayesian_lift_parameters(inner_model: Module, x: Tensor, observations: dict[str, Tensor], *, prior_scale: float = 1.0, site_prefix: str = 'theta', additional_latents: dict[str, tuple[int, ...]] | None = None, latent_placeholder_scale: float = 10.0) -> tuple[MonadicProgram, Tensor, dict[str, Tensor]]
Lift every learnable parameter of inner_model into a
Normal-prior sample site, and optionally lift named latent
sites of the inner program into NUTS-sampleable variables.
Mathematics
Let :math:\theta denote the inner model's learnable
parameters and :math:\mathbf{z} an optional collection of
intermediate latents named in additional_latents. The
target joint posterior is
.. math:: p(\theta, \mathbf{z} \mid x, y) \;\propto\; p(\theta) \, p_{\mathrm{inner}}(\mathbf{z}, y \mid x, \theta).
The lifted program declares
- one Normal sample site
:math:
\theta_i \sim \mathcal{N}(0, \sigma_\theta^2)per parameter (prior_scale); - one Normal sample site per latent in
additional_latentswith a placeholder scale :math:\sigma_z(latent_placeholder_scale); and - one score step that, after substituting :math:
\thetainto the inner's parameter slots, computes
.. math:: \log p_{\mathrm{inner}}(\mathbf{z}, y \mid x, \theta) \;-\; \sum_{z \in \text{latents}} \log \mathcal{N}(z; 0, \sigma_z^2).
The placeholder priors on :math:\mathbf{z} cancel exactly,
so the lifted log-density equals
:math:\log p(\theta) + \log p_{\mathrm{inner}}(\mathbf{z}, y \mid x, \theta)
pointwise. NUTS therefore samples
:math:(\theta, \mathbf{z}) from the exact joint posterior,
and the log-density is deterministic given the full state
(no MC noise across leapfrog steps).
Methodological notes
Why a Normal prior on parameters?
- Maximum entropy. Among all distributions on
:math:
\mathbb{R}^nwith finite variance, :math:\mathcal{N}(0, \sigma_\theta^2)is the maximum-entropy choice. Among priors that admit any second moment at all, it is the least informative. - Equivalence to weight decay. A
:math:
\mathcal{N}(0, \sigma_\theta^2)prior is the MAP equivalent of L2 regularization with coefficient :math:1/(2\sigma_\theta^2). Standard frequentist weight decay and gradient-descent training inherit a direct Bayesian reading under this prior. - Computational fit with NUTS. The unconstrained support
:math:
\mathbb{R}^nmatches NUTS's native state space, so no bijector is needed between latent and prior support. The log-density is smooth everywhere, so leapfrog dynamics are well-behaved.
Assumptions the user must respect.
- Parameters must be unconstrained reals. If a learnable
represents a variance, rate, probability, or simplex
component, a Normal prior is mathematically invalid (it
places mass on the forbidden region). Models must use the
unconstrained parameterization (log-scale, logit-p,
log-rate, soft-max logits, etc.). In QVR's standard model
definitions, distribution families are parameterized in
this way and
torch.nn.Parameter\ s are unconstrained reals. - The default
prior_scale=1.0assumes O(1) parameter magnitude. This is consistent with typical neural-network initialization schemes (Xavier, Kaiming). Overrideprior_scalefor models with very different expected parameter magnitudes. - A Normal prior is generic, not informed. The lift is a
one-size-fits-all wrapper. Users with substantive domain
knowledge about :math:
\thetashould write aprogramblock with explicitsamplepriors instead of relying on the lift.
Why a placeholder Normal on lifted latents?
The placeholder prior on each
:math:\mathbf{z}_i \sim \mathcal{N}(0, \sigma_z^2) is
algebraically irrelevant by construction: the lifted
sample-site prior and the placeholder cancellation in the
score step sum to zero pointwise. The target distribution NUTS
samples is the true joint posterior regardless of
:math:\sigma_z. A placeholder exists at all because NUTS's
LatentRegistry enumerates dimensions from declared
sample sites; each site needs a base measure to define the
unconstrained support and to seed mass-matrix and step-size
adaptation during warmup. Normal is the standard choice for an
unconstrained latent.
:math:\sigma_z affects mixing efficiency, not
correctness. A placeholder scale mismatched to the true
posterior scale of :math:\mathbf{z} lengthens warmup (the
mass matrix has to adapt further) without biasing the chain.
The default latent_placeholder_scale=10.0 is large enough
that initial NUTS proposals span a meaningful neighbourhood of
zero, small enough that they do not immediately diverge.
| PARAMETER | DESCRIPTION |
|---|---|
inner_model
|
Module exposing
TYPE:
|
x
|
Passed straight through to the inner's
TYPE:
|
observations
|
Passed straight through to the inner's
TYPE:
|
prior_scale
|
:math:
TYPE:
|
site_prefix
|
Stem of each parameter sample-site's name.
TYPE:
|
additional_latents
|
Mapping from intermediate-latent site name (a key the
inner's
TYPE:
|
latent_placeholder_scale
|
:math:
TYPE:
|
| RETURNS | DESCRIPTION |
|---|---|
(model, x_, observations_)
|
The lifted |
Source code in src/quivers/inference/lifts.py
155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 | |