DSL Programs and Let-Expressions¶
This page covers the program block surface: program declarations
and their effect signatures, the bind / observe / marginalize / let
steps that make up a program body, the axis-role clause that
configures structured priors and likelihoods, the let-expression
language with its full primitive surface, and factor expressions.
The grammar summary lives in the DSL overview; declarations of objects, morphisms, kernels, and combinators are detailed in DSL Declarations.
Program declarations¶
A program block defines a probabilistic program. The body is a
sequence of steps (bind, observe, let, marginalize) followed by
return. Each step is a Kleisli
arrow on the
accumulated random-variable context \(\Phi\); the program denotes the
composite \(\Gamma \to \mathcal{G}(\tau_2)\) in
Kern.
program my_prog : X -> Y
sample mu <- LogitNormal(0.0, 1.0)
sample x <- Normal(mu, 1.0)
return x
program with_params(a, b) : (X * Z) -> Y
let w = a
sample x <- f(w)
sample y <- g(x, b)
return y
The compiled program is a
MonadicProgram;
see the monadic programs guide for the runtime
contract (rsample, log_joint, the observations dict).
Effect signatures¶
A program declaration may carry an effects = [...] entry in its
option block listing a subset of {Sample, Score, Marginal, Pure}.
The compiler verifies that the body's actual effects are a subset
of the declared set; effects=[Pure] rejects any sample, score, or
marginal binds.
program prior : Unit -> Y [effects=[Sample]]
sample mu <- Normal(0.0, 1.0)
return mu
program deterministic : X -> X [effects=[Pure]]
let y = x
return y
See the compositional effects guide for the algebraic basis of the effect surface.
Kleisli bind syntax¶
The <- operator is the unique sampling-step sigil in a program
body:
x <- Normal(0.0, 1.0)
It introduces x as a random variable distributed according to the
given family. The same sigil carries every sampling-step variant:
scalar draws, indexed plates, scored observes, and scoped
marginalizations, distinguished by the surrounding shape.
Indexed bind (plate)¶
v : A <- Family(args) declares v as an \(A\)-indexed family of
independent \(F\)-distributed draws. Categorically v : A \to
\mathcal{G}(K) where K is the per-fiber codomain taken from the
family; equivalently a single arrow \(\mathbf{1} \to
\mathcal{G}(K^A)\) via the natural isomorphism
\(\mathbf{Kern}(\mathbf{1}, K^A) \cong \mathbf{Kern}(A, K)\).
object Item : FinSet 1000
sample duration_incr : Item <- HalfNormal(1.0)
sample by_subject : Subject <- Normal(0.0, sigma)
Indexed observe¶
observe r : N <- Family(args) accumulates a batched
log-likelihood: a sub-probability kernel \(\Phi \to
\mathcal{G}_{\le 1}(\Phi)\) with score \(\prod_{n \in N} p_F(r_{\mathrm{obs}}(n);
\theta(n, \phi))\). The response buffer r is supplied at runtime
via the observations dict passed to
MonadicProgram.rsample,
MonadicProgram.log_joint,
or ELBO.forward. Family arguments may
use bracket-indexed sections theta[N] to refer to plate variables.
observe cloze_resp : RespCloze <- Bernoulli(intercept_cloze)
Scoped marginalize¶
marginalize c : A <- F(args) followed by an indented body
introduces a coordinate c bound to a kernel F(args),
optionally A-indexed, with the indented block as its
integration scope. At the end of the scope the coordinate is
pushed forward through the projection \(\pi : \Phi \times C \to
\Phi\), integrating it out by
log-sum-exp on the
log-likelihood (discrete) or fibrewise integration (continuous);
c then falls out of scope.
marginalize class : Item <- Categorical(class_logits)
observe r : N <- Bernoulli(theta[class[N]])
The grouped form with [over=G] and per-observe [via=<idx>]
options is the fibred marginalization construct; see
hierarchical programs.
Indexed gather in let¶
A let-expression of the form arr[idx] denotes the Kleisli
pullback of a plate
variable along a finite fibration. For a plate v : A -> B and an
index morphism \(\iota : N \to A\), the gather \(\iota^* v = v \circ
\iota\) is itself a Kern-morphism
\(N \to B\).
by_verb : Verb <- Normal(0.0, sigma)
let intercept_for_item = by_verb[verb_of_item]
Parametric programs¶
A program declaration whose parameter list contains typed
parameters denotes a dependent family of kernels rather than a
single kernel:
Three parameter universes are available:
| Kind | Universe | Quantifies over |
|---|---|---|
FinSet, Space, Object |
object of the relevant subcategory | the carrier of a plate |
Real, Nat |
hom-object of scalar type | a hyperparameter value |
Mor[A, B] |
the hom-set \(\mathbf{Kern}(A, B)\) | a kernel passed in by name |
Parametric programs are not compiled to runtime MonadicPrograms
in isolation; the compiler stores them as templates and inlines
them at each call site:
v <- template(arg1, arg2, ...)
At each call site the template's body is substituted (formal
parameters to actual arguments) and α-renamed (internal latents are
prefixed by v$, the return variable is renamed to v directly).
The renamed step list is inlined into the caller, so distinct call
sites contribute distinct factors to the parent's joint kernel:
fresh latents per use, no inadvertent tying.
# Parametric random-intercepts template: one HalfNormal scale and
# a per-level Normal(0, sigma) plate, polymorphic over the grouping
# object G and the half-normal hyperparameter scale.
program random_intercepts (G : FinSet, scale : Real) : G -> 1
sample sigma <- HalfNormal(scale)
sample v : G <- Normal(0.0, sigma)
return v
Worked hierarchical examples (crossed random intercepts, grouped marginalization over discrete classes) live in the hierarchical programs guide.
Posterior blocks¶
A program name(latents) : domain -> codomain
declaration denotes a deterministic post-conditioning kernel. The
over = model entry in its option block marks the program as
consuming the named model's latents; the consumed latents appear
as data parameters in the parameter list. The
effects=[Pure] signature rejects any sample, score, or
marginal binds; the body is restricted to let (and marginalize
over its own scope). Categorically it is a
Kern-morphism \(\text{Latents}
\to \tau_{\mathrm{out}}\) that lifts to \(\text{Data} \to
\mathcal{G}(\tau_{\mathrm{out}})\) by post-composition with the
model's posterior kernel \(q(\theta \mid \mathrm{data})\).
object Logits4 : Real 4
program scored : Item -> Logits4
sample raw_logits <- Normal(0.0, 1.0)
return raw_logits
program class_probs(raw_logits) : Item -> Logits4 [effects=[Pure], over=scored]
let probs = softmax(raw_logits)
return probs
The data parameter raw_logits names the model latent the body
consumes; the runtime supplies a per-sample snapshot of the model's
trace.
Axis-role clause: over and iid_over¶
Every distribution-bearing form (kernel and latent-prior morphism
declarations, sample / observe / marginalize steps) accepts
two option-block keys that configure how the family's axes
decompose: over and iid_over. Both live inside the bracketed
option block, before any trailing ~ Family(...) initializer:
... [..., over=<axes>, iid_over=<axes>] ~ Family(args)
over=<axes> names the event axes: the axes on which the
family's joint structure lives. The axis count must match the
family's declared event_rank (0 for scalar families like Normal /
Beta / Gamma; 1 for vector families like
MultivariateNormal,
Dirichlet,
ConditionalGaussianProcess,
or
ConditionalHorseshoe;
2 for matrix families like
MatrixNormal,
Wishart,
InverseWishart).
A single axis is written bare (over=A); multiple axes use a
bracketed list (over=[A, B]). The positional ordering of over
axes corresponds positionally to the family's declared event-axis
ordering (for asymmetric families like MatrixNormal, the first
axis is the row axis, the second the column axis). The full
event-rank table lives in
continuous families.
iid_over=<axes> is an optional readability assertion naming the
batch axes (the complement of over). Any axis not in over is
batched by default, which categorically is a product of independent
distributions on that axis.
Axis names. Names resolve against the named factors of the
surrounding morphism's dom and cod (or the type annotation : T
on a sample / observe step). The reserved tokens dom and cod
are shortcuts when that side is a single unfactored object; for a
product-typed side, every factor must be named explicitly.
Categorical reading. The surface preserves the distinction
between joint-on-a-product-space (the family's event with possibly
non-trivial correlation) and product-of-independents (iid batches
across an axis), and between a flat MVN over \(\dim(A) \cdot \dim(B)\)
with dense covariance versus a
MatrixNormal
with Kronecker structure
\(V \otimes U\). There is no auto-substitution between families with
different event ranks. Renaming or refactoring a morphism's type
invalidates axis references at type-check time rather than silently
rebinding.
Compile-time errors. Two diagnostics fire here:
- "axis count does not match family event_rank": the family
declares event_rank
kbutoverlistsj ≠ kaxes. Common when migrating from a flatMultivariateNormal(event_rank 1) to a matrixMatrixNormal(event_rank 2). - "unknown axis name(s)
Ainover": anovername is not a factor of the morphism's dom or cod. Resolve by renaming, or by usingdom/codif the side is a single unfactored object.
# Vector prior: 5-dim MVN over the codomain axis.
sample mu : Real 5 <- MVN(zeros, L) [over=cod]
# Matrix prior on a morphism: Kronecker MatrixNormal.
morphism W : Real 32 -> Real 64 [role=latent, over=[dom, cod]]
~ MatrixNormal(loc, row_scale, col_scale)
# Per-row Dirichlet on a transition kernel: each row is a K-dim
# simplex independently, rows are iid.
morphism T : Real K -> Real K [role=latent, over=cod, iid_over=dom]
~ Dirichlet(alpha)
# MVN response per observation row.
observe y : N <- MVN(mu_hat, scale_tril) [over=cod]
Let expressions (arithmetic and primitives)¶
Inside a program block, let bindings support full arithmetic
with standard operator precedence, unary negation, and a fixed pool
of built-in tensor primitives:
# arithmetic: +, -, *, /
let eta = mu + sigma * z_raw + lambda * shared_factor
let adjusted = (1.0 - lapse) * p_raw + 0.5 * lapse
let mean = (x + y + z) / 3.0
let negated = -raw_score
# built-in primitives
let prob = sigmoid(eta)
let positive = softplus(raw)
let log_rate = log(rate)
let magnitude = abs(x - 0.5)
let monotone = cumsum(increments)
let weights = softmax(logits)
let regularized = dropout(layer_norm(features))
Each let-builtin denotes a deterministic measurable map, lifted
into the Kleisli category as a Dirac
kernel.
Primitive reference¶
Reductions and shape-preserving operations on the last axis default
to dim=-1, the natural choice for per-row operations in V-Cat
morphisms; for contractions over a specific named axis, use the
typed contraction declaration.
Activations (torch.nn.functional surface): relu, relu6,
leaky_relu, prelu, rrelu, elu, selu, celu, gelu,
silu (alias swish), mish, hardtanh, hardshrink,
hardsigmoid, hardswish, softplus, softshrink, softsign,
tanh, tanhshrink, sigmoid, logsigmoid, threshold, glu.
Simplex / normalization maps: softmax, log_softmax,
softmin, normalize.
Pointwise transcendentals: exp, expm1, log, log1p,
log2, log10, sqrt, rsqrt, square, abs, neg, sign,
reciprocal, clamp, sin, cos, tan, asin, acos, atan,
sinh, cosh, asinh, acosh, atanh, floor, ceil, round,
trunc, erf, erfc, erfinv, lgamma, digamma.
Last-axis reductions: sum, mean, var, std, min, max,
argmin, argmax, prod, amax, amin, logsumexp, norm.
Last-axis shape-preserving: cumsum, cumprod, cummax,
cummin, flip, sort.
Training-mode primitives: dropout, alpha_dropout,
layer_norm, rms_norm.
The compiled implementation is _LET_EXPR_BUILTINS in
quivers.dsl.compiler.programs. Function
calls inside a let body resolve against this builtin table first,
then against module-scope callables (programs, morphisms,
encoders, decoders, deductions); arity is checked at compile time.
Factor expressions: assembling indexed tensors¶
The factor expression in a let body builds a
finite-domain-indexed tensor by evaluating a body once per tuple of
index values. Categorically it is the left adjoint of indexing:
while arr[i, j, ...] is the elimination rule for I_1 × ... × I_n
-> body_type (Kleisli pullback of a plate variable along a finite
fibration), factor is the introduction rule.
Uniform form¶
factor v_1 : I_1, v_2 : I_2, ..., v_n : I_n in <body>
denotes the tensor of shape (|I_1|, ..., |I_n|, *body_shape)
whose value at position (i_1, ..., i_n) is <body> evaluated
with v_k := i_k. The binder variables are integer-valued and
visible only inside the body.
object Verb : FinSet 40
object Class : FinSet 4
# Per-verb, per-class scoring table: shape (40, 4).
let cell = factor v : Verb, cls : Class in coef[v, cls] * weight[cls]
Pattern-match form (single-axis)¶
When each cell of a single-axis factor carries a structurally different expression, the case form lets you state the per-index expressions side-by-side:
let class_probs = factor cls : Class in {
0 -> (1.0 - prob_dur) * (1.0 - prob_telic_nodur),
1 -> (1.0 - prob_dur) * prob_telic_nodur,
2 -> prob_dur * (1.0 - prob_telic_dur),
3 -> prob_dur * prob_telic_dur,
}
The case labels must cover {0, ..., |Index|-1} exactly; the
compiler rejects gaps, duplicates, or out-of-range labels at
compile time. Braces and comma separators delimit the cases. This
is the natural surface for structured categorical priors: each
cell of a Class-shape probability vector is built from a
different combination of upstream scalar latents.
Inline distributions¶
Bind and observe steps support inline distribution construction with any mix of literal and variable arguments. The 30+ registered families accept literal-or-variable arguments at any position:
# all-literal (fixed): Unit -> codomain
x <- Normal(0.0, 1.0)
p <- Beta(2.0, 5.0)
# all-variable (direct): variables -> codomain
y <- Normal(mu, sigma)
b <- Bernoulli(theta)
# mixed literal / variable: any combination works
h_cand <- Normal(reset_hidden, 0.5)
z <- Normal(0.0, learned_scale)
r <- TruncatedNormal(mu, sigma, 0.0, 1.0)
# negative literals
z <- Normal(-1.5, 0.3)
A representative subset of the inline-distribution registry (the full set is documented in continuous families):
| Family | Parameters | Codomain |
|---|---|---|
Normal |
loc, scale |
Euclidean |
LogitNormal |
mu, sigma |
UnitInterval |
Uniform |
low, high |
UnitInterval / Euclidean |
Bernoulli |
probs |
FinSet(2) |
Beta |
concentration1, concentration0 |
UnitInterval |
Exponential |
rate |
PositiveReals |
HalfCauchy |
scale |
PositiveReals |
HalfNormal |
scale |
PositiveReals |
LogNormal |
loc, scale |
PositiveReals |
Gamma |
concentration, rate |
PositiveReals |
Dirichlet |
concentration |
Simplex (codomain dim / cardinality) |
TruncatedNormal |
mu, sigma, low, high |
Euclidean (bounded) |
MultivariateNormal |
loc, scale_tril |
Euclidean |
MatrixNormal |
loc, row_scale, col_scale |
matrix Euclidean |
LKJCholesky |
concentration |
Cholesky-factor manifold |
Wishart, InverseWishart |
df, scale_tril |
positive-definite matrices |
Horseshoe |
scale |
Euclidean (sparse-shrinkage prior) |
Every parameter position in every family accepts either a literal value or a previously-bound variable. When all arguments are literals, a fixed distribution is created; when any argument is a variable, the family is resolved at runtime against the current trace's values.
For conditional distributions (input-conditional, learned-parameter
form), use a morphism f : A -> B [role=kernel] ~ Family
declaration instead.
Examples¶
Simple discrete model¶
object X : FinSet 3
object Y : FinSet 4
morphism f : X -> Y [role=latent]
morphism g : Y -> Y [role=latent]
let fg = f >> g
export fg
Continuous conditional model¶
object Cond : FinSet 2
object Latent : Real 3
object Obs : Real 5
morphism prior : Cond -> Latent [role=kernel] ~ Normal
morphism likelihood : Latent -> Obs [role=kernel] ~ Normal [scale=0.1]
let posterior = prior >> likelihood
export posterior
Probabilistic program with observations¶
object Data : FinSet 1
object Y : Real 2
program regression : Data -> Y
sample theta <- LogitNormal(0.0, 1.0)
sample y <- Normal(theta, 0.5)
observe _ <- Normal(y, 0.1)
return y
Init recipe end-to-end¶
object D : FinSet 1
object In : Real 8
object Out : Real 4
# Default: weights initialised from a small-scale randn.
morphism W_default : In -> Out [role=latent, scale=0.1]
# Algebra-guided: init pulled from Algebra.init_spec, applied
# through the active algebra's bijector. Under product_fuzzy
# (the default), the raw parameter is set to logit(ln(2) / depth);
# samples land near the algebra's neutral element on draw zero.
morphism W_auto : In -> Out [role=latent, init=auto]
export W_auto
For more examples, see the
Examples Gallery. For a formal account of
what .qvr programs mean, see the
Denotational Semantics.