Morphisms
This page assigns denotations to the three QVR morphism strata (discrete \(\mathcal{V}\)-enriched, stochastic, and continuous) and to the structural transitions between them. Throughout, we fix an algebra \(\mathcal{V}\) as in Setting and notation. The corresponding typing judgments and inference rules for morphism expressions (\(\Gamma; \Phi \vdash e : A \rightsquigarrow B\)) live in Typing §5.
0. The unified morphism declaration
Every morphism, kernel, latent, observed, embed, and discretize binding ships through a single declaration form
morphism f : DOM -> COD [k = v, ...] [~ INIT]
with the role selected by the option block:
| role=... | Stratum | Initializer admitted | Default initializer |
|---|---|---|---|
| latent | learnable parameter on the source category | ~ Family(...) | Normal prior on raw logits |
| kernel | parameter-driven distribution-family kernel | ~ Family(...) | family-specific |
| observed | fixed-data morphism | ~ expr | required |
| embed | section of a discretization | ~ Family(...) (sample-from-cell) | family-specific |
| discretize | quotient kernel | ~ expr (partition) | uniform-quantile |
| let | deterministic morphism (alias for ~ expr) | ~ expr | required |
Other option-block keys carry per-role configuration: scale (initial parameter scale for latent and kernel), init (named initialization regime), bins (discretize), replicate=N (allocate \(N\) independently-parameterized copies under names f_0, …, f_{N-1} with a group binding \(f\)), and the axis-role keys (over, iid) consumed by the family-prior surface of §6.
The remainder of this page uses the legacy keyword form (latent, kernel, embed) when illustrating individual strata; every such snippet desugars to the unified form morphism f : … [role=KIND, …].
1. Discrete \(\mathcal{V}\)-enriched morphisms
A discrete morphism declaration
morphism f : τ₁ -> τ₂ [role=latent]
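denotes a morphism of \(\mathcal{V}\text{-}\mathbf{Rel}\), that is, a \(V\)-valued matrix indexed by the two carriers: $$ \llbracket f \rrbracket : \llbracket \tau_1 \rrbracket \times \llbracket \tau_2 \rrbracket \to V. $$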
For morphism f : τ₁ -> τ₂ [role = latent], the entries are free parameters taking values in \(V\); the realization in PyTorch is a tensor of parameters with requires_grad=True, passed entrywise through a constraint map \(\sigma : \mathbb{R} \to V\) (the sigmoid for \(\mathcal{V}_{\mathrm{pf}}\), the identity for \(\mathcal{V}_{\mathrm{T}}\), etc.).
For morphism g : τ₁ -> τ₂ [role = observed] ~ data, the entries are fixed: \(\llbracket g \rrbracket(x, y) = \mathrm{data}[x, y]\).
The implementation does not distinguish the two cases at the categorical level: both produce morphisms of \(\mathcal{V}\text{-}\mathbf{Rel}\). The latent/observed distinction is operational: it controls whether gradients flow to the entries during training.
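The latent/observed distinction can be illustrated with a minimal sketch in plain Python. It assumes the sigmoid constraint map mentioned above; the raw parameter values and the data table are hypothetical, and no gradient machinery is modelled here.

```python
import math

def sigmoid(r: float) -> float:
    """Constraint map sigma : R -> [0, 1] (the sigmoid case from the text)."""
    return 1.0 / (1.0 + math.exp(-r))

# Latent: unconstrained raw parameters (hypothetical values), mapped entrywise into V.
raw = [[0.0, 2.0], [-1.0, 0.5]]
latent = [[sigmoid(r) for r in row] for row in raw]

# Observed: entries are read directly from fixed data, [[g]](x, y) = data[x][y].
data = [[1.0, 0.0], [0.0, 1.0]]
observed = [row[:] for row in data]
```

Both `latent` and `observed` end up as tables of the same shape with entries in \([0, 1]\); only the first depends on trainable raw values.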
1.1 Composition, tensor, identity
Composition \(;\), tensor \(\boxtimes\), and identity \(1_X\) in \(\mathcal{V}\text{-}\mathbf{Rel}\) are defined as in Setting and notation §2. The DSL operators correspond directly:
| Syntax | Denotation | Definition |
|---|---|---|
| f >> g | \(\llbracket g \rrbracket \circ \llbracket f \rrbracket\) | \((f; g)(x, z) = \bigoplus_y f(x, y) \otimes g(y, z)\) |
| f @ g | \(\llbracket f \rrbracket \boxtimes \llbracket g \rrbracket\) | \((f \boxtimes g)((x_1, x_2), (y_1, y_2)) = f(x_1, y_1) \otimes g(x_2, y_2)\) |
| identity(X) | \(1_{\llbracket X \rrbracket}\) | \(1_X(x, x') = \mathbf{1}\) if \(x = x'\), \(\bot\) otherwise |
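The three definitions in the table can be sketched over nested lists. As an assumption for illustration only, the example algebra is \(([0, 1], \max, \times)\) with unit 1 and bottom 0, and entries are exact `Fraction` values so that equalities can be checked without rounding; the function names are hypothetical and are not the DSL's implementation.

```python
from fractions import Fraction
from itertools import product

# Assumed example algebra: ([0, 1], max, *), with unit 1 and bottom 0.
UNIT, BOT = Fraction(1), Fraction(0)

def join(values):
    """Join of the example algebra; the empty join is the bottom element."""
    return max(values, default=BOT)

def compose(f, g):
    """(f ; g)(x, z) = join over y of f(x, y) * g(y, z)."""
    return [[join(f[x][y] * g[y][z] for y in range(len(g)))
             for z in range(len(g[0]))] for x in range(len(f))]

def tensor(f, g):
    """(f @ g)((x1, x2), (y1, y2)) = f(x1, y1) * g(x2, y2), pairs in row-major order."""
    return [[f[x1][y1] * g[x2][y2]
             for y1, y2 in product(range(len(f[0])), range(len(g[0])))]
            for x1, x2 in product(range(len(f)), range(len(g)))]

def identity(n):
    """1_X(x, x') = unit on the diagonal, bottom elsewhere."""
    return [[UNIT if x == y else BOT for y in range(n)] for x in range(n)]

half, quarter = Fraction(1, 2), Fraction(1, 4)
f = [[half, quarter], [UNIT, BOT]]
g = [[quarter, half], [half, UNIT]]
h = [[half, UNIT], [quarter, half]]
```

With these hypothetical tables, `compose(identity(2), f)` returns `f` unchanged and the two bracketings of `f`, `g`, `h` agree, as expected for an algebra in which multiplication distributes over the join.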
Proposition (Categorical structure). Assume \(\mathcal{V}\) is a strict quantale (Setting §1): \(\otimes\) distributes over arbitrary joins \(\bigoplus\) on both sides. Then \(\mathcal{V}\text{-}\mathbf{Rel}\) is a symmetric monoidal category, with \(;\) as composition, \(1_X\) as identities, \(\boxtimes\) as monoidal product, \(\mathbf{1}\) (the singleton) as monoidal unit, and the braiding \(\sigma_{X, Y} : X \otimes Y \to Y \otimes X\) given by \(\sigma_{X, Y}((x, y), (y', x')) = \mathbf{1}\) iff \(x = x'\) and \(y = y'\), and \(\bot\) otherwise. Under the same hypothesis \(\mathcal{V}\text{-}\mathbf{Rel}\) is moreover compact closed with every object self-dual. For the lax \(\mathcal{V}_{\mathrm{pf}}\) / \(\mathcal{V}_{\mathrm{L}}\) algebras (where distributivity is sub-equational, Algebras §2.1) the same diagrams commute laxly rather than strictly; the chart parser and SVI use the lax denotation, but the equational claims of this chapter require the strict hypothesis.
Proof. Associativity of \(;\) unfolds to $$ ((f; g); h)(x, w) \;=\; \bigoplus_z \Bigl( \bigoplus_y f(x, y) \otimes g(y, z) \Bigr) \otimes h(z, w). $$ The strict-quantale distributivity law of Setting §1 moves the outer \(\otimes h(z, w)\) inside the inner join over \(y\), giving \(\bigoplus_z \bigoplus_y f(x, y) \otimes g(y, z) \otimes h(z, w)\); associativity of \(\otimes\) (a standing assumption on \(\mathcal{V}\) as a commutative monoid, Setting §1) makes the bracketing of the threefold tensor immaterial. The join's universal-colimit property collapses \(\bigoplus_z \bigoplus_y = \bigoplus_{(y, z)}\), and the two-variable joins commute by the same property (a colimit cone is determined by its components in any order). Applying distributivity once more, this time to factor \(f(x, y)\) out of the join over \(z\), re-brackets the expression as \(\bigoplus_y f(x, y) \otimes \bigl( \bigoplus_z g(y, z) \otimes h(z, w) \bigr)\), which is \((f; (g; h))(x, w)\). Without strict distributivity, the equation degrades to an inequality \(((f; g); h) \le (f; (g; h))\), so the associativity isomorphism is lax.
The identity laws use \(1_X(x, x') = \mathbf{1}\) iff \(x = x'\) and \(\bot\) otherwise to collapse one join to a single term, using \(\bot \otimes a = \bot\) (absorption) for the off-diagonal entries. Absorption requires \(\bot\) to be the bottom of the lattice, which holds in every strict quantale: it is the instance of distributivity over the empty join.
Symmetry of \(\boxtimes\) follows from commutativity of \(\otimes\) (a standing assumption on \(\mathcal{V}\), Setting §1). Compact closure is established in Expressions §2.9 under the strict-quantale hypothesis. \(\square\)
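To see concretely why the strict hypothesis matters, the following sketch composes three small tables under a noisy-OR join. It assumes, for illustration only, that \(\otimes\) is ordinary multiplication on \([0, 1]\) and that the join is \(1 - \prod (1 - a_i)\); the tables are hypothetical and exact `Fraction` arithmetic is used.

```python
from fractions import Fraction
from functools import reduce

def noisy_or(values):
    """Join 1 - prod(1 - a_i); the empty join is 0."""
    return 1 - reduce(lambda acc, a: acc * (1 - a), values, Fraction(1))

def compose_pf(f, g):
    """(f ; g)(x, z) = noisy-OR over y of f(x, y) * g(y, z)."""
    return [[noisy_or(f[x][y] * g[y][z] for y in range(len(g)))
             for z in range(len(g[0]))] for x in range(len(f))]

half = Fraction(1, 2)
f = [[half, half]]                   # 1 x 2
g = [[Fraction(1)], [Fraction(1)]]   # 2 x 1
h = [[half]]                         # 1 x 1

left = compose_pf(compose_pf(f, g), h)    # ((f ; g) ; h)
right = compose_pf(f, compose_pf(g, h))   # (f ; (g ; h))
```

Here `left` is \(3/8\) and `right` is \(7/16\): the two bracketings differ, so associativity cannot hold as an equation under this join.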
1.2 Marginalization
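For \(f : X \otimes Y \to Z\) in \(\mathcal{V}\text{-}\mathbf{Rel}\), the expression f.marginalize(X) denotes the \(\mathcal{V}\)-enriched colimit (algebra-join) of \(f\) along the \(X\)-coordinate, $$ \llbracket f.\mathrm{marginalize}(X) \rrbracket(y, z) \;=\; \bigoplus_{x \in X} f((x, y), z), $$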
a morphism \(Y \to Z\). Equivalently, it is the postcomposition with the \(\mathcal{V}\)-relation \(\top_X : X \to \mathbf{1}\) that sends every \(x\) to \(\mathbf{1}\), after the canonical reassociation \(X \otimes Y \cong Y \otimes X\).
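Joining along the \(X\)-coordinate can be sketched on a nested table `f[x][y][z]`. The join is passed in as a parameter; `max` is used here as an assumed example join, and the entries are hypothetical.

```python
from fractions import Fraction

def marginalize_first(f, join):
    """Join f[x][y][z] over x, leaving a |Y| x |Z| table."""
    n_x, n_y, n_z = len(f), len(f[0]), len(f[0][0])
    return [[join(f[x][y][z] for x in range(n_x)) for z in range(n_z)]
            for y in range(n_y)]

q, h, one = Fraction(1, 4), Fraction(1, 2), Fraction(1)
# f[x][y][z] with |X| = 2, |Y| = 2, |Z| = 1 (hypothetical entries).
f = [[[q], [h]],
     [[q], [one]]]

m = marginalize_first(f, max)   # a |Y| x |Z| table
```

The result has one row per \(y\) and one column per \(z\); the \(X\) index no longer appears.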
2. Stochastic morphisms
A kernel declaration with finite-set codomain and no ~ Family clause,
kernel kern : τ₁ -> τ₂
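denotes a morphism of \(\mathbf{Stoch}\), i.e. a row-stochastic \(|{\llbracket \tau_1 \rrbracket}| \times |{\llbracket \tau_2 \rrbracket}|\) matrix (each row \(\llbracket \mathrm{kern} \rrbracket(x, \cdot)\) is a probability distribution over \(\llbracket \tau_2 \rrbracket\)): $$ \llbracket \mathrm{kern} \rrbracket(x, y) \ge 0, \qquad \sum_{y \in \llbracket \tau_2 \rrbracket} \llbracket \mathrm{kern} \rrbracket(x, y) = 1 \quad \text{for every } x \in \llbracket \tau_1 \rrbracket. $$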
The implementation realizes \(\llbracket \mathrm{kern} \rrbracket\) via a row-wise softmax over a parameter tensor; writing \(Y = \llbracket \tau_2 \rrbracket\), this is the canonical map from raw logits in \(\mathbb{R}^{|Y|}\) onto the interior of the simplex \(\Delta^{|Y|-1}\) (softmax never outputs an exact zero).
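A row-wise softmax can be sketched in plain Python as follows. The logit values are hypothetical, and the max-subtraction is a standard numerical-stability step rather than something stated in the text.

```python
import math

def softmax(logits):
    """Map a row of raw logits to a strictly positive probability vector."""
    m = max(logits)                              # subtract the max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

raw_logits = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]]  # one row per source element
kern = [softmax(row) for row in raw_logits]
```

Each row of `kern` is strictly positive and sums to one up to floating-point rounding, so `kern` is a row-stochastic matrix.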
Composition is the Kleisli composition for \(\mathcal{G}_{\mathrm{fin}}\), i.e. ordinary matrix multiplication: $$ (f ; g)(x, z) \;=\; \sum_{y} f(x, y)\, g(y, z). $$
Note that \(\mathbf{Stoch}\) is not a sub-category of \(\mathcal{V}_{\mathrm{pf}}\text{-}\mathbf{Rel}\): their composition operations differ. \(\mathcal{V}_{\mathrm{pf}}\text{-}\mathbf{Rel}\) uses noisy-OR aggregation (the t-conorm \(\bigoplus a_i = 1 - \prod (1 - a_i)\)) while \(\mathbf{Stoch}\) uses ordinary summation, which is the correct aggregation for events partitioning the sample space. The two categories share an underlying tensor representation in \([0, 1]^{|X| \times |Y|}\), but the categorical structure is supplied by different binary operations.
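The difference between the two aggregations can be checked on a small example. This sketch composes the same pair of hypothetical row-stochastic tables once with ordinary summation and once with the noisy-OR formula quoted above, using exact `Fraction` arithmetic.

```python
from fractions import Fraction
from functools import reduce

def compose_with(f, g, aggregate):
    """(f ; g)(x, z) = aggregate over y of f(x, y) * g(y, z)."""
    return [[aggregate([f[x][y] * g[y][z] for y in range(len(g))])
             for z in range(len(g[0]))] for x in range(len(f))]

def noisy_or(values):
    """t-conorm 1 - prod(1 - a_i)."""
    return 1 - reduce(lambda acc, a: acc * (1 - a), values, Fraction(1))

half = Fraction(1, 2)
f = [[half, half], [Fraction(1), Fraction(0)]]
g = [[half, half], [half, half]]

stoch = compose_with(f, g, sum)        # Stoch-style composition
pf = compose_with(f, g, noisy_or)      # noisy-OR composition
```

Under summation every row of `stoch` still sums to one, whereas the first row of `pf` is \((7/16, 7/16)\) and sums to \(7/8\), so the noisy-OR composite of two stochastic matrices need not be stochastic.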
2.1 Distribution-family morphisms
Continuous distribution families parameterized by a finite set, declared with
kernel f : τ₁ -> σ ~ Family
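denote morphisms of \(\mathbf{Kern}\): $$ \llbracket f \rrbracket(x, \mathrm{d}s) \;=\; p_{\mathrm{Family}}\bigl(s \,;\, \theta(x)\bigr)\, \mathrm{d}s, \qquad x \in \llbracket \tau_1 \rrbracket, $$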
where \(\theta : \llbracket \tau_1 \rrbracket \to \Theta\) is the family's parameter map (typically a neural network), and \(p_{\mathrm{Family}}(\cdot \,;\, \theta)\) is the density of the family at parameter \(\theta\). The QVR-supplied family registry catalogs the pairs \((\Theta, p)\) for each name (Normal, Beta, Dirichlet, …).
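As a rough sketch of a distribution-family kernel, the following uses a Normal family with a lookup table standing in for the parameter map \(\theta\). The table, the function names, and the choice of family are assumptions made for illustration; they are not taken from the QVR family registry.

```python
import math

# Hypothetical parameter map theta : [[tau_1]] -> Theta = (mean, std);
# a lookup table stands in for a neural network.
theta = {0: (0.0, 1.0), 1: (2.0, 0.5)}

def normal_density(s, mean, std):
    """Density of Normal(mean, std) at s."""
    z = (s - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def kernel_density(x, s):
    """Density of the kernel at source element x and point s: p_Normal(s; theta(x))."""
    mean, std = theta[x]
    return normal_density(s, mean, std)
```

For each fixed `x`, `kernel_density(x, .)` is a probability density on the real line, which is the sense in which a finite source indexes a family of continuous distributions.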
3. Continuous morphisms
A kernel declaration whose source is itself a space, e.g.
kernel g : σ₁ -> σ₂ ~ Family
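denotes a Markov kernel between standard Borel spaces: $$ \llbracket g \rrbracket : \llbracket \sigma_1 \rrbracket \times \mathcal{B}(\llbracket \sigma_2 \rrbracket) \to [0, 1], $$ where \(\mathcal{B}\) denotes the Borel \(\sigma\)-algebra, \(\llbracket g \rrbracket(s, \cdot)\) is a probability measure for each \(s\), and \(\llbracket g \rrbracket(\cdot, C)\) is measurable for each \(C\).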
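Composition is the Chapman–Kolmogorov integral: for \(g : \sigma_1 \to \sigma_2\), \(h : \sigma_2 \to \sigma_3\), and measurable \(C \subseteq \llbracket \sigma_3 \rrbracket\), $$ (g ; h)(s, C) \;=\; \int_{\llbracket \sigma_2 \rrbracket} h(t, C)\, g(s, \mathrm{d}t), $$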
realized numerically by Monte-Carlo or sampled-composition approximation in the implementation.
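A sampled-composition approximation can be sketched with two hypothetical Gaussian kernels, \(g(s, \cdot) = \mathrm{Normal}(s, 1)\) and \(h(t, \cdot) = \mathrm{Normal}(t, 1)\). Drawing from \(g\) and feeding the draw to \(h\) yields a sample from the composite, which for this particular choice is \(\mathrm{Normal}(s, \sqrt{2})\) with variance 2. The kernels, sample size, and seed are assumptions for illustration.

```python
import random

def g(s, rng):
    """Sample from the kernel g(s, .) = Normal(s, 1)."""
    return rng.gauss(s, 1.0)

def h(t, rng):
    """Sample from the kernel h(t, .) = Normal(t, 1)."""
    return rng.gauss(t, 1.0)

def sample_composite(s, rng):
    """Draw t ~ g(s, .), then u ~ h(t, .): one sample from (g ; h)(s, .)."""
    return h(g(s, rng), rng)

rng = random.Random(0)
draws = [sample_composite(3.0, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
```

The empirical `mean` and `var` should land close to 3 and 2 respectively, matching the closed-form composite for this Gaussian pair.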
4. Tensor product across strata
The @ combinator extends to all three strata via the canonical monoidal structures:
| Strata of \(f\) and \(g\) | Ambient category of \(f \otimes g\) |
|---|---|
| Both discrete | \(\mathcal{V}\text{-}\mathbf{Rel}\) |
| Both stochastic | \(\mathbf{Stoch}\) |
| Both continuous | \(\mathbf{Kern}\) |
| Mixed | The smallest enclosing category, via the canonical embeddings \(\mathcal{V}_{\mathbb{B}}\text{-}\mathbf{Rel}_{\mathrm{fun}} \hookrightarrow \mathbf{Stoch} \hookrightarrow \mathbf{Kern}\) of Setting §4 |
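The denotation is the parallel product of kernels, defined on measurable rectangles by $$ (f \otimes g)\bigl((x_1, x_2),\, A_1 \times A_2\bigr) \;=\; f(x_1, A_1) \cdot g(x_2, A_2), $$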
extended uniquely (by the standard product-measure construction) to the product \(\sigma\)-algebra.
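In the finite case the rectangle formula can be checked directly. This sketch builds the parallel product of two hypothetical row-stochastic tables and compares the mass it assigns to a rectangle \(A_1 \times A_2\) with the product of the two factor masses; pairs are flattened in row-major order, which is an indexing convention chosen here.

```python
from fractions import Fraction
from itertools import product

def parallel(f, g):
    """(f (x) g)((x1, x2), (y1, y2)) = f(x1, y1) * g(x2, y2), row-major pairs."""
    return [[f[x1][y1] * g[x2][y2]
             for y1, y2 in product(range(len(f[0])), range(len(g[0])))]
            for x1, x2 in product(range(len(f)), range(len(g)))]

def mass(row, subset):
    """Measure of a finite subset under one row of a kernel."""
    return sum(row[y] for y in subset)

f = [[Fraction(1, 4), Fraction(3, 4)]]
g = [[Fraction(1, 3), Fraction(2, 3)], [Fraction(1), Fraction(0)]]
fg = parallel(f, g)

A1, A2 = [1], [0, 1]
rectangle = [y1 * len(g[0]) + y2 for y1 in A1 for y2 in A2]  # flattened A1 x A2
lhs = mass(fg[0], rectangle)              # source pair (x1, x2) = (0, 0)
rhs = mass(f[0], A1) * mass(g[0], A2)
```

Both `lhs` and `rhs` equal \(3/4\), and every row of `fg` still sums to one.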
4a. Replicated kernels and option blocks
4a.1 Replication
A kernel f[n] : A -> B [~ Family ...] declaration (note the bracketed integer after the kernel name) introduces \(n\) independently-parameterized kernels sharing one declared signature, registered in the environment under synthesized names \(f\_0, f\_1, \dots, f\_{n-1}\) along with a group binding \(f \mapsto (f\_0, \dots, f\_{n-1})\). Denotationally, $$ \llbracket f\_i \rrbracket : \llbracket A \rrbracket \to \llbracket B \rrbracket, \qquad i = 0, \dots, n-1, $$
each a separate morphism in the appropriate stratum with its own learnable parameters. The bare identifier \(f\) is not itself a morphism; it is a group reference accepted by expressions that admit splicing (notably \(\mathsf{fan}\), Expressions §3.1), where it expands to the comma-separated list of its members. Outside a splice site, referring to \(f\) alone is a compile-time error.
The same form is admitted on embed declarations with the same group semantics.
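The naming scheme and group binding can be mimicked with ordinary dictionaries. Everything in this sketch (the `replicate` helper, the `make_kernel` callback, the parameter layout) is hypothetical and only illustrates that members are separate objects while the group name maps to the ordered tuple of member names.

```python
def replicate(name, n, make_kernel):
    """Return (members, group): n separately built kernels plus a group binding."""
    members = {f"{name}_{i}": make_kernel(i) for i in range(n)}
    group = {name: tuple(members)}   # member names, in declaration order
    return members, group

# Each copy gets its own parameter list, so updates do not leak between members.
members, group = replicate("f", 3, lambda i: {"params": [0.0, 0.0]})
members["f_0"]["params"][0] = 1.0
```

After the update, only `f_0` has changed, and `group["f"]` lists `f_0`, `f_1`, `f_2` in order, which is the list a splice site would expand to.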
4a.2 Option blocks
A morphism / kernel / discretize declaration may carry a bracketed option block
[k = v, ...]
after the codomain, listing family- or boundary-specific keyword overrides. The option block is denotationally inert at the categorical level: its sole effect is to override default values of the family's parameter map (bins for discretize, distribution-specific clipping bounds for kernels, etc.). Two declarations with identical signatures and different option blocks denote different morphisms whose tensors are computed from differently-configured parameter networks but are otherwise of the same categorical shape.
5. Stratum transitions
The discretize and embed declarations witness the canonical functors between strata.
5.1 Discretization
discretize d : σ -> n
denotes the quotient kernel induced by a measurable partition of \(\llbracket \sigma \rrbracket\) into \(n\) Borel cells \(B_0, \dots, B_{n-1}\): $$ \llbracket d \rrbracket(s, \{i\}) \;=\; \begin{cases} 1 & \text{if } s \in B_i, \\ 0 & \text{otherwise,} \end{cases} $$
i.e. the deterministic kernel sending each \(s\) to the (unique) cell containing it. As a morphism of \(\mathbf{Kern}\) it factors through \(\iota(\{0, \dots, n-1\})\). The choice of partition is supplied by the family annotation (e.g. uniform quantiles, learned thresholds).
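For a one-dimensional space, a partition into interval cells can be sketched with sorted thresholds and a binary search. The thresholds below are hypothetical, and treating each cell as closed on the left is a convention chosen for this example.

```python
from bisect import bisect_right

def make_discretizer(thresholds):
    """Cells: (-inf, t_0), [t_0, t_1), ..., [t_last, +inf); returns the cell index."""
    def d(s):
        return bisect_right(thresholds, s)
    return d

def kernel_row(d, s, n):
    """The deterministic kernel as a one-hot row over the n cells."""
    return [1 if i == d(s) else 0 for i in range(n)]

d = make_discretizer([0.0, 1.0, 2.0])   # three thresholds give n = 4 cells
row = kernel_row(d, 1.5, 4)
```

Each point lands in exactly one cell, so every row of the induced kernel is a point mass.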
5.2 Embedding
embed e : n -> σ
denotes a section of a discretization: a kernel \(e : \iota(\{0, \dots, n-1\}) \to \llbracket \sigma \rrbracket\) such that the composite \(e ; d = \mathrm{id}\). In the implementation this is typically realized by sampling from a per-cell embedding family \(p_{\mathrm{Family}}(\cdot \,;\, \theta_i)\).
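The pair \((d, e)\) is a section-retraction pair in the categorical sense (\(e\) is a section of \(d\), and \(d\) a retraction of \(e\)); it is not generally an isomorphism, because the other composite \(d ; e\) differs from the identity (information about the within-cell distribution is lost by \(d\)).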
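The section property can be checked by sampling. In this sketch the cells are hypothetical half-open intervals and the per-cell family is taken to be uniform on the cell, which is only one possible choice of embedding family.

```python
import random
from bisect import bisect_right

edges = [0.0, 1.0, 2.0, 3.0]   # hypothetical cells B_i = [edges[i], edges[i + 1])

def d(s):
    """Discretize: index of the cell containing s."""
    return bisect_right(edges, s) - 1

def e(i, rng):
    """Embed: sample uniformly within cell B_i."""
    lo, hi = edges[i], edges[i + 1]
    return lo + (hi - lo) * rng.random()

rng = random.Random(0)
round_trip = [d(e(i, rng)) for i in range(3) for _ in range(100)]

s = 0.25
back = e(d(s), rng)   # lands in the same cell, but is generally not s itself
```

Embedding and then discretizing returns the original cell index every time, whereas discretizing and then embedding only recovers the cell, not the point.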
6. Axis-role specifications
Each registered family carries a declared event rank \(r_F \in \mathbb{N}\).
| Event rank | Family names | Event shape |
|---|---|---|
| 0 | Normal, LogitNormal, Beta, TruncatedNormal, Uniform, Bernoulli, RelaxedBernoulli, Binomial, Categorical, OneHotCategorical, GeneralizedPareto, HalfCauchy, HalfNormal, LogNormal, Exponential, Gamma | \(\mathbb{R}\) (scalar) |
| 1 | MultivariateNormal, LowRankMVN, Dirichlet, LogisticNormal, RelaxedOneHotCategorical, GP, Horseshoe | \(\mathbb{R}^{d}\) for a single named axis |
| 2 | Wishart, LKJCholesky | \(\mathbb{R}^{d_1 \times d_2}\) for two named axes |
Every family in the table is installed in the unified family catalog by _register_family (directly for the bespoke families, via _make_family for the auto-generated wrappers around torch.distributions). Each is therefore equally usable as a conditional morphism ([role=kernel] / [role=latent] with a ~ Family(args) initializer) and as an inline draw site (sample x <- Family(args)). The parameter map, support, and log_prob semantics are uniform across the two call paths.
A distribution clause ~ F(args) over <axes> [iid over <axes>] configures the event–batch decomposition of an \(F\)-valued draw. Concretely, for a morphism \(f : A \to B\) whose representing tensor has shape \(\prod_{i} d_i\) indexed by the named factors \(\{a_1, \dots, a_m\}\) of \(A\) and \(\{b_1, \dots, b_n\}\) of \(B\), the clause names a sub-multiset \(E \subseteq \{a_i\} \cup \{b_j\}\) of cardinality \(|E| = r_F\) and declares:
an iid product over the batch axes \((\{a_i\} \cup \{b_j\}) \setminus E\) of an \(F\)-distributed draw on the event axes \(E\). Categorically, the iid axes are the product structure of the kernel's domain, while the event axes carry the family's joint (possibly correlated) distribution.
Arity contract. The axis-role clause is well-typed only when \(|E| = r_F\). This preserves a critical categorical distinction: a flat \(\mathrm{MVN}_{d_1 d_2}\) with dense covariance over \(\mathbb{R}^{d_1 d_2}\) (event rank 1, single named axis whose dim equals \(d_1 \cdot d_2\)) is a different morphism from a \(\mathrm{MatrixNormal}(d_1, d_2)\) with Kronecker covariance \(V \otimes U\) over \(\mathbb{R}^{d_1 \times d_2}\) (event rank 2, two named axes); no auto-substitution between them occurs.
Positional binding. For families with two distinguishable event axes (row and column for MatrixNormal; the two correlation indices for LKJCholesky), the ordering of over (e_1, e_2) corresponds positionally to the family's declared event-axis ordering.
Naturality. Refactoring a morphism's dom or cod (e.g. \(B \mapsto B_1 \otimes B_2\)) invalidates the axis references at type-check time rather than silently rebinding; this is the price of the surface preserving categorical structure under refactoring.
The dom and cod shortcuts are legal in \(E\) only when the corresponding side of the morphism is a single unfactored object; for a product-typed side, every factor must be named explicitly, since silently flattening a categorical product into an opaque single axis would erase the product structure.
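The arity contract can be sketched as a small check that splits a morphism's named axes into event and batch axes. The dictionary below copies a few event ranks from the table in this section; the helper names and the axis labels are hypothetical and do not reflect the actual type checker.

```python
# Excerpt of the event-rank table; the dictionary itself is a hypothetical stand-in.
EVENT_RANK = {"Normal": 0, "Dirichlet": 1, "MultivariateNormal": 1, "LKJCholesky": 2}

def split_axes(family, event_axes, all_axes):
    """Return the batch (iid) axes, or raise if the contract |E| = r_F fails."""
    rank = EVENT_RANK[family]
    if len(event_axes) != rank:
        raise TypeError(f"{family} has event rank {rank}, "
                        f"got {len(event_axes)} event axes")
    missing = [a for a in event_axes if a not in all_axes]
    if missing:
        raise TypeError(f"unknown axes: {missing}")
    return [a for a in all_axes if a not in event_axes]

def well_typed(family, event_axes, all_axes):
    """True when the axis-role clause satisfies the arity contract."""
    try:
        split_axes(family, event_axes, all_axes)
        return True
    except TypeError:
        return False

batch = split_axes("Dirichlet", ["b1"], ["a1", "b1"])
```

Here a Dirichlet draw over `b1` leaves `a1` as the single batch axis, while naming one event axis for a rank-2 family, or an axis that does not exist, is rejected rather than silently rebound.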
7. Latent morphism priors
The discrete morphism declaration of §1 extends with a parameter prior via the ~ Family(args) [axis_role_clause] clause:
morphism f : τ₁ -> τ₂ [role=latent] ~ Family(args) [over <axes> [iid over <axes>]]
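The clause declares that the representing tensor of \(f\) is itself a random variable drawn from the named family at the requested axis-role configuration. Concretely, writing \(W_f\) for the representing tensor, the denotation desugars to the composite $$ \mathrm{Family}(\mathrm{args}) \;\xrightarrow{\ \mathrm{rsample}\ }\; W_f \;\xrightarrow{\ \text{wrap as morphism}\ }\; \llbracket f \rrbracket \in \mathcal{V}\text{-}\mathbf{Rel}(\tau_1, \tau_2), $$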
where rsample is the reparameterized sample of the family at its declared event/batch shape, and "wrap as morphism" is the canonical identification of a tensor of shape \(|\tau_1| \times |\tau_2|\) (or the equivalent under the axis specification) with a morphism \(\tau_1 \to \tau_2\) in \(\mathcal{V}\text{-}\mathbf{Rel}\).
Independence default. Without a prior clause, the morphism's representing tensor is a free parameter (a point in \(V^{|\tau_1| \times |\tau_2|}\)); the latent / observed distinction of §1 controls only whether the gradient flows. With a prior clause, \(f\) becomes a sample site whose value is drawn afresh at each forward pass; the inference layer integrates \(f\) as if it were an explicit <- step at the top of every enclosing program.
Composition with marginalize. When the prior is on a \(\mathcal{V}\text{-}\mathbf{Rel}\) morphism and the surrounding program's marginalize block integrates over a related coordinate, the marginalization composes with the prior in the standard way: \(\mathrm{marg}(\nu) = \pi_* \nu\) where \(\nu\) is the joint over \((W_f, \mathrm{rest})\) and \(\pi\) projects away \(W_f\) if the prior is the integration target, or leaves \(W_f\) in scope otherwise.
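The "draw, then wrap" reading of a prior clause can be sketched in plain Python. This example assumes a scalar Normal prior on raw values, a reparameterized draw of the form mean + std × noise, and a sigmoid constraint map into \([0, 1]\) as the wrapping step; all three choices, and the function names, are assumptions made for illustration.

```python
import math
import random

def rsample_normal(mean, std, shape, rng):
    """Reparameterized draw: W = mean + std * eps with eps ~ Normal(0, 1)."""
    rows, cols = shape
    return [[mean + std * rng.gauss(0.0, 1.0) for _ in range(cols)]
            for _ in range(rows)]

def wrap_as_morphism(raw):
    """Assumed constraint map (sigmoid) taking raw draws entrywise into [0, 1]."""
    return [[1.0 / (1.0 + math.exp(-r)) for r in row] for row in raw]

rng = random.Random(0)
# Two forward passes: the representing tensor is drawn afresh each time.
w1 = wrap_as_morphism(rsample_normal(0.0, 1.0, (2, 3), rng))
w2 = wrap_as_morphism(rsample_normal(0.0, 1.0, (2, 3), rng))
```

Both `w1` and `w2` are 2 × 3 tables with entries strictly between 0 and 1, and they differ because each forward pass makes a new draw.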
8. Equivariance and naturality
The constructions above are functorial in the chosen algebra (for the discrete stratum, Algebras §4) and in the choice of family registry (for the stochastic and continuous strata). In particular, any uniform substitution of one family for another that preserves the parameter-map signature induces a natural transformation of denotations; this is the formal underpinning of the family-registry lookup tables in quivers.continuous.families and quivers.stochastic.families.