Structural Compression¶
This page gives a denotational semantics for the four declarations that constitute the structural compression layer of QVR: signature, encoder, decoder, and loss. The user guide is Structural compression; this page is the formal companion.
1. Signatures¶
A signature declaration
signature Σ [Π] {
sorts { s_1 : k_1 [dim d_1] [vocab V_1], ... }
constructors { c_1 : s_{i_1,1}, ..., s_{i_1,n_1} -> t_1, ... }
binders { β_1 : binds (x_1 : v_1 : a_1 : A_1, ...) in (b_1 : B_1, ...) -> r_1, ... }
vertex_kinds { ... }
edge_kinds { ... }
}
denotes a generalized algebraic theory \(\mathrm{Th}_{\Sigma}\) together with its initial algebra \(T_{\Sigma}\).
1.1 Sorts¶
The sort declarations partition the sorts into three kinds:
sort_kind |
Role | Carrier in \(T_{\Sigma}\) |
|---|---|---|
object |
Principal sort, recursively compressed | A set of well-typed terms |
data |
Opaque atoms (vocab { … } may close the vocabulary) |
A finite set of leaf tokens |
index |
De Bruijn position into a binder context | \(\mathbb{N}\) |
Let \(S = S_{\mathrm{obj}} \sqcup S_{\mathrm{dat}} \sqcup S_{\mathrm{idx}}\) be the disjoint union. The dim d annotation attaches a target dimension \(d_s \in \mathbb{N}\) to a sort; encoders / decoders may override it.
1.2 Contexts and terms¶
A syntactic de Bruijn context is a finite list
where each \((v_i, a_i)\) is a pair of a variable sort \(v_i \in S\) and an annotation term \(a_i\) in some declared annotation sort. We write \(\Gamma\langle i\rangle = (v_i, a_i)\) for the \(i\)-th entry (0-indexed; \(i = 0\) is the most recently bound / innermost variable, matching the runtime Context convention). At the carrier level, the encoder threads alongside \(\Gamma\) an embedding context \(\varepsilon = (\varepsilon_0, \dots, \varepsilon_{k-1})\) with \(\varepsilon_i \in \mathbb{R}^{d_{v_i}}\); the embedding \(\varepsilon_i\) is the value minted by the encoder's var_init rule at the binding site (§2.2).
The closed terms of sort \(s\) in context \(\Gamma\), written \(\Gamma \vdash_\Sigma t : s\), are generated by the rules:
- Data leaf. If \(s \in S_{\mathrm{dat}}\) and \(d\) is a vocabulary element of \(s\) (or any value if \(s\) has no inline vocab), then \(\Gamma \vdash_\Sigma \mathrm{Data}_s(d) : s\).
- Bound variable. If \(\Gamma\langle i\rangle = (s, a)\) then \(\Gamma \vdash_\Sigma \mathrm{BoundVar}(i) : s\).
- Constructor. For each constructor \(c : s_1, \dots, s_n \to t \in K\), if \(\Gamma \vdash_\Sigma u_i : s_i\) for \(i = 1, \dots, n\), then \(\Gamma \vdash_\Sigma c(u_1, \dots, u_n) : t\).
- Binder. For each binder \(\beta : \mathrm{binds}\,(x_1 : v_1 : a_1 : A_1, \dots, x_p : v_p : a_p : A_p)\,\mathrm{in}\,(b_1 : B_1, \dots, b_q : B_q) \to r \in B\), if \(\Gamma \vdash_\Sigma \alpha_j : A_j\) for the annotations and \(\Gamma' \vdash_\Sigma \beta_j : B_j\) for the scoped arguments in the extended context \(\Gamma' = \Gamma \cdot (v_1, \alpha_1) \cdot \cdots \cdot (v_p, \alpha_p)\), then \(\Gamma \vdash_\Sigma \beta[\alpha_1, \dots, \alpha_p\,;\,\beta_1, \dots, \beta_q] : r\).
The initial \(\Sigma\)-algebra \(T_\Sigma\) is the sort-indexed family
The reserved tags \(\mathrm{Data}\) and \(\mathrm{BoundVar}\) are not user-declarable; they are the framework-supplied leaf shapes.
1.3 Graph signatures¶
The graph-shaped fragment replaces (or augments) the inductive signature with declarations
vertex_kinds { K_1 : k_1 [dim d_1], ... }
edge_kinds { ε_1 : K_a -> K_b, ε_2 : K_c -- K_d, ... }
A vertex kind is the type of a vertex; an edge kind is a typed binary relation \(\varepsilon \subseteq K_{\mathrm{src}} \times K_{\mathrm{tgt}}\), with -> directed and -- undirected. The corresponding initial structure is the category \(\mathbf{Graph}_\Sigma\) of finite typed graphs over the kind signature: objects of a fixed labeled multi-relational shape with payload data on each vertex. Encoders interpret graph signatures via message-passing rather than structural recursion (§2.4).
1.4 Parametric signatures¶
The optional [Π] parameter list introduces sort-level parameters \(\pi_1, \dots, \pi_m\) that may appear inside sort, constructor, or binder declarations of the signature body. A parametric signature denotes a family of theories indexed by substitutions \(\sigma\) assigning each \(\pi_i\) to a concrete sort declared in the surrounding module; each \(\sigma\) produces a concrete \(\mathrm{Th}_{\Sigma(\sigma)}\) with its own initial algebra \(T_{\Sigma(\sigma)}\).
2. Encoders¶
An encoder C over Σ declaration
encoder C over Σ [σ_args] {
dim s = d_s , ...
[iterations n]
[readout |-> body_R]
op_1(args_1) [recurrent state | attention prefix] |-> body_1
...
init κ(arg) |-> body_init_κ # graph
message[ε](s,t) |-> body_msg_ε # graph
update[K](v,m) |-> body_upd_K # graph
var_init v_i [from A_i [as ty]] |-> body_var_i
}
or the factory form encoder C over Σ using F [options], denotes a \(\Sigma\)-algebra homomorphism from the initial algebra into a fixed-dimension vector carrier algebra.
2.1 The carrier algebra¶
The carrier is the sort-indexed product of Euclidean spaces
with the dimension \(d_s\) taken from the encoder's dim s = d_s clauses (defaulting to the signature's inline dim annotation when present, raising a compile-time error otherwise). Index sorts have no embedding; their semantics is purely positional.
The carrier is equipped with one operation per constructor and binder of \(\Sigma\), each interpreted as a measurable map between the appropriate sort dimensions:
- \(\widehat c : \mathbb{R}^{d_{s_1}} \times \cdots \times \mathbb{R}^{d_{s_n}} \to \mathbb{R}^{d_t}\) for each constructor \(c : s_1, \dots, s_n \to t\), interpreted by the let-expression \(\mathrm{body}_c\);
- \(\widehat \beta : \mathbb{R}^{d_{A_1}} \times \cdots \times \mathbb{R}^{d_{A_p}} \times \mathbb{R}^{d_{B_1}} \times \cdots \times \mathbb{R}^{d_{B_q}} \to \mathbb{R}^{d_r}\) for each binder, interpreted by \(\mathrm{body}_\beta\).
The variable-initializer declarations
var_init v from A as ty |-> body
specify, for each pair \((v, A)\) of a variable sort and an annotation sort the signature's binders introduce, a map
minting the bound variable's embedding from the embedding of its annotation. Unannotated binders default to a learnable nullary constant per variable sort.
2.2 Denotation: structural recursion¶
The encoder denotes the unique \(\Sigma\)-algebra homomorphism
obtained by initial-algebra recursion threaded with an embedding context. For each sorted term \(\Gamma \vdash_\Sigma t : s\) and each embedding context \(\varepsilon\) of length \(|\Gamma|\) with \(\varepsilon_i \in \mathbb{R}^{d_{v_i}}\), the denotation \(\llbracket C \rrbracket(t \mid \Gamma, \varepsilon) \in \mathbb{R}^{d_s}\) is defined by structural recursion on \(t\):
In the binder case the extended contexts \((\Gamma', \varepsilon')\) are
so the scoped arguments \(\beta_k\) are encoded with the bound variables' freshly-minted embeddings in scope. The closed-term denotation is the empty-context evaluation \(\llbracket C \rrbracket(t) = \llbracket C \rrbracket(t \mid \varnothing, \varnothing)\).
The map \(\mathrm{embed}_s : \mathrm{Vocab}_s \to \mathbb{R}^{d_s}\) is supplied at encoder construction. The compiler scaffolds a default for each data sort: a per-key learnable parameter dict that lazily allocates a fresh \(\mathbb{R}^{d_s}\) vector the first time each leaf value is seen, which suffices for both closed and open vocabularies.
This is the standard catamorphism (initial-algebra fold) of an \(F\)-algebra (Lambek's lemma); the existence and uniqueness of \(\llbracket C \rrbracket\) are by induction on the term structure.
2.3 Sequence-shaped sugar¶
The recurrent state and attention prefix modes on a constructor rule are syntactic sugar that fix the recursion shape on a Cons-like binary constructor whose second argument is the recursive child.
- Recurrent. The rule \(\mathrm{Cons}(h, t)\,\mathrm{recurrent}\,\mathrm{state} \mapsto \mathrm{body}(h,\,\mathrm{state})\) denotes the right-fold of the constructor body over the list:
$$ \llbracket C \rrbracket(\mathrm{Cons}(h_1, \mathrm{Cons}(h_2, \dots, \mathrm{Cons}(h_n, \mathrm{Nil})))) \;=\; \mathrm{body}\bigl(h_1,\,\mathrm{body}(h_2,\, \dots\,\mathrm{body}(h_n,\,\llbracket C \rrbracket(\mathrm{Nil}))\dots)\bigr), $$
with state bound at each step to the recursive child's already-computed embedding. The outermost body call sees \(h_1\) (the head); the innermost sees \(h_n\) together with the encoded \(\mathrm{Nil}\).
- Attention. The rule \(\mathrm{Cons}(h, t)\,\mathrm{attention}\,\mathrm{prefix} \mapsto \mathrm{body}(h, \mathrm{prefix})\) denotes the outside-in walk
$$ \llbracket C \rrbracket(\mathrm{Cons}(h_1, \mathrm{Cons}(h_2, \dots))) \;=\; \mathrm{body}\bigl(h_k,\, [\llbracket C \rrbracket(h_1),\, \dots,\, \llbracket C \rrbracket(h_{k-1})]\bigr), $$
where prefix is the running list of prior non-recursive children's embeddings collected outside-in. The encoder's final embedding is the innermost step's output.
Both modes are derivable from the plain F-algebra rule by a choice of evaluation strategy; the sugar fixes which strategy the runtime uses.
2.4 Graph encoders¶
For a graph signature, the encoder iterates message-passing for \(T = \mathrm{iterations}\) rounds:
Undirected edges (--) emit messages in both directions; directed edges (->) only in the declared direction. The denotation is invariant under graph isomorphism and equivariant under the permutation action on vertices (Battaglia et al., 2018, §4); the readout aggregates the final vertex-state field into a graph-level vector.
2.5 Factory form¶
encoder C over Σ using F [options] denotes the encoder produced by the registry entry \(F\) in quivers.structural.shapes applied to the signature and the keyword overrides. Each registry entry \(F\) is a builder function (Σ, **options) ↦ Encoder returning an Encoder directly. Both the explicit form and the factory form denote a \(\Sigma\)-algebra homomorphism \(T_\Sigma \to \mathcal{A}_C\); the two forms share the same denotation when the factory's builder produces a per-op function family equivalent to the explicit body.
3. Decoders¶
A decoder D over Σ depth d declaration denotes, for each object sort \(s \in S_{\mathrm{obj}}\) and each syntactic de Bruijn context \(\Gamma\) (§1.2), a Markov kernel
producing for each carrier vector a probability distribution over closed signature terms of sort \(s\) in context \(\Gamma\). The kernel is defined by structural corecursion on the term shape: it is the unfold of an \(F_\Sigma\)-coalgebra in the Kleisli category of the Giry monad, where \(F_\Sigma\) is the polynomial endofunctor
associated to the signature \(\Sigma\) (data leaves and bound variables are the constant summands). The depth bound \(d\) caps the corecursion: every term in the support of \(\llbracket D \rrbracket_s(v, \Gamma)\) has tree depth \(\le d\) and consequently inhabits the (finite) initial algebra \(T_\Sigma\).
3.1 The four parametric heads¶
A decoder body declares four per-sort logits-producing maps (each is a let-expression in the carrier vector):
| Head | Signature | Role |
|---|---|---|
| \(\mathrm{structure}_s(v)\) | $\mathbb{R}^{d_s} \to \Delta^{ | K_s |
| \(\mathrm{primitive}_s(v)\) | $\mathbb{R}^{d_s} \to \Delta^{ | \mathrm{Vocab}_s |
| \(\mathrm{factor}_s(v)\) | \(\mathbb{R}^{d_s} \to \prod_i \mathbb{R}^{d_{s_i}}\) | Factorisation of the parent embedding into per-child embeddings; the per-constructor arity / dim sequence is selected by the chosen production |
| \(\mathrm{binder\_select}_s(v)\) | $\mathbb{R}^{d_s} \to \Delta^{ | \Gamma_s |
The required body |-> recursive clause selects the framework's structural corecursion as the body shape; recursive is the sole legal body keyword.
3.2 Denotation: structural corecursion¶
For carrier vector \(v \in \mathbb{R}^{d_s}\), context \(\Gamma\), and depth budget \(d\), the kernel \(\llbracket D \rrbracket(v, \Gamma, d)\) samples a term \(t \in T_\Sigma(\Gamma, s)\) by the procedure:
- If \(s \in S_{\mathrm{dat}}\), sample \(d^\star \sim \mathrm{primitive}(v)\) and return \(\mathrm{Data}_s(d^\star)\).
- Otherwise, sample a choice \(k \sim \mathrm{structure}(v)\):
- If \(k = \mathrm{BoundVar}\) and there is an in-scope variable of sort \(s\), sample \(i \sim \mathrm{binder\_select}(v)\) over the matching positions of \(\Gamma\) and return \(\mathrm{BoundVar}(i)\).
- If \(k\) is a constructor \(c : s_1, \dots, s_n \to s\), compute \((v_1, \dots, v_n) = \mathrm{factor}(v)\) and (recursively at depth \(d - 1\)) sample \(u_i \sim \llbracket D \rrbracket(v_i, \Gamma, d - 1)\) for each \(i\); return \(c(u_1, \dots, u_n)\).
- If \(k\) is a binder \(\beta\), compute the annotation embeddings via \(\mathrm{factor}\), sample the annotation terms, extend the context, then sample the scoped children in the extended context. Return \(\beta[\bar\alpha; \bar \beta]\).
- When \(d = 0\), restrict the support of \(\mathrm{structure}(v)\) to the set of terminating productions: constructors / binders whose every child sort is in \(S_{\mathrm{dat}} \cup S_{\mathrm{idx}}\), together with \(\mathrm{BoundVar}\) (always terminating, available when \(\Gamma_s\) is non-empty). If no terminating production exists at the current sort, the runtime raises a typed error.
This is a depth-bounded corecursion in the Kleisli category of \(\mathcal{G}\) over the initial algebra \(T_\Sigma\). The associated score kernel
is the log-probability of producing the observed term \(t\) under the same factorized distribution; it is computed by the recursion above with every "sample" replaced by "score against the observed branch."
3.3 Round-trip with the encoder¶
For a paired encoder / decoder over the same signature, the composite
is the reconstruction kernel. The reconstruction loss (§4.2 below) trains \(C\) and \(D\) jointly so that \(\llbracket D \rrbracket(\llbracket C \rrbracket(t))\) concentrates near \(\delta_t\). There is no general round-trip law: the pair \((C, D)\) does not form a section / retraction in \(\mathbf{Kern}\) unless the encoder is invertible, which finite-dim encoders generically are not.
4. Losses¶
A loss L weight w on K T { body } declaration denotes a scalar functional on a training site.
4.1 Attachment sites¶
The optional on clause selects an attachment site; absent, the loss is global. The attachment kinds and their associated trace types are:
| Attachment | Site | Trace passed to body |
|---|---|---|
global |
once per training step | the global environment \(\eta\) |
program P |
each invocation of program \(P\) | the program's sample / score trace \(\tau_P\) |
deduction Δ |
each call to deduction \(\Delta\) | the chart \(\alpha_\Delta : I \to K\) |
encoder C |
each call to \(C\) | the pair \((t, \llbracket C \rrbracket(t))\) |
decoder D |
each call to \(D\) | the pair \((v, t)\) with \(t \sim \llbracket D \rrbracket(v)\) |
rule R in Δ |
every application of \(R\) during \(\Delta\)'s chart build | the premise / conclusion items at the firing |
chart of Δ |
once on \(\Delta\)'s completed chart | the chart \(\alpha_\Delta\) |
4.2 Denotation¶
For a loss declaration \(L\) with weight \(w\) (a let-expression in the trace), attachment kind \(\kappa\), target \(T\), and body \(\mathrm{body}_L\), the denotation is the functional
The total loss of a module is the formal sum over every fire of every declared loss in a training step:
Loss bodies are arbitrary let-expressions over the trace; canonical shapes include -D(C(input)).log_prob(input) (reconstruction), cross_entropy(p, q) (consistency), -chart.goal_weight() (parsing likelihood), and bce_with_logits(pred, target) (supervised prediction).
4.3 Composition with the inference layer¶
The total loss is the objective passed to the optimizer by the inference driver. It is not part of the kernel-denotation of any program: a program P : τ_in → τ_out denotes a Markov kernel, and losses attached to \(P\) are functionals on its trace, not on its denotation. See Programs §6 for the role of conditioning and the inference layer.
5. References¶
- Lambek, J. (1968). A fixpoint theorem for complete categories. Mathematische Zeitschrift 103(2), 151–161.
- Cartmell, J. (1986). Generalised algebraic theories and contextual categories. Annals of Pure and Applied Logic 32, 209–243.
- Battaglia, P. W. et al. (2018). Relational inductive biases, deep learning, and graph networks. arXiv:1806.01261.
- Hutton, G. (1999). A tutorial on the universality and expressiveness of fold. Journal of Functional Programming 9(4), 355–372.