Structural Compression

This page gives a denotational semantics for the four declarations that constitute the structural compression layer of QVR: signature, encoder, decoder, and loss. The user guide is Structural compression; this page is the formal companion.

1. Signatures

A signature declaration

signature Σ [Π] {
    sorts        { s_1 : k_1 [dim d_1] [vocab V_1], ... }
    constructors { c_1 : s_{i_1,1}, ..., s_{i_1,n_1} -> t_1, ... }
    binders      { β_1 : binds (x_1 : v_1 : a_1 : A_1, ...) in (b_1 : B_1, ...) -> r_1, ... }
    vertex_kinds { ... }
    edge_kinds   { ... }
}

denotes a generalized algebraic theory \(\mathrm{Th}_{\Sigma}\) together with its initial algebra \(T_{\Sigma}\).

1.1 Sorts

The sort declarations partition the sorts into three kinds:

sort_kind Role Carrier in \(T_{\Sigma}\)
object Principal sort, recursively compressed A set of well-typed terms
data Opaque atoms (vocab { … } may close the vocabulary) A finite set of leaf tokens
index De Bruijn position into a binder context \(\mathbb{N}\)

Let \(S = S_{\mathrm{obj}} \sqcup S_{\mathrm{dat}} \sqcup S_{\mathrm{idx}}\) be the disjoint union. The dim d annotation attaches a target dimension \(d_s \in \mathbb{N}\) to a sort; encoders / decoders may override it.

1.2 Contexts and terms

A syntactic de Bruijn context is a finite list

\[ \Gamma \;=\; \bigl((v_1, a_1), \dots, (v_k, a_k)\bigr), \]

where each \((v_i, a_i)\) is a pair of a variable sort \(v_i \in S\) and an annotation term \(a_i\) in some declared annotation sort. We write \(\Gamma\langle i\rangle = (v_i, a_i)\) for the \(i\)-th entry (0-indexed; \(i = 0\) is the most recently bound / innermost variable, matching the runtime Context convention). At the carrier level, the encoder threads alongside \(\Gamma\) an embedding context \(\varepsilon = (\varepsilon_0, \dots, \varepsilon_{k-1})\) with \(\varepsilon_i \in \mathbb{R}^{d_{v_i}}\); the embedding \(\varepsilon_i\) is the value minted by the encoder's var_init rule at the binding site (§2.2).

The closed terms of sort \(s\) in context \(\Gamma\), written \(\Gamma \vdash_\Sigma t : s\), are generated by the rules:

  1. Data leaf. If \(s \in S_{\mathrm{dat}}\) and \(d\) is a vocabulary element of \(s\) (or any value if \(s\) has no inline vocab), then \(\Gamma \vdash_\Sigma \mathrm{Data}_s(d) : s\).
  2. Bound variable. If \(\Gamma\langle i\rangle = (s, a)\) then \(\Gamma \vdash_\Sigma \mathrm{BoundVar}(i) : s\).
  3. Constructor. For each constructor \(c : s_1, \dots, s_n \to t \in K\), if \(\Gamma \vdash_\Sigma u_i : s_i\) for \(i = 1, \dots, n\), then \(\Gamma \vdash_\Sigma c(u_1, \dots, u_n) : t\).
  4. Binder. For each binder \(\beta : \mathrm{binds}\,(x_1 : v_1 : a_1 : A_1, \dots, x_p : v_p : a_p : A_p)\,\mathrm{in}\,(b_1 : B_1, \dots, b_q : B_q) \to r \in B\), if \(\Gamma \vdash_\Sigma \alpha_j : A_j\) for the annotations and \(\Gamma' \vdash_\Sigma \beta_j : B_j\) for the scoped arguments in the extended context \(\Gamma' = \Gamma \cdot (v_1, \alpha_1) \cdot \cdots \cdot (v_p, \alpha_p)\), then \(\Gamma \vdash_\Sigma \beta[\alpha_1, \dots, \alpha_p\,;\,\beta_1, \dots, \beta_q] : r\).

The initial \(\Sigma\)-algebra \(T_\Sigma\) is the sort-indexed family

\[ T_\Sigma \;=\; \bigl(T_\Sigma(\Gamma, s)\bigr)_{\Gamma,\, s}, \qquad T_\Sigma(\Gamma, s) = \{\, t \mid \Gamma \vdash_\Sigma t : s \,\}. \]

The reserved tags \(\mathrm{Data}\) and \(\mathrm{BoundVar}\) are not user-declarable; they are the framework-supplied leaf shapes.

1.3 Graph signatures

The graph-shaped fragment replaces (or augments) the inductive signature with declarations

vertex_kinds { K_1 : k_1 [dim d_1], ... }
edge_kinds   { ε_1 : K_a -> K_b, ε_2 : K_c -- K_d, ... }

A vertex kind is the type of a vertex; an edge kind is a typed binary relation \(\varepsilon \subseteq K_{\mathrm{src}} \times K_{\mathrm{tgt}}\), with -> directed and -- undirected. The corresponding initial structure is the category \(\mathbf{Graph}_\Sigma\) of finite typed graphs over the kind signature: objects of a fixed labeled multi-relational shape with payload data on each vertex. Encoders interpret graph signatures via message-passing rather than structural recursion (§2.4).

1.4 Parametric signatures

The optional [Π] parameter list introduces sort-level parameters \(\pi_1, \dots, \pi_m\) that may appear inside sort, constructor, or binder declarations of the signature body. A parametric signature denotes a family of theories indexed by substitutions \(\sigma\) assigning each \(\pi_i\) to a concrete sort declared in the surrounding module; each \(\sigma\) produces a concrete \(\mathrm{Th}_{\Sigma(\sigma)}\) with its own initial algebra \(T_{\Sigma(\sigma)}\).

2. Encoders

An encoder C over Σ declaration

encoder C over Σ [σ_args] {
    dim s = d_s , ...
    [iterations n]
    [readout |-> body_R]
    op_1(args_1)   [recurrent state | attention prefix] |-> body_1
    ...
    init κ(arg)    |-> body_init_κ                          # graph
    message[ε](s,t) |-> body_msg_ε                          # graph
    update[K](v,m)  |-> body_upd_K                          # graph
    var_init v_i [from A_i [as ty]] |-> body_var_i
}

or the factory form encoder C over Σ using F [options], denotes a \(\Sigma\)-algebra homomorphism from the initial algebra into a fixed-dimension vector carrier algebra.

2.1 The carrier algebra

The carrier is the sort-indexed product of Euclidean spaces

\[ \mathcal{A}_C \;=\; \bigl(\mathbb{R}^{d_s}\bigr)_{s \in S_{\mathrm{obj}} \cup S_{\mathrm{dat}}}, \]

with the dimension \(d_s\) taken from the encoder's dim s = d_s clauses (defaulting to the signature's inline dim annotation when present, raising a compile-time error otherwise). Index sorts have no embedding; their semantics is purely positional.

The carrier is equipped with one operation per constructor and binder of \(\Sigma\), each interpreted as a measurable map between the appropriate sort dimensions:

  • \(\widehat c : \mathbb{R}^{d_{s_1}} \times \cdots \times \mathbb{R}^{d_{s_n}} \to \mathbb{R}^{d_t}\) for each constructor \(c : s_1, \dots, s_n \to t\), interpreted by the let-expression \(\mathrm{body}_c\);
  • \(\widehat \beta : \mathbb{R}^{d_{A_1}} \times \cdots \times \mathbb{R}^{d_{A_p}} \times \mathbb{R}^{d_{B_1}} \times \cdots \times \mathbb{R}^{d_{B_q}} \to \mathbb{R}^{d_r}\) for each binder, interpreted by \(\mathrm{body}_\beta\).

The variable-initializer declarations

var_init v from A as ty |-> body

specify, for each pair \((v, A)\) of a variable sort and an annotation sort the signature's binders introduce, a map

\[ \mathrm{init}_{v, A} : \mathbb{R}^{d_A} \to \mathbb{R}^{d_v}, \]

minting the bound variable's embedding from the embedding of its annotation. Unannotated binders default to a learnable nullary constant per variable sort.

2.2 Denotation: structural recursion

The encoder denotes the unique \(\Sigma\)-algebra homomorphism

\[ \llbracket C \rrbracket \;:\; T_\Sigma \;\longrightarrow\; \mathcal{A}_C \]

obtained by initial-algebra recursion threaded with an embedding context. For each sorted term \(\Gamma \vdash_\Sigma t : s\) and each embedding context \(\varepsilon\) of length \(|\Gamma|\) with \(\varepsilon_i \in \mathbb{R}^{d_{v_i}}\), the denotation \(\llbracket C \rrbracket(t \mid \Gamma, \varepsilon) \in \mathbb{R}^{d_s}\) is defined by structural recursion on \(t\):

\[ \begin{array}{rcl} \llbracket C \rrbracket(\mathrm{Data}_s(d) \mid \Gamma, \varepsilon) & = & \mathrm{embed}_s(d), \\[2pt] \llbracket C \rrbracket(\mathrm{BoundVar}(i) \mid \Gamma, \varepsilon) & = & \varepsilon_i, \\[2pt] \llbracket C \rrbracket(c(u_1, \dots, u_n) \mid \Gamma, \varepsilon) & = & \widehat c\bigl(\llbracket C \rrbracket(u_1 \mid \Gamma, \varepsilon),\, \dots,\, \llbracket C \rrbracket(u_n \mid \Gamma, \varepsilon)\bigr), \\[2pt] \llbracket C \rrbracket\bigl(\beta[\alpha_1, \dots; \beta_1, \dots] \mid \Gamma, \varepsilon\bigr) & = & \widehat \beta\Bigl(\llbracket C \rrbracket(\alpha_j \mid \Gamma, \varepsilon)_j,\; \llbracket C \rrbracket(\beta_k \mid \Gamma', \varepsilon')_k\Bigr). \end{array} \]

In the binder case the extended contexts \((\Gamma', \varepsilon')\) are

\[ \Gamma' \;=\; \Gamma \cdot (v_1, \alpha_1) \cdot \dots \cdot (v_p, \alpha_p), \qquad \varepsilon' \;=\; \varepsilon \cdot \mathrm{init}_{v_1, A_1}\bigl(\llbracket C \rrbracket(\alpha_1 \mid \Gamma, \varepsilon)\bigr) \cdot \dots \cdot \mathrm{init}_{v_p, A_p}\bigl(\llbracket C \rrbracket(\alpha_p \mid \Gamma, \varepsilon)\bigr), \]

so the scoped arguments \(\beta_k\) are encoded with the bound variables' freshly-minted embeddings in scope. The closed-term denotation is the empty-context evaluation \(\llbracket C \rrbracket(t) = \llbracket C \rrbracket(t \mid \varnothing, \varnothing)\).

The map \(\mathrm{embed}_s : \mathrm{Vocab}_s \to \mathbb{R}^{d_s}\) is supplied at encoder construction. The compiler scaffolds a default for each data sort: a per-key learnable parameter dict that lazily allocates a fresh \(\mathbb{R}^{d_s}\) vector the first time each leaf value is seen, which suffices for both closed and open vocabularies.

This is the standard catamorphism (initial-algebra fold) of an \(F\)-algebra (Lambek's lemma); the existence and uniqueness of \(\llbracket C \rrbracket\) are by induction on the term structure.

2.3 Sequence-shaped sugar

The recurrent state and attention prefix modes on a constructor rule are syntactic sugar that fix the recursion shape on a Cons-like binary constructor whose second argument is the recursive child.

  • Recurrent. The rule \(\mathrm{Cons}(h, t)\,\mathrm{recurrent}\,\mathrm{state} \mapsto \mathrm{body}(h,\,\mathrm{state})\) denotes the right-fold of the constructor body over the list:

$$ \llbracket C \rrbracket(\mathrm{Cons}(h_1, \mathrm{Cons}(h_2, \dots, \mathrm{Cons}(h_n, \mathrm{Nil})))) \;=\; \mathrm{body}\bigl(h_1,\,\mathrm{body}(h_2,\, \dots\,\mathrm{body}(h_n,\,\llbracket C \rrbracket(\mathrm{Nil}))\dots)\bigr), $$

with state bound at each step to the recursive child's already-computed embedding. The outermost body call sees \(h_1\) (the head); the innermost sees \(h_n\) together with the encoded \(\mathrm{Nil}\).

  • Attention. The rule \(\mathrm{Cons}(h, t)\,\mathrm{attention}\,\mathrm{prefix} \mapsto \mathrm{body}(h, \mathrm{prefix})\) denotes the outside-in walk

$$ \llbracket C \rrbracket(\mathrm{Cons}(h_1, \mathrm{Cons}(h_2, \dots))) \;=\; \mathrm{body}\bigl(h_k,\, [\llbracket C \rrbracket(h_1),\, \dots,\, \llbracket C \rrbracket(h_{k-1})]\bigr), $$

where prefix is the running list of prior non-recursive children's embeddings collected outside-in. The encoder's final embedding is the innermost step's output.

Both modes are derivable from the plain F-algebra rule by a choice of evaluation strategy; the sugar fixes which strategy the runtime uses.

2.4 Graph encoders

For a graph signature, the encoder iterates message-passing for \(T = \mathrm{iterations}\) rounds:

\[ \begin{array}{rcll} v_\kappa^{(0)} & = & \mathrm{init}_\kappa(\mathrm{payload}_\kappa) & \text{per vertex of kind } \kappa, \\[2pt] m_\varepsilon^{(t)} & = & \mathrm{message}_\varepsilon\bigl(v_{\mathrm{src}}^{(t)},\, v_{\mathrm{tgt}}^{(t)}\bigr) & \text{per edge of kind } \varepsilon, \\[2pt] v_\kappa^{(t+1)} & = & \mathrm{update}_\kappa\bigl(v_\kappa^{(t)},\, \mathrm{agg}\{m_\varepsilon^{(t)} : \varepsilon \ni v_\kappa\}\bigr) & \\[2pt] \llbracket C \rrbracket(G) & = & \mathrm{readout}(v_\ast^{(T)}). & \end{array} \]

Undirected edges (--) emit messages in both directions; directed edges (->) only in the declared direction. The denotation is invariant under graph isomorphism and equivariant under the permutation action on vertices (Battaglia et al., 2018, §4); the readout aggregates the final vertex-state field into a graph-level vector.

2.5 Factory form

encoder C over Σ using F [options] denotes the encoder produced by the registry entry \(F\) in quivers.structural.shapes applied to the signature and the keyword overrides. Each registry entry \(F\) is a builder function (Σ, **options) ↦ Encoder returning an Encoder directly. Both the explicit form and the factory form denote a \(\Sigma\)-algebra homomorphism \(T_\Sigma \to \mathcal{A}_C\); the two forms share the same denotation when the factory's builder produces a per-op function family equivalent to the explicit body.

3. Decoders

A decoder D over Σ depth d declaration denotes, for each object sort \(s \in S_{\mathrm{obj}}\) and each syntactic de Bruijn context \(\Gamma\) (§1.2), a Markov kernel

\[ \llbracket D \rrbracket_{s, \Gamma} \;:\; \mathbb{R}^{d_s} \;\longrightarrow\; \mathcal{G}\bigl(T_\Sigma(\Gamma, s)\bigr), \]

producing for each carrier vector a probability distribution over closed signature terms of sort \(s\) in context \(\Gamma\). The kernel is defined by structural corecursion on the term shape: it is the unfold of an \(F_\Sigma\)-coalgebra in the Kleisli category of the Giry monad, where \(F_\Sigma\) is the polynomial endofunctor

\[ F_\Sigma(X)(s) \;=\; \coprod_{\substack{c\,:\,s_1, \dots, s_n \to s\\ \text{constructors of } s}} \prod_{i=1}^n X(s_i)\;\;\sqcup\;\; \coprod_{\substack{\beta \to s\\ \text{binders of } s}} (\text{annot} \times \text{scoped})\;\;\sqcup\;\;\mathrm{BoundVar}_s, \]

associated to the signature \(\Sigma\) (data leaves and bound variables are the constant summands). The depth bound \(d\) caps the corecursion: every term in the support of \(\llbracket D \rrbracket_s(v, \Gamma)\) has tree depth \(\le d\) and consequently inhabits the (finite) initial algebra \(T_\Sigma\).

3.1 The four parametric heads

A decoder body declares four per-sort logits-producing maps (each is a let-expression in the carrier vector):

Head Signature Role
\(\mathrm{structure}_s(v)\) $\mathbb{R}^{d_s} \to \Delta^{ K_s
\(\mathrm{primitive}_s(v)\) $\mathbb{R}^{d_s} \to \Delta^{ \mathrm{Vocab}_s
\(\mathrm{factor}_s(v)\) \(\mathbb{R}^{d_s} \to \prod_i \mathbb{R}^{d_{s_i}}\) Factorisation of the parent embedding into per-child embeddings; the per-constructor arity / dim sequence is selected by the chosen production
\(\mathrm{binder\_select}_s(v)\) $\mathbb{R}^{d_s} \to \Delta^{ \Gamma_s

The required body |-> recursive clause selects the framework's structural corecursion as the body shape; recursive is the sole legal body keyword.

3.2 Denotation: structural corecursion

For carrier vector \(v \in \mathbb{R}^{d_s}\), context \(\Gamma\), and depth budget \(d\), the kernel \(\llbracket D \rrbracket(v, \Gamma, d)\) samples a term \(t \in T_\Sigma(\Gamma, s)\) by the procedure:

  1. If \(s \in S_{\mathrm{dat}}\), sample \(d^\star \sim \mathrm{primitive}(v)\) and return \(\mathrm{Data}_s(d^\star)\).
  2. Otherwise, sample a choice \(k \sim \mathrm{structure}(v)\):
  3. If \(k = \mathrm{BoundVar}\) and there is an in-scope variable of sort \(s\), sample \(i \sim \mathrm{binder\_select}(v)\) over the matching positions of \(\Gamma\) and return \(\mathrm{BoundVar}(i)\).
  4. If \(k\) is a constructor \(c : s_1, \dots, s_n \to s\), compute \((v_1, \dots, v_n) = \mathrm{factor}(v)\) and (recursively at depth \(d - 1\)) sample \(u_i \sim \llbracket D \rrbracket(v_i, \Gamma, d - 1)\) for each \(i\); return \(c(u_1, \dots, u_n)\).
  5. If \(k\) is a binder \(\beta\), compute the annotation embeddings via \(\mathrm{factor}\), sample the annotation terms, extend the context, then sample the scoped children in the extended context. Return \(\beta[\bar\alpha; \bar \beta]\).
  6. When \(d = 0\), restrict the support of \(\mathrm{structure}(v)\) to the set of terminating productions: constructors / binders whose every child sort is in \(S_{\mathrm{dat}} \cup S_{\mathrm{idx}}\), together with \(\mathrm{BoundVar}\) (always terminating, available when \(\Gamma_s\) is non-empty). If no terminating production exists at the current sort, the runtime raises a typed error.

This is a depth-bounded corecursion in the Kleisli category of \(\mathcal{G}\) over the initial algebra \(T_\Sigma\). The associated score kernel

\[ \mathrm{score}_D \;:\; T_\Sigma \times \mathbb{R}^{d_s} \times \mathrm{Ctx} \to \mathbb{R}_{\le 0}, \qquad \mathrm{score}_D(t, v, \Gamma) \;=\; \log P\bigl[t \mid \llbracket D \rrbracket(v, \Gamma)\bigr], \]

is the log-probability of producing the observed term \(t\) under the same factorized distribution; it is computed by the recursion above with every "sample" replaced by "score against the observed branch."

3.3 Round-trip with the encoder

For a paired encoder / decoder over the same signature, the composite

\[ \llbracket D \rrbracket \circ \llbracket C \rrbracket \;:\; T_\Sigma \to \mathcal{G}(T_\Sigma) \]

is the reconstruction kernel. The reconstruction loss (§4.2 below) trains \(C\) and \(D\) jointly so that \(\llbracket D \rrbracket(\llbracket C \rrbracket(t))\) concentrates near \(\delta_t\). There is no general round-trip law: the pair \((C, D)\) does not form a section / retraction in \(\mathbf{Kern}\) unless the encoder is invertible, which finite-dim encoders generically are not.

4. Losses

A loss L weight w on K T { body } declaration denotes a scalar functional on a training site.

4.1 Attachment sites

The optional on clause selects an attachment site; absent, the loss is global. The attachment kinds and their associated trace types are:

Attachment Site Trace passed to body
global once per training step the global environment \(\eta\)
program P each invocation of program \(P\) the program's sample / score trace \(\tau_P\)
deduction Δ each call to deduction \(\Delta\) the chart \(\alpha_\Delta : I \to K\)
encoder C each call to \(C\) the pair \((t, \llbracket C \rrbracket(t))\)
decoder D each call to \(D\) the pair \((v, t)\) with \(t \sim \llbracket D \rrbracket(v)\)
rule R in Δ every application of \(R\) during \(\Delta\)'s chart build the premise / conclusion items at the firing
chart of Δ once on \(\Delta\)'s completed chart the chart \(\alpha_\Delta\)

4.2 Denotation

For a loss declaration \(L\) with weight \(w\) (a let-expression in the trace), attachment kind \(\kappa\), target \(T\), and body \(\mathrm{body}_L\), the denotation is the functional

\[ \llbracket L \rrbracket \;:\; \mathrm{Trace}_{\kappa, T} \to \mathbb{R}, \qquad \llbracket L \rrbracket(\tau) \;=\; \llbracket w \rrbracket(\tau)\, \cdot\, \llbracket \mathrm{body}_L \rrbracket(\tau). \]

The total loss of a module is the formal sum over every fire of every declared loss in a training step:

\[ \llbracket \mathrm{Total} \rrbracket(\eta) \;=\; \sum_{L,\,\tau \in \mathrm{Fires}(L,\,\eta)} \llbracket L \rrbracket(\tau). \]

Loss bodies are arbitrary let-expressions over the trace; canonical shapes include -D(C(input)).log_prob(input) (reconstruction), cross_entropy(p, q) (consistency), -chart.goal_weight() (parsing likelihood), and bce_with_logits(pred, target) (supervised prediction).

4.3 Composition with the inference layer

The total loss is the objective passed to the optimizer by the inference driver. It is not part of the kernel-denotation of any program: a program P : τ_in → τ_out denotes a Markov kernel, and losses attached to \(P\) are functionals on its trace, not on its denotation. See Programs §6 for the role of conditioning and the inference layer.

5. References