Structural Compression¶

This page gives a denotational semantics for the four declarations that constitute the structural compression layer of QVR: signature, encoder, decoder, and loss. The user guide is Structural compression; this page is the formal companion.

1. Signatures¶

A signature declaration

signature Σ [Π] {
    sorts        { s_1 : k_1 [dim d_1] [vocab V_1], ... }
    constructors { c_1 : s_{i_1,1}, ..., s_{i_1,n_1} -> t_1, ... }
    binders      { β_1 : binds (x_1 : v_1 : a_1 : A_1, ...) in (b_1 : B_1, ...) -> r_1, ... }
    vertex_kinds { ... }
    edge_kinds   { ... }
}

denotes a generalized algebraic theory $\mathrm{Th}_{\Sigma}$ together with its initial algebra $T_{\Sigma}$.

1.1 Sorts¶

The sort declarations partition the sorts into three kinds:

`sort_kind`	Role	Carrier in $T_{\Sigma}$
`object`	Principal sort, recursively compressed	A set of well-typed terms
`data`	Opaque atoms (`vocab { … }` may close the vocabulary)	A finite set of leaf tokens
`index`	De Bruijn position into a binder context	$\mathbb{N}$

Let $S = S_{\mathrm{obj}} \sqcup S_{\mathrm{dat}} \sqcup S_{\mathrm{idx}}$ be the disjoint union. The dim d annotation attaches a target dimension $d_s \in \mathbb{N}$ to a sort; encoders / decoders may override it.

1.2 Contexts and terms¶

A syntactic de Bruijn context is a finite list

\[ \Gamma \;=\; \bigl((v_1, a_1), \dots, (v_k, a_k)\bigr), \]

where each $(v_i, a_i)$ is a pair of a variable sort $v_i \in S$ and an annotation term $a_i$ in some declared annotation sort. We write $\Gamma\langle i\rangle = (v_i, a_i)$ for the $i$-th entry (0-indexed; $i = 0$ is the most recently bound / innermost variable, matching the runtime Context convention). At the carrier level, the encoder threads alongside $\Gamma$ an embedding context $\varepsilon = (\varepsilon_0, \dots, \varepsilon_{k-1})$ with $\varepsilon_i \in \mathbb{R}^{d_{v_i}}$; the embedding $\varepsilon_i$ is the value minted by the encoder's var_init rule at the binding site (§2.2).

The closed terms of sort $s$ in context $\Gamma$, written $\Gamma \vdash_\Sigma t : s$, are generated by the rules:

Data leaf. If $s \in S_{\mathrm{dat}}$ and $d$ is a vocabulary element of $s$ (or any value if $s$ has no inline vocab), then $\Gamma \vdash_\Sigma \mathrm{Data}_s(d) : s$.
Bound variable. If $\Gamma\langle i\rangle = (s, a)$ then $\Gamma \vdash_\Sigma \mathrm{BoundVar}(i) : s$.
Constructor. For each constructor $c : s_1, \dots, s_n \to t \in K$, if $\Gamma \vdash_\Sigma u_i : s_i$ for $i = 1, \dots, n$, then $\Gamma \vdash_\Sigma c(u_1, \dots, u_n) : t$.
Binder. For each binder $\beta : \mathrm{binds}\,(x_1 : v_1 : a_1 : A_1, \dots, x_p : v_p : a_p : A_p)\,\mathrm{in}\,(b_1 : B_1, \dots, b_q : B_q) \to r \in B$, if $\Gamma \vdash_\Sigma \alpha_j : A_j$ for the annotations and $\Gamma' \vdash_\Sigma \beta_j : B_j$ for the scoped arguments in the extended context $\Gamma' = \Gamma \cdot (v_1, \alpha_1) \cdot \cdots \cdot (v_p, \alpha_p)$, then $\Gamma \vdash_\Sigma \beta[\alpha_1, \dots, \alpha_p\,;\,\beta_1, \dots, \beta_q] : r$.

The initial $\Sigma$-algebra $T_\Sigma$ is the sort-indexed family

\[ T_\Sigma \;=\; \bigl(T_\Sigma(\Gamma, s)\bigr)_{\Gamma,\, s}, \qquad T_\Sigma(\Gamma, s) = \{\, t \mid \Gamma \vdash_\Sigma t : s \,\}. \]

The reserved tags $\mathrm{Data}$ and $\mathrm{BoundVar}$ are not user-declarable; they are the framework-supplied leaf shapes.

1.3 Graph signatures¶

The graph-shaped fragment replaces (or augments) the inductive signature with declarations

vertex_kinds { K_1 : k_1 [dim d_1], ... }
edge_kinds   { ε_1 : K_a -> K_b, ε_2 : K_c -- K_d, ... }

A vertex kind is the type of a vertex; an edge kind is a typed binary relation $\varepsilon \subseteq K_{\mathrm{src}} \times K_{\mathrm{tgt}}$, with -> directed and -- undirected. The corresponding initial structure is the category $\mathbf{Graph}_\Sigma$ of finite typed graphs over the kind signature: objects of a fixed labeled multi-relational shape with payload data on each vertex. Encoders interpret graph signatures via message-passing rather than structural recursion (§2.4).

1.4 Parametric signatures¶

The optional [Π] parameter list introduces sort-level parameters $\pi_1, \dots, \pi_m$ that may appear inside sort, constructor, or binder declarations of the signature body. A parametric signature denotes a family of theories indexed by substitutions $\sigma$ assigning each $\pi_i$ to a concrete sort declared in the surrounding module; each $\sigma$ produces a concrete $\mathrm{Th}_{\Sigma(\sigma)}$ with its own initial algebra $T_{\Sigma(\sigma)}$.

2. Encoders¶

An encoder C over Σ declaration

encoder C over Σ [σ_args] {
    dim s = d_s , ...
    [iterations n]
    [readout |-> body_R]
    op_1(args_1)   [recurrent state | attention prefix] |-> body_1
    ...
    init κ(arg)    |-> body_init_κ                          # graph
    message[ε](s,t) |-> body_msg_ε                          # graph
    update[K](v,m)  |-> body_upd_K                          # graph
    var_init v_i [from A_i [as ty]] |-> body_var_i
}

or the factory form encoder C over Σ using F [options], denotes a $\Sigma$-algebra homomorphism from the initial algebra into a fixed-dimension vector carrier algebra.

2.1 The carrier algebra¶

The carrier is the sort-indexed product of Euclidean spaces

\[ \mathcal{A}_C \;=\; \bigl(\mathbb{R}^{d_s}\bigr)_{s \in S_{\mathrm{obj}} \cup S_{\mathrm{dat}}}, \]

with the dimension $d_s$ taken from the encoder's dim s = d_s clauses (defaulting to the signature's inline dim annotation when present, raising a compile-time error otherwise). Index sorts have no embedding; their semantics is purely positional.

The carrier is equipped with one operation per constructor and binder of $\Sigma$, each interpreted as a measurable map between the appropriate sort dimensions:

$\widehat c : \mathbb{R}^{d_{s_1}} \times \cdots \times \mathbb{R}^{d_{s_n}} \to \mathbb{R}^{d_t}$ for each constructor $c : s_1, \dots, s_n \to t$, interpreted by the let-expression $\mathrm{body}_c$;
$\widehat \beta : \mathbb{R}^{d_{A_1}} \times \cdots \times \mathbb{R}^{d_{A_p}} \times \mathbb{R}^{d_{B_1}} \times \cdots \times \mathbb{R}^{d_{B_q}} \to \mathbb{R}^{d_r}$ for each binder, interpreted by $\mathrm{body}_\beta$.

The variable-initializer declarations

var_init v from A as ty |-> body

specify, for each pair $(v, A)$ of a variable sort and an annotation sort the signature's binders introduce, a map

\[ \mathrm{init}_{v, A} : \mathbb{R}^{d_A} \to \mathbb{R}^{d_v}, \]

minting the bound variable's embedding from the embedding of its annotation. Unannotated binders default to a learnable nullary constant per variable sort.

2.2 Denotation: structural recursion¶

The encoder denotes the unique $\Sigma$-algebra homomorphism

\[ \llbracket C \rrbracket \;:\; T_\Sigma \;\longrightarrow\; \mathcal{A}_C \]

obtained by initial-algebra recursion threaded with an embedding context. For each sorted term $\Gamma \vdash_\Sigma t : s$ and each embedding context $\varepsilon$ of length $|\Gamma|$ with $\varepsilon_i \in \mathbb{R}^{d_{v_i}}$, the denotation $\llbracket C \rrbracket(t \mid \Gamma, \varepsilon) \in \mathbb{R}^{d_s}$ is defined by structural recursion on $t$:

\[ \begin{array}{rcl} \llbracket C \rrbracket(\mathrm{Data}_s(d) \mid \Gamma, \varepsilon) & = & \mathrm{embed}_s(d), \\[2pt] \llbracket C \rrbracket(\mathrm{BoundVar}(i) \mid \Gamma, \varepsilon) & = & \varepsilon_i, \\[2pt] \llbracket C \rrbracket(c(u_1, \dots, u_n) \mid \Gamma, \varepsilon) & = & \widehat c\bigl(\llbracket C \rrbracket(u_1 \mid \Gamma, \varepsilon),\, \dots,\, \llbracket C \rrbracket(u_n \mid \Gamma, \varepsilon)\bigr), \\[2pt] \llbracket C \rrbracket\bigl(\beta[\alpha_1, \dots; \beta_1, \dots] \mid \Gamma, \varepsilon\bigr) & = & \widehat \beta\Bigl(\llbracket C \rrbracket(\alpha_j \mid \Gamma, \varepsilon)_j,\; \llbracket C \rrbracket(\beta_k \mid \Gamma', \varepsilon')_k\Bigr). \end{array} \]

In the binder case the extended contexts $(\Gamma', \varepsilon')$ are

\[ \Gamma' \;=\; \Gamma \cdot (v_1, \alpha_1) \cdot \dots \cdot (v_p, \alpha_p), \qquad \varepsilon' \;=\; \varepsilon \cdot \mathrm{init}_{v_1, A_1}\bigl(\llbracket C \rrbracket(\alpha_1 \mid \Gamma, \varepsilon)\bigr) \cdot \dots \cdot \mathrm{init}_{v_p, A_p}\bigl(\llbracket C \rrbracket(\alpha_p \mid \Gamma, \varepsilon)\bigr), \]

so the scoped arguments $\beta_k$ are encoded with the bound variables' freshly-minted embeddings in scope. The closed-term denotation is the empty-context evaluation $\llbracket C \rrbracket(t) = \llbracket C \rrbracket(t \mid \varnothing, \varnothing)$.

The map $\mathrm{embed}_s : \mathrm{Vocab}_s \to \mathbb{R}^{d_s}$ is supplied at encoder construction. The compiler scaffolds a default for each data sort: a per-key learnable parameter dict that lazily allocates a fresh $\mathbb{R}^{d_s}$ vector the first time each leaf value is seen, which suffices for both closed and open vocabularies.

This is the standard catamorphism (initial-algebra fold) of an $F$-algebra (Lambek's lemma); the existence and uniqueness of $\llbracket C \rrbracket$ are by induction on the term structure.

2.3 Sequence-shaped sugar¶

The recurrent state and attention prefix modes on a constructor rule are syntactic sugar that fix the recursion shape on a Cons-like binary constructor whose second argument is the recursive child.

Recurrent. The rule $\mathrm{Cons}(h, t)\,\mathrm{recurrent}\,\mathrm{state} \mapsto \mathrm{body}(h,\,\mathrm{state})$ denotes the right-fold of the constructor body over the list:

$$ \llbracket C \rrbracket(\mathrm{Cons}(h_1, \mathrm{Cons}(h_2, \dots, \mathrm{Cons}(h_n, \mathrm{Nil})))) \;=\; \mathrm{body}\bigl(h_1,\,\mathrm{body}(h_2,\, \dots\,\mathrm{body}(h_n,\,\llbracket C \rrbracket(\mathrm{Nil}))\dots)\bigr), $$

with state bound at each step to the recursive child's already-computed embedding. The outermost body call sees $h_1$ (the head); the innermost sees $h_n$ together with the encoded $\mathrm{Nil}$.

Attention. The rule $\mathrm{Cons}(h, t)\,\mathrm{attention}\,\mathrm{prefix} \mapsto \mathrm{body}(h, \mathrm{prefix})$ denotes the outside-in walk

$$ \llbracket C \rrbracket(\mathrm{Cons}(h_1, \mathrm{Cons}(h_2, \dots))) \;=\; \mathrm{body}\bigl(h_k,\, [\llbracket C \rrbracket(h_1),\, \dots,\, \llbracket C \rrbracket(h_{k-1})]\bigr), $$

where prefix is the running list of prior non-recursive children's embeddings collected outside-in. The encoder's final embedding is the innermost step's output.

Both modes are derivable from the plain F-algebra rule by a choice of evaluation strategy; the sugar fixes which strategy the runtime uses.

2.4 Graph encoders¶

For a graph signature, the encoder iterates message-passing for $T = \mathrm{iterations}$ rounds:

\[ \begin{array}{rcll} v_\kappa^{(0)} & = & \mathrm{init}_\kappa(\mathrm{payload}_\kappa) & \text{per vertex of kind } \kappa, \\[2pt] m_\varepsilon^{(t)} & = & \mathrm{message}_\varepsilon\bigl(v_{\mathrm{src}}^{(t)},\, v_{\mathrm{tgt}}^{(t)}\bigr) & \text{per edge of kind } \varepsilon, \\[2pt] v_\kappa^{(t+1)} & = & \mathrm{update}_\kappa\bigl(v_\kappa^{(t)},\, \mathrm{agg}\{m_\varepsilon^{(t)} : \varepsilon \ni v_\kappa\}\bigr) & \\[2pt] \llbracket C \rrbracket(G) & = & \mathrm{readout}(v_\ast^{(T)}). & \end{array} \]

Undirected edges (--) emit messages in both directions; directed edges (->) only in the declared direction. The denotation is invariant under graph isomorphism and equivariant under the permutation action on vertices (Battaglia et al., 2018, §4); the readout aggregates the final vertex-state field into a graph-level vector.

2.5 Factory form¶

encoder C over Σ using F [options] denotes the encoder produced by the registry entry $F$ in quivers.structural.shapes applied to the signature and the keyword overrides. Each registry entry $F$ is a builder function (Σ, **options) ↦ Encoder returning an Encoder directly. Both the explicit form and the factory form denote a $\Sigma$-algebra homomorphism $T_\Sigma \to \mathcal{A}_C$; the two forms share the same denotation when the factory's builder produces a per-op function family equivalent to the explicit body.

3. Decoders¶

A decoder D : Σ [depth=d] declaration denotes, for each object sort $s \in S_{\mathrm{obj}}$ and each syntactic de Bruijn context $\Gamma$ (§1.2), a Markov kernel

\[ \llbracket D \rrbracket_{s, \Gamma} \;:\; \mathbb{R}^{d_s} \;\longrightarrow\; \mathcal{G}\bigl(T_\Sigma(\Gamma, s)\bigr), \]

producing for each carrier vector a probability distribution over closed signature terms of sort $s$ in context $\Gamma$. The kernel is defined by structural corecursion on the term shape: it is the unfold of an $F_\Sigma$-coalgebra in the Kleisli category of the Giry monad, where $F_\Sigma$ is the polynomial endofunctor

\[ F_\Sigma(X)(s) \;=\; \coprod_{\substack{c\,:\,s_1, \dots, s_n \to s\\ \text{constructors of } s}} \prod_{i=1}^n X(s_i)\;\;\sqcup\;\; \coprod_{\substack{\beta \to s\\ \text{binders of } s}} (\text{annot} \times \text{scoped})\;\;\sqcup\;\;\mathrm{BoundVar}_s, \]

associated to the signature $\Sigma$ (data leaves and bound variables are the constant summands). The depth bound $d$ caps the corecursion: every term in the support of $\llbracket D \rrbracket_s(v, \Gamma)$ has tree depth $\le d$ and consequently inhabits the (finite) initial algebra $T_\Sigma$.

3.1 The four parametric heads¶

A decoder body declares four per-sort logits-producing maps (each is a let-expression in the carrier vector):

Head	Signature	Role
$\mathrm{structure}_s(v)$	$\mathbb{R}^{d_s} \to \Delta^{	K_s
$\mathrm{primitive}_s(v)$	$\mathbb{R}^{d_s} \to \Delta^{	\mathrm{Vocab}_s
$\mathrm{factor}_s(v)$	$\mathbb{R}^{d_s} \to \prod_i \mathbb{R}^{d_{s_i}}$	Factorisation of the parent embedding into per-child embeddings; the per-constructor arity / dim sequence is selected by the chosen production
$\mathrm{binder\_select}_s(v)$	$\mathbb{R}^{d_s} \to \Delta^{	\Gamma_s

The required body |-> recursive clause selects the framework's structural corecursion as the body shape; recursive is the sole legal body keyword.

3.2 Denotation: structural corecursion¶

For carrier vector $v \in \mathbb{R}^{d_s}$, context $\Gamma$, and depth budget $d$, the kernel $\llbracket D \rrbracket(v, \Gamma, d)$ samples a term $t \in T_\Sigma(\Gamma, s)$ by the procedure:

If $s \in S_{\mathrm{dat}}$, sample $d^\star \sim \mathrm{primitive}(v)$ and return $\mathrm{Data}_s(d^\star)$.
Otherwise, sample a choice $k \sim \mathrm{structure}(v)$:
If $k = \mathrm{BoundVar}$ and there is an in-scope variable of sort $s$, sample $i \sim \mathrm{binder\_select}(v)$ over the matching positions of $\Gamma$ and return $\mathrm{BoundVar}(i)$.
If $k$ is a constructor $c : s_1, \dots, s_n \to s$, compute $(v_1, \dots, v_n) = \mathrm{factor}(v)$ and (recursively at depth $d - 1$) sample $u_i \sim \llbracket D \rrbracket(v_i, \Gamma, d - 1)$ for each $i$; return $c(u_1, \dots, u_n)$.
If $k$ is a binder $\beta$, compute the annotation embeddings via $\mathrm{factor}$, sample the annotation terms, extend the context, then sample the scoped children in the extended context. Return $\beta[\bar\alpha; \bar \beta]$.
When $d = 0$, restrict the support of $\mathrm{structure}(v)$ to the set of terminating productions: constructors / binders whose every child sort is in $S_{\mathrm{dat}} \cup S_{\mathrm{idx}}$, together with $\mathrm{BoundVar}$ (always terminating, available when $\Gamma_s$ is non-empty). If no terminating production exists at the current sort, the runtime raises a typed error.

This is a depth-bounded corecursion in the Kleisli category of $\mathcal{G}$ over the initial algebra $T_\Sigma$. The associated score kernel

\[ \mathrm{score}_D \;:\; T_\Sigma \times \mathbb{R}^{d_s} \times \mathrm{Ctx} \to \mathbb{R}_{\le 0}, \qquad \mathrm{score}_D(t, v, \Gamma) \;=\; \log P\bigl[t \mid \llbracket D \rrbracket(v, \Gamma)\bigr], \]

is the log-probability of producing the observed term $t$ under the same factorized distribution; it is computed by the recursion above with every "sample" replaced by "score against the observed branch."

3.3 Round-trip with the encoder¶

For a paired encoder / decoder over the same signature, the composite

\[ \llbracket D \rrbracket \circ \llbracket C \rrbracket \;:\; T_\Sigma \to \mathcal{G}(T_\Sigma) \]

is the reconstruction kernel. The reconstruction loss (§4.2 below) trains $C$ and $D$ jointly so that $\llbracket D \rrbracket(\llbracket C \rrbracket(t))$ concentrates near $\delta_t$. There is no general round-trip law: the pair $(C, D)$ does not form a section / retraction in $\mathbf{Kern}$ unless the encoder is invertible, which finite-dim encoders generically are not.

4. Losses¶

A loss L weight w on K T { body } declaration denotes a scalar functional on a training site.

4.1 Attachment sites¶

The optional on clause selects an attachment site; absent, the loss is global. The attachment kinds and their associated trace types are:

Attachment	Site	Trace passed to `body`
`global`	once per training step	the global environment $\eta$
`program P`	each invocation of program $P$	the program's sample / score trace $\tau_P$
`deduction Δ`	each call to deduction $\Delta$	the chart $\alpha_\Delta : I \to K$
`encoder C`	each call to $C$	the pair $(t, \llbracket C \rrbracket(t))$
`decoder D`	each call to $D$	the pair $(v, t)$ with $t \sim \llbracket D \rrbracket(v)$
`rule R in Δ`	every application of $R$ during $\Delta$'s chart build	the premise / conclusion items at the firing
`chart of Δ`	once on $\Delta$'s completed chart	the chart $\alpha_\Delta$

4.2 Denotation¶

For a loss declaration $L$ with weight $w$ (a let-expression in the trace), attachment kind $\kappa$, target $T$, and body $\mathrm{body}_L$, the denotation is the functional

\[ \llbracket L \rrbracket \;:\; \mathrm{Trace}_{\kappa, T} \to \mathbb{R}, \qquad \llbracket L \rrbracket(\tau) \;=\; \llbracket w \rrbracket(\tau)\, \cdot\, \llbracket \mathrm{body}_L \rrbracket(\tau). \]

The total loss of a module is the formal sum over every fire of every declared loss in a training step:

\[ \llbracket \mathrm{Total} \rrbracket(\eta) \;=\; \sum_{L,\,\tau \in \mathrm{Fires}(L,\,\eta)} \llbracket L \rrbracket(\tau). \]

Loss bodies are arbitrary let-expressions over the trace; canonical shapes include -D(C(input)).log_prob(input) (reconstruction), cross_entropy(p, q) (consistency), -chart.goal_weight() (parsing likelihood), and bce_with_logits(pred, target) (supervised prediction).

4.3 Composition with the inference layer¶

The total loss is the objective passed to the optimizer by the inference driver. It is not part of the kernel-denotation of any program: a program P : τ_in → τ_out denotes a Markov kernel, and losses attached to $P$ are functionals on its trace, not on its denotation. See Programs §6 for the role of conditioning and the inference layer.

5. References¶

Lambek, J. (1968). A fixpoint theorem for complete categories. Mathematische Zeitschrift 103(2), 151–161.
Cartmell, J. (1986). Generalised algebraic theories and contextual categories. Annals of Pure and Applied Logic 32, 209–243.
Battaglia, P. W. et al. (2018). Relational inductive biases, deep learning, and graph networks. arXiv:1806.01261.
Hutton, G. (1999). A tutorial on the universality and expressiveness of fold. Journal of Functional Programming 9(4), 355–372.

Head	Signature	Role
\(\mathrm{structure}_s(v)\)	$\mathbb{R}^{d_s} \to \Delta^{	K_s
\(\mathrm{primitive}_s(v)\)	$\mathbb{R}^{d_s} \to \Delta^{	\mathrm{Vocab}_s
\(\mathrm{factor}_s(v)\)	\(\mathbb{R}^{d_s} \to \prod_i \mathbb{R}^{d_{s_i}}\)	Factorisation of the parent embedding into per-child embeddings; the per-constructor arity / dim sequence is selected by the chosen production
\(\mathrm{binder\_select}_s(v)\)	$\mathbb{R}^{d_s} \to \Delta^{	\Gamma_s

Attachment	Site	Trace passed to `body`
`global`	once per training step	the global environment \(\eta\)
`program P`	each invocation of program \(P\)	the program's sample / score trace \(\tau_P\)
`deduction Δ`	each call to deduction \(\Delta\)	the chart \(\alpha_\Delta : I \to K\)
`encoder C`	each call to \(C\)	the pair \((t, \llbracket C \rrbracket(t))\)
`decoder D`	each call to \(D\)	the pair \((v, t)\) with \(t \sim \llbracket D \rrbracket(v)\)
`rule R in Δ`	every application of \(R\) during \(\Delta\)'s chart build	the premise / conclusion items at the firing
`chart of Δ`	once on \(\Delta\)'s completed chart	the chart \(\alpha_\Delta\)