Expressions

This page gives a compositional, big-step denotational semantics for the QVR expression sublanguage. The sublanguage is the body of let bindings and output declarations: it consists of morphism-valued combinators that take morphisms (and, for the parser combinator, grammar data) and produce new morphisms.

1. Syntactic class

Expressions \(e\) are generated by the grammar

\[ \begin{array}{rcl} e & ::= & x \;\big|\; \mathsf{identity}(\tau) \;\big|\; e_1 \mathbin{>\!\!>} e_2 \;\big|\; e_1 \mathbin{@} e_2 \;\big|\; e.\mathsf{marginalize}(\bar X) \\ & \big|\!\!\big| & \mathsf{fan}(e_1, \dots, e_n) \;\big|\; \mathsf{repeat}(e, n) \;\big|\; \mathsf{stack}(e, n) \;\big|\; \mathsf{scan}(e, \mathit{init}) \\ & \big|\!\!\big| & \mathsf{parser}(\mathit{args}) \;\big|\; \mathsf{ccg}(\mathit{args}) \;\big|\; \mathsf{lambek}(\mathit{args}) \end{array} \]

Each expression denotes a morphism in one of the three strata of Morphisms. We write \(\llbracket e \rrbracket_{\rho}\) for the denotation under semantic environment \(\rho\), and we suppress \(\rho\) when only morphism-valued names are at play.

2. Identifiers, identity, and core combinators

2.1 Identifiers

\[ \llbracket x \rrbracket_{\rho} \;=\; \rho_{\mathrm{mor}}(x). \]

Lookup fails (raising CompileError) if \(x \notin \mathrm{dom}(\rho_{\mathrm{mor}})\).

2.2 Identity

\[ \llbracket \mathsf{identity}(\tau) \rrbracket_{\rho} \;=\; 1_{\llbracket \tau \rrbracket_{\rho}}, \]

the categorical identity at the type denotation \(\llbracket \tau \rrbracket_{\rho}\). The stratum is determined by \(\tau\): a TypeExpr produces the identity in \(\mathcal{V}\text{-}\mathbf{Rel}\), while a SpaceExpr produces the identity in \(\mathbf{Kern}\).

2.3 Composition

\[ \llbracket e_1 \mathbin{>\!\!>} e_2 \rrbracket \;=\; \llbracket e_2 \rrbracket \circ \llbracket e_1 \rrbracket. \]

Well-typedness requires \(\mathrm{cod}(\llbracket e_1 \rrbracket) = \mathrm{dom}(\llbracket e_2 \rrbracket)\) in the appropriate category. The category is the smallest one containing both denotations under the embeddings of Morphisms §5.

2.4 Tensor

\[ \llbracket e_1 \mathbin{@} e_2 \rrbracket \;=\; \llbracket e_1 \rrbracket \otimes \llbracket e_2 \rrbracket, \]

the parallel product. This is the symmetric monoidal structure of \(\mathcal{V}\text{-}\mathbf{Rel}\) / \(\mathbf{Stoch}\) / \(\mathbf{Kern}\); see Morphisms §4.

2.5 Marginalisation

\[ \llbracket e.\mathsf{marginalize}(X_1, \dots, X_k) \rrbracket(z) \;=\; \bigoplus_{x_1, \dots, x_k} \llbracket e \rrbracket\bigl((x_1, \dots, x_k, \cdot), z\bigr). \]

In \(\mathbf{Stoch}\) and \(\mathbf{Kern}\) the join \(\bigoplus\) is integration against the marginal; in \(\mathcal{V}\text{-}\mathbf{Rel}\) it is the quantale join.

3. Tupling and replication

3.1 Fan

The \(n\)-ary tupling combinator

\[ \llbracket \mathsf{fan}(e_1, \dots, e_n) \rrbracket(x, (y_1, \dots, y_n)) \;=\; \prod_{i=1}^{n} \llbracket e_i \rrbracket(x, y_i) \]

is the Cartesian product (under the diagonal \(\Delta : X \to X^n\)) in the appropriate symmetric monoidal category. It is the morphism

\[ \langle \llbracket e_1 \rrbracket, \dots, \llbracket e_n \rrbracket \rangle : X \to Y_1 \otimes \cdots \otimes Y_n. \]

3.2 Stack

For \(e : X \to Y\) and \(n \in \mathbb{N}_{>0}\),

\[ \llbracket \mathsf{stack}(e, n) \rrbracket \;=\; \llbracket e \rrbracket \otimes \cdots \otimes \llbracket e \rrbracket \quad (\text{$n$ copies}), \]

a morphism \(X^{\otimes n} \to Y^{\otimes n}\).

3.3 Repeat

\[ \llbracket \mathsf{repeat}(e, n) \rrbracket \;=\; \underbrace{\llbracket e \rrbracket \circ \cdots \circ \llbracket e \rrbracket}_{n \text{ copies}}, \]

defined when \(\mathrm{dom}(\llbracket e \rrbracket) = \mathrm{cod}(\llbracket e \rrbracket)\), i.e.\ when \(\llbracket e \rrbracket\) is an endomorphism. With no explicit count, \(\mathsf{repeat}(e)\) denotes a single application of \(\llbracket e \rrbracket\) wrapped as a RepeatMorphism whose count is a learnable parameter; the \(n = 1\) default is a structural placeholder for downstream training.

3.4 Scan

For \(e : X \otimes S \to Y \otimes S\) and an initial state \(\mathit{init} \in \{ \mathrm{zeros}, \mathrm{learned} \}\) supplying \(s_0 : 1 \to S\),

\[ \llbracket \mathsf{scan}(e, \mathit{init}) \rrbracket \;=\; \mathrm{Tr}^{S}\bigl( \llbracket e \rrbracket \bigr) \circ s_0, \]

where \(\mathrm{Tr}^{S} : \mathcal{C}(X \otimes S, Y \otimes S) \to \mathcal{C}(X, Y)\) is the trace operator of the appropriate traced symmetric monoidal category \(\mathcal{C}\) (Joyal, Street & Verity 1996). Concretely:

4. Parser combinators

The combinators \(\mathsf{parser}\), \(\mathsf{ccg}\), \(\mathsf{lambek}\) denote chart parsers. We give the semantics in full generality for \(\mathsf{parser}\); the others are syntactic sugar fixing particular rule systems.

Let

\[ \mathit{args} \;=\; \bigl(\mathit{rules} = R,\ \mathit{categories} = C,\ \mathit{start} = s,\ \mathit{depth} = d,\ \mathit{constructors} = K\bigr), \]

with \(R\) a tuple of rule names, \(C\) a tuple of atomic category names, \(s\) the start symbol, \(d\) the parse depth, and \(K\) the category constructors permitted (slash, product, …).

Let \(\mathcal{C}_{C, K}\) be the free residuated–monoidal category on the atoms \(C\) closed under the constructors \(K\) (Grammar fragment). Let \(\Sigma_R\) be the rule system: a finite signature of unary and binary inference rules over \(\mathcal{C}_{C, K}\).

Then

\[ \llbracket \mathsf{parser}(\mathit{args}) \rrbracket : \mathrm{Token}^{*} \to \mathbf{1}, \]

is the chart-parser kernel: for an input string \(w \in \mathrm{Token}^{*}\) of length \(\le d\), it is the (semiring-weighted) sum over all \(\Sigma_R\)-derivations of the sequent \(w \vdash s\).

In the \(\mathcal{V}_{\mathrm{pf}}\)-semiring this is the inside-algorithm probability \(P(w \mid \mathit{rules})\); in the Boolean semiring it is the membership predicate \(w \in L(\Sigma_R, s)\). The denotation is independent of the parsing algorithm and depends only on the rule system.

5. Coherence and equational laws

Every well-typed expression denotes a morphism in a symmetric monoidal closed category (for the discrete and stochastic strata) or a symmetric monoidal category (for the continuous stratum). Consequently the following equations between denotations hold by construction:

Law Statement
Composition associativity \(\llbracket (e_1 \mathbin{>\!\!>} e_2) \mathbin{>\!\!>} e_3 \rrbracket = \llbracket e_1 \mathbin{>\!\!>} (e_2 \mathbin{>\!\!>} e_3) \rrbracket\)
Composition unit \(\llbracket \mathsf{identity}(\tau) \mathbin{>\!\!>} e \rrbracket = \llbracket e \rrbracket = \llbracket e \mathbin{>\!\!>} \mathsf{identity}(\tau') \rrbracket\)
Tensor associativity \(\llbracket (e_1 \mathbin{@} e_2) \mathbin{@} e_3 \rrbracket = \llbracket e_1 \mathbin{@} (e_2 \mathbin{@} e_3) \rrbracket\)
Tensor unit \(\llbracket e \mathbin{@} \mathsf{identity}(\mathbf{1}) \rrbracket = \llbracket e \rrbracket\)
Bifunctoriality \(\llbracket (e_1 \mathbin{>\!\!>} e_2) \mathbin{@} (e_3 \mathbin{>\!\!>} e_4) \rrbracket = \llbracket (e_1 \mathbin{@} e_3) \mathbin{>\!\!>} (e_2 \mathbin{@} e_4) \rrbracket\)
Symmetry \(\llbracket e_1 \mathbin{@} e_2 \rrbracket = \sigma_{Y_1, Y_2} \circ \llbracket e_2 \mathbin{@} e_1 \rrbracket \circ \sigma^{-1}_{X_1, X_2}\)

These are the equational theory of symmetric monoidal categories; together with the trace axioms (Joyal, Street & Verity 1996, §3), they constitute the equational semantics of QVR expressions.

The compiler does not normalise expressions modulo these laws — it computes a literal AST-driven tensor expression. The laws are nevertheless valid statements about denotations, and the Adequacy theorem confirms they are respected by the implementation up to the floating-point precision of the underlying tensor library.