Encoders¶
Encoder is a torch.nn.Module that realizes an F-algebra
homomorphism T_Σ → Vec_D from terms over a signature to
fixed-length vectors. The recursion is supplied by the framework;
the analyst supplies only the per-operation parametric functions
(or omits them and accepts the compiler's scaffolded 2-layer
MLP defaults with correct per-arg dimensions).
For binders, the framework threads a typed de-Bruijn context Γ
through the recursion: each binder's annotations are compressed
in the outer context, fresh variable embeddings are minted via
the encoder's var_init_fns, and the scoped arguments are
recursed under the extended context.
Graph signatures are compressed via forward_graph: per-vertex-kind
initial embedders, finitely many message-passing rounds with
per-edge-kind message functions and per-vertex-kind update
functions, finally a readout reducing the per-vertex final
embeddings to a single graph-level vector.
encoder
¶
Encoder runtime: F-algebra homomorphisms T_Σ -> Vec_D.
A Encoder is a torch.nn.Module that, given a closed
Term over a signature, returns its vector embedding. The
recursion structure is supplied by the framework (algebra
homomorphism); per-operation parametric functions are supplied by
the user (declared in a encoder { … } DSL block) or by the
default scaffolding.
Canonical term form
A term over Σ has exactly one of three argument shapes at each position, determined by the codomain sort's kind:
- object — the argument is a
Termwhoseopis a constructor of Σ, a binder of Σ, or the reserved built-in"BoundVar". - data — the argument is a raw Python value (string, int, float, …) consumed by the encoder's per-data-sort embedder.
- index — the argument is a non-negative integer denoting a de-Bruijn index into the current scope context Γ.
No Term("Data", …) or other wrappings are accepted at runtime;
the canonical form is the only form.
The reserved "BoundVar" op is the built-in de-Bruijn reference:
Term("BoundVar", (i,)) at an object position reads the i-th
in-scope variable's embedding from Γ.
Binders thread a Context of in-scope variables: each entry
records the variable's sort, its embedding, and the type term
captured at binding. Compressing a binder constructor:
- Compresses each of the binder's
bindsarguments in the OUTER context. - Synthesises a fresh variable embedding from each via the
encoder's
var_initfunction. - Pushes the new entries onto Γ and recurses on the
scopedarguments under the extended context.
Graph-shaped signatures are compressed by forward_graph:
per-vertex-kind init seeds initial embeddings, finitely many
message_passing rounds alternate per-edge-kind message and
per-vertex-kind update functions, and a readout reduces
the final per-vertex embeddings to one fixed-length vector.
Encoder
¶
Encoder(name: str, signature: Signature, sort_dims: dict[str, int], op_fns: dict[str, _PerOpFn], var_init_fns: dict[tuple[str, str] | str, Callable[[Tensor | None], Tensor]], data_embedders: dict[str, Callable[[DataLeaf], Tensor]], modules_owned: list[Module] | None = None, iterations: int = 0, init_fns: dict[str, Callable[[DataLeaf], Tensor]] | None = None, message_fns: dict[str, Callable[..., Tensor]] | None = None, update_fns: dict[str, Callable[..., Tensor]] | None = None, readout: Callable[[list[Tensor]], Tensor] | None = None)
Bases: Module
An F-algebra homomorphism T_Σ -> Vec_D realised as an
nn.Module.
Construction parameters
name : str
Identifier used in diagnostics.
signature : Signature
The Σ whose terms this encoder consumes.
sort_dims : dict[str, int]
Per-sort embedding dimension.
op_fns : dict[str, _PerOpFn]
One entry per constructor and binder of Σ. The
Compiler scaffolds defaults for omitted ops.
var_init : callable
(type_embedding) -> variable_embedding. Called at every
binder to mint a fresh variable embedding for each
introduced scope variable.
data_embedders : dict[str, callable]
One entry per data sort: (raw_value) -> embedding.
For graph-shaped signatures, additionally:
iterations : int
Number of message-passing rounds in forward_graph.
init_fns : dict[str, callable]
Per-vertex-kind initial embedders.
message_fns : dict[str, callable]
Per-edge-kind (src_e, tgt_e) -> message functions.
update_fns : dict[str, callable]
Per-vertex-kind (self_e, aggregated_msg) -> next_e.
readout : callable
(list[Vec_D]) -> Vec_D reducer.
Source code in src/quivers/structural/encoder.py
198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 | |
make_default_op_fn
¶
make_default_op_fn(op: str, arg_dims: tuple[int, ...], out_dim: int) -> tuple[Module, Callable[..., Tensor]]
Build a parametric per-op function as a 2-layer MLP over concatenated child embeddings.
arg_dims is the tuple of per-argument dimensions in the
order in which the children are passed to the function. For
binders this is (annot_dim_1, …, annot_dim_k, scoped_dim_1,
…, scoped_dim_m).
Source code in src/quivers/structural/encoder.py
101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 | |
make_default_var_init
¶
make_default_var_init(in_dim: int, out_dim: int) -> tuple[Module, Callable[[Tensor], Tensor]]
A parametric var_init(ty) function: an MLP mapping a type
embedding to a fresh variable embedding.
Source code in src/quivers/structural/encoder.py
144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 | |