Migrating .qvr source between grammar revisions
QVR's surface grammar evolves between releases. qvr migrate lowers
.qvr source written for one tagged grammar revision into source
shaped for a later revision. The transformation is grammar-bound at
every step: every per-declaration output is parse-validated against
the target revision's grammar, and the assembled output is
parse-validated as a whole file before being written.
This page covers what the migrator does today, how to use it, and what's still pending.
What qvr migrate does
The pipeline runs per source file:
flowchart LR
A["source bytes<br/>(written against revision X)"]
B["parse with X's<br/>tree-sitter grammar"]
C["walk the parse-tree<br/>schema"]
D["per-declaration<br/>converter (X → Y)"]
E["per-declaration<br/>parse-validate (Y)"]
F["concatenate"]
G["whole-file<br/>parse-validate (Y)"]
H["source bytes<br/>(shaped for revision Y)"]
A --> B --> C --> D --> E --> F --> G --> H
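The stages in the flowchart can be sketched as a small driver. This is a hedged illustration, not the migrator's actual code: parse_decls, convert_decl, and validates are hypothetical stand-ins for the tree-sitter parse, the per-declaration converter, and the target-grammar parse check. The toy grammar (one declaration per line) is an assumption made only to keep the sketch runnable.

```python
from typing import Callable, List


def migrate_file(
    source: str,
    parse_decls: Callable[[str], List[str]],
    convert_decl: Callable[[str], str],
    validates: Callable[[str], bool],
) -> str:
    """Parse, convert and validate each declaration, concatenate, validate the whole file."""
    outputs = []
    for decl in parse_decls(source):
        converted = convert_decl(decl)
        # Per-declaration gate: each converted declaration must parse on its own.
        if not validates(converted):
            raise ValueError(f"converted declaration does not parse: {converted!r}")
        outputs.append(converted)
    assembled = "\n".join(outputs) + "\n"
    # Whole-file gate: the assembled output must also parse before it is written.
    if not validates(assembled):
        raise ValueError("assembled output does not parse against the target grammar")
    return assembled


# Toy stand-ins (assumptions): one declaration per non-empty line, and a
# "target grammar" that rejects any line starting with the keyword "latent".
def toy_parse(src: str) -> List[str]:
    return [line for line in src.splitlines() if line.strip()]


def toy_convert(decl: str) -> str:
    if decl.startswith("latent "):
        return "morphism " + decl[len("latent "):] + " [role=latent]"
    return decl


def toy_validates(text: str) -> bool:
    return all(not line.startswith("latent ") for line in text.splitlines())


result = migrate_file("latent f : A -> B\n", toy_parse, toy_convert, toy_validates)
```

Running both gates means a converter bug surfaces at the declaration that caused it, while the whole-file check catches problems that only appear once declarations are joined.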
Each adjacent revision pair (X, Y) on the migration
CHAIN has its own converter module under
src/quivers/cli/migrations/vX_Y_Z_to_vA_B_C.py. Migrating across
multiple revisions composes the intermediate hops.
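As a rough illustration of how hops compose, the sketch below chains per-pair functions along a CHAIN tuple. The MIGRATORS mapping keyed by (source, target) pairs and the str-to-str hop signature are assumptions for this example, and the toy revisions are hypothetical, not the real chain.

```python
from typing import Callable, Dict, Tuple

Hop = Callable[[str], str]

# Hypothetical chain and per-pair registry; the real CHAIN is longer.
CHAIN: Tuple[str, ...] = ("v1", "v2", "v3")
MIGRATORS: Dict[Tuple[str, str], Hop] = {
    ("v1", "v2"): lambda src: src.replace("old_kw", "mid_kw"),
    ("v2", "v3"): lambda src: src.replace("mid_kw", "new_kw"),
}


def migrate_across(source: str, src_rev: str, tgt_rev: str) -> str:
    """Apply every intermediate hop between src_rev and tgt_rev, in chain order."""
    start, end = CHAIN.index(src_rev), CHAIN.index(tgt_rev)
    if start > end:
        raise ValueError("only forward migration is supported")
    # Pair each revision with its successor: (v1, v2), (v2, v3), ...
    for a, b in zip(CHAIN[start:end], CHAIN[start + 1 : end + 1]):
        source = MIGRATORS[(a, b)](source)
    return source


migrated = migrate_across("old_kw x", "v1", "v3")
```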
The per-revision tree-sitter parsers live at
grammars/qvr/vcs/parsers/<rev>/qvr.{dylib,so,dll}; the migration
schemas live in the panproto VCS at
grammars/qvr/vcs/.panproto/.
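The parser path pattern above can be resolved per platform along these lines. The helper name parser_path and the platform-to-extension mapping are assumptions for illustration; the actual loader may select the library differently.

```python
import sys
from pathlib import PurePosixPath

PARSERS_ROOT = PurePosixPath("grammars/qvr/vcs/parsers")


def parser_path(rev: str, platform: str = sys.platform) -> PurePosixPath:
    """Return the repository-relative shared-library path for one grammar revision."""
    if platform == "darwin":
        ext = "dylib"
    elif platform in ("win32", "cygwin"):
        ext = "dll"
    else:
        ext = "so"  # Linux and other Unix-like platforms
    return PARSERS_ROOT / rev / f"qvr.{ext}"
```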
Common invocations
Migrate one file in place to the latest release:
qvr migrate docs/examples/source/lda.qvr
Migrate every .qvr under a directory:
qvr migrate docs/examples/source/
Pick specific revisions explicitly:
qvr migrate --from v0.10.0 --to v0.11.0 docs/examples/source/lda.qvr
Dry-run (report what would change, write nothing):
qvr migrate --dry-run docs/examples/source/
Write migrated copies to a separate directory:
qvr migrate --output /tmp/migrated docs/examples/source/
Run the coverage check against the migration chain without migrating any files:
qvr migrate --check
--from defaults to the most recent released revision on the chain
(the penultimate entry of CHAIN). --to defaults to HEAD, the
working-tree grammar.
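A minimal argparse sketch of these flags and defaults follows. Only the flag names and the default rules come from the text above; the parser wiring, the dest names, and the toy CHAIN are assumptions.

```python
import argparse

# Toy chain for illustration; the last entry stands in for the working-tree grammar.
CHAIN = ("v0.9.0", "v0.10.0", "HEAD")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="qvr migrate")
    # `from` is a Python keyword, so the value is stored under an explicit dest.
    parser.add_argument("--from", dest="from_rev", default=CHAIN[-2])
    parser.add_argument("--to", dest="to_rev", default="HEAD")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--output", default=None)
    parser.add_argument("--check", action="store_true")
    parser.add_argument("paths", nargs="*")
    return parser


args = build_parser().parse_args(["--dry-run", "docs/examples/source/"])
```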
What survives migration
What's preserved by the migrator today:
- Every declaration's semantics. Each source decl becomes the
  semantically-equivalent target decl, even when the surface
  changed (e.g. latent f : A -> B becomes morphism f : A -> B [role=latent]).
- Top-level comments. Header comments (file preamble, between-decl explanations) pass through verbatim.
- In-body comments. Comments inside program, deduction, marginalize, signature, encoder, decoder, loss, and composition rule bodies pass through as their raw source text, interleaved with the (translated) structural body entries in document order.
- Lexicon block comments. Comments between lexicon entries survive.
- Doc comments. #! doc comment lines attached to a declaration migrate with that declaration.
- Multi-line bracketed forms. Where the source revision allows
  multi-line [...] / (...) / {...} and the source has interior comments, the migrator's emit_bracketed_list helper preserves them.
What's intentionally dropped or transformed:
- Single-line interior comments. A comment inside an
  inline-form bracketed list (e.g. [role=latent, # comment\n over=cod] written without a leading newline) cannot exist in the grammar: the inline form forbids newlines. The user must switch to the multi-line bracket form to retain such a comment.
- Body keywords that became option entries. v0.10.0's
  deduction body carried semiring LogProb, start S, depth 6 on their own lines; these hoist into the header option block as [semiring=LogProb, start=S, depth=6]. Same for program effects (! Score, Sample → [effects=[Score, Sample]]) and marginalize plates (over G → [over=G]).
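The hoisting rule for deduction bodies can be illustrated with a text-level sketch. hoist_options is a hypothetical helper, and treating each body line as a keyword followed by a value is an assumption; the real converter walks the parse tree rather than splitting strings.

```python
from typing import List, Tuple

# Body keywords that move into the header option block (from the deduction example).
HOISTED = ("semiring", "start", "depth")


def hoist_options(body_lines: List[str]) -> Tuple[str, List[str]]:
    """Split a body into a header option block and the remaining body lines."""
    options, remaining = [], []
    for line in body_lines:
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in HOISTED:
            options.append(f"{parts[0]}={parts[1].strip()}")
        else:
            # Anything that is not a hoisted keyword stays in the body, in order.
            remaining.append(line)
    block = "[" + ", ".join(options) + "]" if options else ""
    return block, remaining


block, rest = hoist_options(["semiring LogProb", "start S", "depth 6", "# keep me"])
```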
The migration chain
The chain is declared in
src/quivers/cli/migrations/__init__.py
as the tuple CHAIN. Each adjacent pair has a module:
| Pair | Status |
|---|---|
| v0.2.0 → v0.3.0 | identity scaffold (no converters yet) |
| v0.3.0 → v0.4.0 | identity scaffold |
| v0.4.0 → v0.5.0 | identity scaffold |
| v0.5.0 → v0.6.0 | identity scaffold |
| v0.6.0 → v0.7.0 | identity scaffold |
| v0.7.0 → v0.9.0 | identity scaffold |
| v0.9.0 → v0.10.0 | identity (grammar byte-identical) |
| v0.10.0 → v0.11.0 | full homogenization hop (all in-tree examples) |
The 0.10.0 → 0.11.0 hop is the only one with full converters
today. The earlier hops parse and pass through their source
unchanged. They become non-trivial as users present older source
files that need lowering; the
SOURCE_RULE_COVERAGE
machinery makes the missing converters discoverable.
--check mode: coverage against the VCS
The panproto VCS at grammars/qvr/vcs/.panproto/ holds one commit
per distinct grammar revision. qvr migrate --check walks every
adjacent pair on CHAIN, computes
panproto.diff_schemas(src_schema, tgt_schema), and reports:
- added: rules that appear in the target's grammar but not the source's.
- removed: rules that appear in the source's grammar but not the target's.
- UNCOVERED removed rules: rules removed at the target whose
  corresponding hop migrator has no entry in its
  SOURCE_RULE_COVERAGE set. Each one is a missing converter that will silently let source bytes through; the resulting target source will be invalid.
The command exits non-zero when any pair has uncovered removals, which makes it CI-suitable.
Sample output:
v0.6.0 -> v0.7.0:
removed: quantale_decl
added: algebra_decl
UNCOVERED removed rules (no converter): quantale_decl
v0.10.0 -> v0.11.0:
removed: (the homogenization-removed kinds)
added: (object_decl, morphism_decl, composition_decl, ...)
all removed rules have converters [OK]
To clear an "uncovered" entry: write a converter for the rule in
the corresponding hop module and add the rule name to that
module's SOURCE_RULE_COVERAGE frozenset.
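The coverage rule described in this section reduces to set arithmetic over rule names. The sketch below uses plain Python sets in place of panproto.diff_schemas, whose return shape is not shown on this page. check_hop, shared_decl, and the toy rule sets are hypothetical.

```python
from typing import Dict, FrozenSet, Set


def check_hop(
    src_rules: Set[str], tgt_rules: Set[str], coverage: FrozenSet[str]
) -> Dict[str, Set[str]]:
    """Classify rule changes for one hop and flag removals with no converter."""
    removed = src_rules - tgt_rules
    added = tgt_rules - src_rules
    return {"added": added, "removed": removed, "uncovered": removed - coverage}


# Toy rule sets mirroring the v0.6.0 -> v0.7.0 sample output above.
report = check_hop(
    src_rules={"shared_decl", "quantale_decl"},
    tgt_rules={"shared_decl", "algebra_decl"},
    coverage=frozenset(),
)
# A non-zero exit code signals at least one uncovered removal.
exit_code = 1 if report["uncovered"] else 0
```

Adding quantale_decl to the coverage set empties the uncovered set, which corresponds to the fix described above.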
VCS blame on migration failure
When a migrator encounters a top-level declaration whose kind it
has no converter for, it queries the panproto VCS for the rule's
history and writes a diagnostic to stderr alongside the
pass-through:
qvr migrate [v0.5.0 -> v0.6.0]: no converter for 'continuous_decl'.
VCS blame: introduced at v0.4.0; last present at v0.4.0.
This points the user at the precise release that needs a converter written. The migration continues with the source bytes passed through verbatim, which usually surfaces as a final-stage parse error against the target grammar.
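The blame lookup amounts to scanning the revision history for the first and last revisions that contain a rule. The sketch assumes the history is available as an ordered list of (revision, rule set) pairs. blame, some_decl, and the toy history are hypothetical, and the panproto API is not used.

```python
from typing import List, Optional, Set, Tuple


def blame(
    history: List[Tuple[str, Set[str]]], rule: str
) -> Tuple[Optional[str], Optional[str]]:
    """Return (introduced_at, last_present_at) for a rule, or (None, None)."""
    present = [rev for rev, rules in history if rule in rules]
    if not present:
        return None, None
    return present[0], present[-1]


# Toy history, oldest revision first.
history = [
    ("v0.3.0", {"some_decl"}),
    ("v0.4.0", {"some_decl", "continuous_decl"}),
    ("v0.5.0", {"some_decl"}),
]
introduced, last = blame(history, "continuous_decl")
message = f"VCS blame: introduced at {introduced}; last present at {last}."
```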
Adding a new release
When a new QVR release ships:
- Tag the release in git: git tag v0.X.Y.
- Rebuild the VCS schema chain: python grammars/qvr/vcs/build_schemas.py. Adds a new commit to grammars/qvr/vcs/.panproto/ only if the tagged grammars/qvr/grammar.js differs in bytes from the previous tag's. Releases with identical grammars share commits (see v0.10.0 / v0.9.0 today).
- Rebuild the per-revision parser: python grammars/qvr/vcs/build_parsers.py. Produces grammars/qvr/vcs/parsers/v0.X.Y/qvr.{dylib,so,dll}.
- Append the new revision to CHAIN in src/quivers/cli/migrations/__init__.py.
- If the new grammar differs structurally: write vP_Q_R_to_v0_X_Y.py with per-decl converters and a SOURCE_RULE_COVERAGE frozenset listing every source rule it handles. Register it in MIGRATORS.
- If the new grammar is byte-identical to the previous release: add a tiny identity module like v0_9_0_to_v0_10_0.py (a migrate function that returns its argument unchanged) and register it in MIGRATORS.
- Run qvr migrate --check to confirm the new hop's coverage is complete.
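For the byte-identical case, an identity hop module might look like the sketch below. The migrate signature (bytes in, bytes out), the empty coverage set, and the shape of the MIGRATORS registry are assumptions; an existing module such as v0_9_0_to_v0_10_0.py is the reference for the real conventions.

```python
from typing import Callable, Dict, FrozenSet, Tuple

# Assumption: byte-identical grammars remove no rules, so nothing needs covering.
SOURCE_RULE_COVERAGE: FrozenSet[str] = frozenset()


def migrate(source: bytes) -> bytes:
    """Identity hop: the grammars are byte-identical, so source passes through unchanged."""
    return source


# Hypothetical registration keyed by (source revision, target revision).
MIGRATORS: Dict[Tuple[str, str], Callable[[bytes], bytes]] = {
    ("v0.9.0", "v0.10.0"): migrate,
}
```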
The panproto VCS chain
grammars/qvr/vcs/ holds a panproto repository whose commits track
grammar evolution.
- One commit per distinct grammar revision; each commit holds a
  panproto Schema whose vertices are the rule names in the grammar's grammar.json and whose edges are the structural fan-out between rules. Keying vertices by rule name lets panproto's auto-derivation recognize unchanged rules in O(1).
- Each commit is tagged with the matching git tag (v0.X.Y); the working-tree grammar is committed untagged.
- Used by qvr migrate --check to compute schema diffs and by the blame diagnostic to identify when a rule was introduced or removed.
The Python migrators do NOT consult the VCS at runtime to PERFORM the migration; they are hand-written walks over the parsed source schema. The VCS provides authoritative grammar history and powers the coverage / blame tooling layered on top.
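To make the vertex and edge description concrete, the sketch below derives rule names and rule-to-rule references from a tree-sitter grammar.json-style dictionary using plain Python. It does not use panproto's Schema API; rule_graph is a hypothetical helper and the tiny grammar is invented for the example.

```python
from typing import Any, Dict, Set, Tuple


def rule_graph(grammar: Dict[str, Any]) -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Return (vertices, edges): rule names and (rule, referenced rule) pairs."""
    vertices = set(grammar["rules"])
    edges: Set[Tuple[str, str]] = set()

    def walk(owner: str, node: Any) -> None:
        if isinstance(node, dict):
            # SYMBOL nodes reference another rule by name.
            if node.get("type") == "SYMBOL":
                edges.add((owner, node["name"]))
            for child in node.values():
                walk(owner, child)
        elif isinstance(node, list):
            for child in node:
                walk(owner, child)

    for name, body in grammar["rules"].items():
        walk(name, body)
    return vertices, edges


toy = {
    "name": "toy",
    "rules": {
        "source_file": {"type": "REPEAT", "content": {"type": "SYMBOL", "name": "decl"}},
        "decl": {
            "type": "SEQ",
            "members": [
                {"type": "STRING", "value": "morphism"},
                {"type": "SYMBOL", "name": "ident"},
            ],
        },
        "ident": {"type": "PATTERN", "value": "[a-z]+"},
    },
}
vertices, edges = rule_graph(toy)
```

With vertices keyed by rule name, checking whether a rule persists across two revisions is a set-membership test on the two vertex sets.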
Limitations and planned work
- Earlier hops are identity scaffolds. Migrating source from the
  v0.2.0–v0.8.0 lineage will pass the source through unchanged at
  every hop until those modules are filled in. qvr migrate --check lists the rules each hop still needs converters for.
- Interior-bracket comments in inline forms. A # comment inside a single-line [...] / (...) / {...} cannot exist: the grammar forbids newlines in inline forms. The user must switch to multi-line form (newline immediately after the opener) to retain interior comments.
- No backward migration. qvr migrate only composes forward along CHAIN. Backward migration (rendering newer source as older) is not implemented.
- No Schema-construction emit. The migrator currently emits
  target source via per-declaration text construction validated
  through lens.parse, not via SchemaBuilder + emit_pretty. The construction-by-Schema path cannot become the default until several upstream panproto issues are resolved; see the panproto issues filed by quivers.
Related
- grammars/qvr/vcs/README.md: the VCS scaffolding for grammar authors.
- The DSL overview for the current (v0.11.0-shaped) source-level surface.