ADR-9: LM-Augmented Development and Semantic Tooling
How Terp Network uses language models, semantic search, and code graph analysis to contribute to the protocol — transparently, verifiably, and with human accountability
Changelog
- 2026-05-11: Initial draft
Status
DRAFT
Dependencies
- ADR-1: Standard Template and Design Guidelines — this ADR follows the format defined in ADR-1
- ADR-2: Expected Testing Library Design — TensorZero prompt templates encode testing patterns from ADR-2; LM-generated code must pass ADR-2 test gates
- ADR-5: HashMerchant — QMD collections include HashMerchant contract source and documentation; Trailmark blast radius analysis covers HashMerchant's privilege boundaries
Abstract
This ADR defines how Terp Network integrates language models (LMs), semantic search, and code graph analysis into its development workflow. It establishes the toolchain, data flow, and accountability requirements for any LM-assisted contribution, whether from internal agents, community participants, or automated pipelines. The goal is transparency: anyone interacting with Terp Network through LM tooling should understand what surfaces exist, how semantic data is curated, and how contributions are verified before merging.
Context and Problem Statement
Terp Network operates across a complex software surface: a Cosmos SDK chain (terp-core), CosmWasm contracts (cw-infuser, cw-shitstrap), deployment orchestration (o-line), frontend applications (terp-gui, terp-web-ui), and documentation. This surface exceeds what any single contributor can hold in working memory. Language models can accelerate development, but only if they have accurate, current context about the codebase and protocol decisions. Without a defined pipeline, LM contributions risk being untraceable, based on stale information, or inconsistent with settled architectural decisions.
The core problem: how do we make LM-assisted development transparent, auditable, and effective — so that the community can see what tooling is in use, what data feeds it, and how outputs are verified?
Decision Drivers
- Transparency: LM usage in development must be discoverable and auditable by the community
- Accuracy: LMs must operate on current, verified context — not stale or hallucinated state
- Reproducibility: Any semantic query or code graph analysis must be reproducible by another party
- Minimal friction: The pipeline must not add bureaucratic overhead that discourages adoption
- Composability: Each tool in the pipeline must be usable independently and in combination
- Security: LM-generated code and documentation must pass the same review gates as human-generated content
- Multi-repo awareness: The pipeline must span terp-core, CosmWasm contracts, o-line, and documentation simultaneously
Considered Options / Alternatives
- Ad-hoc LM usage per contributor: no shared context, no audit trail. Fast but unaccountable and inconsistent.
- Centralized RAG service: single hosted vector DB that all contributors query. Consistent but creates a trust bottleneck and single point of failure.
- Federated semantic pipeline with local-first tooling (selected): each contributor runs QMD locally against shared collection schemas; Trailmark builds reproducible code graphs; TensorZero orchestrates LM inference with verifiable prompt templates. Transparent, auditable, and no central trust requirement.
Decision Outcome
We adopt the federated semantic pipeline as the standard for LM-augmented development on Terp Network. This consists of three layers:
Layer 1: QMD — Semantic Search and Data Curation
QMD (Query Markdown Documents) is the primary semantic search engine for all Terp Network knowledge. It indexes:
- Source code — Rust (.rs), Go (.go), and other source files wrapped in markdown code fences before ingestion (QMD is markdown-native; non-markdown files require wrapping)
- Documentation — ADRs, module specs, guides, brain notes
- Configuration — TOML, YAML, JSON config files (wrapped in markdown fences)
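The wrapping step for non-markdown files can be sketched as follows. This is a minimal illustration, not the logic of `qmd-bulk-ingest.py`: the function name `wrap_as_markdown`, the `FENCE_LANG` mapping, and the choice to use the relative path as the document title are all assumptions made for the example.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file suffix to fence language tag.
FENCE_LANG = {
    ".rs": "rust", ".go": "go", ".toml": "toml",
    ".yaml": "yaml", ".yml": "yaml", ".json": "json",
}


def wrap_as_markdown(rel_path: str, source: str) -> str:
    """Return a markdown document that embeds `source` in a fenced block."""
    lang = FENCE_LANG.get(PurePosixPath(rel_path).suffix, "")
    # Find the longest run of backticks in the source so the fence can be
    # made longer than it; otherwise the content could close the fence early.
    longest = run = 0
    for ch in source:
        run = run + 1 if ch == "`" else 0
        longest = max(longest, run)
    fence = "`" * max(3, longest + 1)
    body = source if source.endswith("\n") else source + "\n"
    # Assumption: the repo-relative path doubles as the document title.
    return f"# {rel_path}\n\n{fence}{lang}\n{body}{fence}\n"
```

Sizing the fence from the content matters for files that themselves contain backtick runs, such as doc comments with embedded code samples.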
Collections and their purposes:
| Collection | Content | Document Count | Purpose |
|---|---|---|---|
| oline-rust | O-line Rust source (wrapped) | 74 | Deployment orchestration codebase |
| oline-docs | O-line documentation | 9 | Operational procedures |
| oline-configs | Configuration files (wrapped) | 5 | Deployment and node configuration |
| oline-sdls | SDL templates (wrapped) | 19 | Akash deployment manifests |
| abstract | Cosmos SDK ADRs, IBC-go ADRs, ziavl ADRs, terp-core module specs | 90+ | Upstream decision landscape |
Data curation workflow:
- Source files are wrapped as markdown via `qmd-bulk-ingest.py` (handles .rs, TOML, YAML)
- Documents are ingested into named collections via the `oline-qmd` wrapper script
- Collections are queryable via BM25 keyword search (`oline-qmd search`) or hybrid semantic search (`oline-qmd query`)
- All collections follow a shared schema convention: document IDs are path-based, enabling cross-referencing between code and documentation
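A path-based document ID could be derived along these lines. The `collection/path` layout and the `document_id` helper are hypothetical; this ADR only states that IDs are path-based, so treat the exact format as an assumption.

```python
from pathlib import PurePosixPath


def document_id(collection: str, rel_path: str) -> str:
    """Build a stable ID from a collection name and a repo-relative path."""
    # Normalize Windows separators and drop "." components so the same file
    # yields the same ID on every contributor's machine.
    parts = PurePosixPath(rel_path.replace("\\", "/")).parts
    if not parts or parts[0] == "/" or ".." in parts:
        raise ValueError(f"expected a repo-relative path, got {rel_path!r}")
    return f"{collection}/{'/'.join(parts)}"
```

Because the ID depends only on the collection name and the relative path, a documentation page can refer to a source file by ID without knowing where a contributor cloned the repository.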
Local-first, reproducible: Every contributor runs QMD against the same collection definitions. The global DB lives at ~/.oline/semantic/qmd-global.db. Collection contents can be rebuilt from source at any time using the bulk ingest scripts.
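One way two contributors could check that their rebuilt collections match is to compare a content fingerprint. The sketch below is an illustration only: `collection_fingerprint` is a hypothetical helper, and nothing in the current tooling is known to compute such a value.

```python
import hashlib


def collection_fingerprint(docs: dict[str, str]) -> str:
    """Hash document IDs and contents in sorted ID order."""
    digest = hashlib.sha256()
    for doc_id in sorted(docs):
        for field in (doc_id, docs[doc_id]):
            data = field.encode("utf-8")
            # Length-prefix each field so that ("ab", "c") and ("a", "bc")
            # cannot produce the same byte stream.
            digest.update(len(data).to_bytes(8, "big"))
            digest.update(data)
    return digest.hexdigest()
```

Sorting by ID makes the result independent of ingestion order, so equal fingerprints indicate equal ID-to-content mappings.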
Layer 2: Trailmark — Code Graph Analysis
Trailmark builds multi-language source code graphs from Terp Network repositories. It provides:
- Structural analysis: call graphs, class hierarchies, module dependency maps, complexity heatmaps
- Security analysis: blast radius calculation, taint propagation, privilege boundary mapping, entry point enumeration
- Evolution tracking: graph comparison between commits/tags to surface security-relevant structural changes
- Audit augmentation: overlays SARIF findings and weAudit annotations onto code graph nodes
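To make "blast radius" concrete: a common definition is the set of functions that can transitively reach a changed node in the call graph. The sketch below implements that definition on a plain dictionary. It is not Trailmark's API or algorithm, and the function names in the usage example are invented.

```python
from collections import deque


def blast_radius(calls: dict[str, set[str]], changed: str) -> set[str]:
    """Return every function that can transitively reach `changed`."""
    # Invert caller -> callees edges into callee -> callers.
    callers: dict[str, set[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    # Breadth-first walk over the inverted edges, starting at the change.
    seen: set[str] = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for caller in callers.get(node, ()):
            if caller not in seen and caller != changed:
                seen.add(caller)
                queue.append(caller)
    return seen


# Hypothetical call graph: keys call the functions in their value sets.
graph = {
    "main": {"deploy"},
    "deploy": {"sign", "render"},
    "cli": {"render"},
    "sign": set(),
}
affected = blast_radius(graph, "render")
```

Here `affected` contains `deploy`, `cli`, and `main`: every function whose behavior could change if `render` changes.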
Integration with QMD: Trailmark output (structural summaries, taint paths, blast radii) can be ingested into QMD collections, making graph-derived context available to semantic search alongside raw source code.
Pipeline for new projects:
- Run Trailmark parse on the repository (auto-detects languages)
- Run structural analysis to identify hotspots, taint, and blast radius
- Ingest Trailmark summaries + raw source into QMD collections
- Query QMD to surface architectural context during development
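The ingestion step in this pipeline requires graph output in markdown form. A minimal sketch of that conversion follows, assuming a per-node blast-radius count is already available as a dictionary; `summary_to_markdown` and its output layout are hypothetical, not a Trailmark export format.

```python
def summary_to_markdown(repo: str, radii: dict[str, int], top: int = 5) -> str:
    """Render the highest blast-radius nodes as a markdown document."""
    # Sort by descending radius, then by name so the output is deterministic.
    ranked = sorted(radii.items(), key=lambda item: (-item[1], item[0]))[:top]
    lines = [
        f"# Graph summary: {repo}",
        "",
        "| Node | Blast radius |",
        "|---|---|",
    ]
    lines += [f"| {name} | {count} |" for name, count in ranked]
    return "\n".join(lines) + "\n"
```

Deterministic ordering keeps regenerated summaries byte-identical when the graph has not changed, which fits the reproducibility driver above.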
Layer 3: TensorZero — LM Inference Orchestration
TensorZero provides structured, verifiable LM inference through:
- Prompt templates (MiniJinja): version-controlled prompt definitions that separate system instructions from variable context
- Functions and episodes: defined inference workflows with explicit input/output schemas
- Provider hooks: pluggable LM backends (local, remote, distributed) that can be swapped without changing prompt logic
- Episode tracking: each LM interaction is logged with its template, inputs, model, and output — creating an auditable trail
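The audit trail described in the last bullet can be pictured as one record per interaction. The dataclass below is a sketch under assumptions: the field names mirror the items listed above (template, inputs, model, output), but this is not TensorZero's storage schema, and the `digest` method is an illustrative addition.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EpisodeRecord:
    """Hypothetical audit record for a single LM interaction."""
    function_name: str
    template_path: str
    model: str
    inputs: dict  # must be JSON-serializable
    output: str

    def digest(self) -> str:
        """Return a SHA-256 hex digest over a canonical JSON encoding."""
        # sort_keys makes the encoding independent of dict insertion order.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A digest like this gives reviewers a short value to compare against the logged record when checking that a contribution matches the episode it cites.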
Integration pattern:
- TensorZero function definitions reference QMD query results as context variables
- Prompt templates encode Terp Network conventions (ADR format, module spec structure, testing patterns from ADR-2)
- Provider hooks route to the TensorZero inference server layer, which abstracts LM backend selection (local, remote API, distributed) from prompt logic
- Every LM-assisted contribution includes a reference to the TensorZero function/episode that generated it
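Passing QMD query results as context variables might look like the sketch below. The hit shape (`id`, `score`, `text`), the `build_context` helper, and the character budget are assumptions for illustration; neither the QMD output format nor the TensorZero input schema is specified in this ADR.

```python
def build_context(hits: list[dict], max_chars: int = 4000) -> dict:
    """Pack ranked search hits into template variables within a size budget."""
    chosen, used = [], 0
    for hit in sorted(hits, key=lambda h: -h["score"]):
        size = len(hit["text"])
        if used + size > max_chars:
            continue  # skip hits that would exceed the budget
        chosen.append(hit)
        used += size
    return {
        # Each excerpt is labeled with its document ID so the model output
        # can be traced back to specific sources.
        "context": "\n\n".join(f"[{h['id']}]\n{h['text']}" for h in chosen),
        "source_ids": [h["id"] for h in chosen],
    }
```

Returning `source_ids` alongside the rendered context lets the episode log record exactly which documents informed a given output.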
Inference server layer: TensorZero's provider hook architecture separates prompt definitions from inference routing. The inference server layer handles:
- Backend selection and failover across LM providers
- Request routing based on function type and model capabilities
- Session-aware context management across multiple LM interactions within a workflow
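Backend failover, the first responsibility listed above, reduces to trying providers in priority order. The following is a generic sketch of that pattern, not TensorZero's routing code; `infer_with_failover` and `AllBackendsFailed` are hypothetical names, and backends are modeled as plain callables.

```python
from typing import Callable


class AllBackendsFailed(RuntimeError):
    """Raised when every configured backend raised an error."""


def infer_with_failover(
    prompt: str,
    backends: list[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each backend in order and return (backend_name, output)."""
    errors = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real router would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))
```

Returning the backend name with the output keeps the audit trail accurate when a fallback provider, rather than the preferred one, produced the result.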
Developer Tooling Specification
The following tools constitute the Terp Network developer workspace for LM-augmented development:
| Tool | Location or install method | Purpose |
|---|---|---|
| oline CLI | ~/.cargo/bin/oline | Deployment orchestration, node management, IBC relayer ops |
| QMD CLI | ~/.oline/venv/bin/qmd | Semantic search and document indexing |
| oline-qmd wrapper | ~/.oline/scripts/oline-qmd | QMD with o-line venv + global DB preconfigured |
| qmd-bulk-ingest.py | ~/.oline/scripts/qmd-bulk-ingest.py | Bulk source file ingestion (Rust, TOML, YAML) |
| Trailmark | pip install | Code graph construction and security analysis |
| TensorZero | config/tensorzero.toml | LM inference orchestration and prompt management |
| terp-brain | Obsidian vault | Operational knowledge graph linking all domains |
| oline config | ~/.oline/config.toml | Chain, deployment, and node configuration |
Adding a new project to the pipeline:
- Run `trailmark parse` on the new repository
- Run `qmd-bulk-ingest.py --src-dir <repo>/src --collection <name>` to index source
- Index any documentation into the appropriate QMD collection
- Create TensorZero function definitions for LM tasks specific to the project
- Document the new collection and its purpose in this ADR's collection table
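The first two onboarding steps can be collected into a reviewable command plan. This sketch only builds argument lists and never executes them. Several details are assumptions: that `trailmark parse` accepts the repository path as a positional argument, that the ingest script is run with `python3`, and the helper name `onboarding_commands` itself. The `--src-dir` and `--collection` flags come from the steps above.

```python
import os


def onboarding_commands(repo: str, collection: str) -> list[list[str]]:
    """Return the command lines for onboarding a repository, without running them."""
    ingest = os.path.expanduser("~/.oline/scripts/qmd-bulk-ingest.py")
    return [
        # Assumption: the repository path is a positional argument.
        ["trailmark", "parse", repo],
        # Assumption: the script is invoked through python3.
        [
            "python3", ingest,
            "--src-dir", os.path.join(repo, "src"),
            "--collection", collection,
        ],
    ]
```

Printing the plan before running it gives a contributor the chance to confirm the collection name against the collection table.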
Consequences
Positive
- LM contributions are transparent and auditable — every output traces back to a TensorZero episode with known inputs
- Semantic search across the entire Terp surface reduces stale-context errors, provided collections are rebuilt when their sources change
- Trailmark security analysis can surface architectural issues before they reach review
- Local-first design means no central trust bottleneck — anyone can reproduce any query
- The pipeline composes: QMD ↔ Trailmark ↔ TensorZero can be used independently or end-to-end
Negative
- Local QMD + Trailmark setup requires initial effort per contributor (venv, collections, ingestion)
- TensorZero prompt template maintenance is ongoing work
- Federated model means collection schemas must be documented and followed consistently
Neutral / Trade-offs
- QMD's markdown-only requirement means non-markdown files always need wrapping — this is a deliberate trade-off for index simplicity
- Trailmark analysis adds latency to the onboarding of new repositories, but pays off in reduced review cycles
- The inference server layer architecture may evolve as provider hook configurations stabilize
Backwards Compatibility
Fully additive. No existing code, APIs, or on-chain behavior changes. This ADR defines a workflow and toolchain specification — it does not modify terp-core, CosmWasm contracts, or any deployed software. Existing contributors not using LM tooling are unaffected.
Test Cases
- QMD: verify `oline-qmd search 'store migrations' -c abstract` returns ADR-041 and related documents
- QMD: verify `oline-qmd collection list` returns all documented collections with expected document counts
- Trailmark: verify `trailmark parse` on o-line source produces a graph with expected node/edge counts
- Trailmark: verify blast radius analysis identifies the upgrade handler as a high-centrality node
- TensorZero: verify a function definition with MiniJinja template renders correctly with variable substitution
- Integration: verify QMD query results can be passed as context to a TensorZero function and produce a valid LM output
- Bulk ingest: verify `qmd-bulk-ingest.py --dry-run` lists expected .rs files without writing to DB
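The collection-count check could be automated roughly as follows. The expected values are taken from the collection table in this ADR. Treating them as minimums (because `abstract` is listed as "90+") and the `count_mismatches` helper are assumptions, and parsing the output of `oline-qmd collection list` is left out because its format is not specified here.

```python
# Document counts from the collection table, treated as lower bounds.
EXPECTED_MIN_COUNTS = {
    "oline-rust": 74,
    "oline-docs": 9,
    "oline-configs": 5,
    "oline-sdls": 19,
    "abstract": 90,
}


def count_mismatches(observed: dict[str, int]) -> list[str]:
    """Return one message per collection that is missing or too small."""
    problems = []
    for name, minimum in EXPECTED_MIN_COUNTS.items():
        if name not in observed:
            problems.append(f"{name}: missing")
        elif observed[name] < minimum:
            problems.append(f"{name}: {observed[name]} < {minimum}")
    return problems
```

An empty result means every documented collection is present with at least the documented number of documents.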
Further Discussions / Open Questions
- Should QMD collections be version-locked to git commits/tags for full reproducibility?
- What is the minimum TensorZero episode metadata that must accompany an LM-assisted PR?
- Should inference server session logs be ingested into QMD for after-action review?
- How to handle QMD collection schema migrations when the ingestion format changes?
- Community contribution guidelines: what disclosure is required when using LM tooling?
References
- ADR-1: Standard Template and Design Guidelines
- ADR-2: Expected Testing Library Design
- `~/.oline/scripts/oline-qmd`: QMD wrapper script
- `~/.oline/scripts/qmd-bulk-ingest.py`: bulk ingestion script
- `~/.oline/config.toml`: o-line configuration
- TensorZero documentation: https://www.tensorzero.com/docs
- Trailmark skill documentation (Hermes skill: trailmark)
- QMD setup notes: Homebrew Python 3.13 venv at `~/.oline/venv/`, global DB at `~/.oline/semantic/qmd-global.db`