How Terp Network uses language models, semantic search, and code graph analysis to contribute to the protocol — transparently, verifiably, and with human accountability

ADR 9: LM-Augmented Development and Semantic Tooling

Dependencies

ADR-1: Standard Template and Design Guidelines — this ADR follows the format defined in ADR-1
ADR-2: Expected Testing Library Design — TensorZero prompt templates encode testing patterns from ADR-2; LM-generated code must pass ADR-2 test gates
ADR-5: HashMerchant — QMD collections include HashMerchant contract source and documentation; Trailmark blast radius analysis covers HashMerchant's privilege boundaries

This ADR defines integrates language models (LMs), semantic search, and code graph analysis into the development workflow. It establishes the toolchain, data flow, and accountability requirements for any LM-assisted contribution — whether from internal agents, community participants, or automated pipelines. The goal is transparency: anyone interacting with Terp Network through LM tooling should understand what surfaces exist, how semantic data is curated, and how contributions are verified before merging.

Context and Problem Statement

Terp Network operates across a complex software surface: a Cosmos SDK chain (terp-core), CosmWasm contracts (cw-infuser, cw-shitstrap), deployment orchestration (o-line), frontend applications (terp-gui, terp-web-ui), and documentation. This surface exceeds what any single contributor can hold in working memory. Language models can accelerate development, but only if they have accurate, current context about the codebase and protocol decisions. Without a defined pipeline, LM contributions risk being untraceable, based on stale information, or inconsistent with settled architectural decisions.

The core problem: how do we make LM-assisted development transparent, auditable, and effective — so that the community can see what tooling is in use, what data feeds it, and how outputs are verified?

Decision Drivers

Transparency: LM usage in development must be discoverable and auditable by the community
Accuracy: LMs must operate on current, verified context — not stale or hallucinated state
Reproducibility: Any semantic query or code graph analysis must be reproducible by another party
Minimal friction: The pipeline must not add bureaucratic overhead that discourages adoption
Composability: Each tool in the pipeline must be usable independently and in combination
Security: LM-generated code and documentation must pass the same review gates as human-generated content
Multi-repo awareness: The pipeline must span terp-core, CosmWasm contracts, o-line, and documentation simultaneously

Considered Options / Alternatives

Ad-hoc LM usage per contributor: no shared context, no audit trail. Fast but unaccountable and inconsistent.
Centralized RAG service: single hosted vector DB that all contributors query. Consistent but creates a trust bottleneck and single point of failure.
Federated semantic pipeline with local-first tooling (selected): each contributor runs QMD locally against shared collection schemas; Trailmark builds reproducible code graphs; TensorZero orchestrates LM inference with verifiable prompt templates. Transparent, auditable, and no central trust requirement.

Decision Outcome

We adopt the federated semantic pipeline as the standard for LM-augmented development on Terp Network. This consists of three layers:

Layer 1: QMD — Semantic Search and Data Curation

QMD (Query Markdown Documents) is the primary semantic search engine for all Terp Network knowledge. It indexes:

Source code — Rust (.rs), Go (.go), and other source files wrapped in markdown code fences before ingestion (QMD is markdown-native; non-markdown files require wrapping)
Documentation — ADRs, module specs, guides, brain notes
Configuration — TOML, YAML, JSON config files (wrapped in markdown fences)

Collections and their purposes:

Collection	Content	Document Count	Purpose
oline-rust	O-line Rust source (wrapped)	74	Deployment orchestration codebase
oline-docs	O-line documentation	9	Operational procedures
oline-configs	Configuration files (wrapped)	5	Deployment and node configuration
oline-sdls	SDL templates (wrapped)	19	Akash deployment manifests
abstract	Cosmos SDK ADRs, IBC-go ADRs, ziavl ADRs, terp-core module specs	90+	Upstream decision landscape

Data curation workflow:

Source files are wrapped as markdown via qmd-bulk-ingest.py (handles .rs, TOML, YAML)
Documents are ingested into named collections via oline-qmd wrapper script
Collections are queryable via BM25 keyword search (oline-qmd search) or hybrid semantic search (oline-qmd query)
All collections follow a shared schema convention — document IDs are path-based, enabling cross-referencing between code and documentation

Local-first, reproducible: Every contributor runs QMD against the same collection definitions. The global DB lives at ~/.oline/semantic/qmd-global.db. Collection contents can be rebuilt from source at any time using the bulk ingest scripts.

Layer 2: Trailmark — Code Graph Analysis

Trailmark builds multi-language source code graphs from Terp Network repositories. It provides:

Structural analysis: call graphs, class hierarchies, module dependency maps, complexity heatmaps
Security analysis: blast radius calculation, taint propagation, privilege boundary mapping, entry point enumeration
Evolution tracking: graph comparison between commits/tags to surface security-relevant structural changes
Audit augmentation: overlays SARIF findings and weAudit annotations onto code graph nodes

Integration with QMD: Trailmark output (structural summaries, taint paths, blast radii) can be ingested into QMD collections, making graph-derived context available to semantic search alongside raw source code.

Pipeline for new projects:

Run Trailmark parse on the repository (auto-detects languages)
Run structural analysis to identify hotspots, taint, and blast radius
Ingest Trailmark summaries + raw source into QMD collections
Query QMD to surface architectural context during development

Layer 3: TensorZero — LM Inference Orchestration

TensorZero provides structured, verifiable LM inference through:

Prompt templates (MiniJinja): version-controlled prompt definitions that separate system instructions from variable context
Functions and episodes: defined inference workflows with explicit input/output schemas
Provider hooks: pluggable LM backends (local, remote, distributed) that can be swapped without changing prompt logic
Episode tracking: each LM interaction is logged with its template, inputs, model, and output — creating an auditable trail

Integration pattern:

TensorZero function definitions reference QMD query results as context variables
Prompt templates encode Terp Network conventions (ADR format, module spec structure, testing patterns from ADR-2)
Provider hooks route to the TensorZero inference server layer, which abstracts LM backend selection (local, remote API, distributed) from prompt logic
Every LM-assisted contribution includes a reference to the TensorZero function/episode that generated it

Inference server layer: TensorZero's provider hook architecture separates prompt definitions from inference routing. The inference server layer handles:

Backend selection and failover across LM providers
Request routing based on function type and model capabilities
Session-aware context management across multiple LM interactions within a workflow

Developer Tooling Specification

The following tools constitute the Terp Network developer workspace for LM-augmented development:

Tool	Path	Purpose
oline CLI	`~/.cargo/bin/oline`	Deployment orchestration, node management, IBC relayer ops
QMD CLI	`~/.oline/venv/bin/qmd`	Semantic search and document indexing
oline-qmd wrapper	`~/.oline/scripts/oline-qmd`	QMD with o-line venv + global DB preconfigured
qmd-bulk-ingest.py	`~/.oline/scripts/qmd-bulk-ingest.py`	Bulk source file ingestion (Rust, TOML, YAML)
Trailmark	pip install	Code graph construction and security analysis
TensorZero	config/tensorzero.toml	LM inference orchestration and prompt management
terp-brain	Obsidian vault	Operational knowledge graph linking all domains
oline config	`~/.oline/config.toml`	Chain, deployment, and node configuration

Adding a new project to the pipeline:

Run trailmark parse on the new repository
Run qmd-bulk-ingest.py --src-dir <repo>/src --collection <name> to index source
Index any documentation into the appropriate QMD collection
Create TensorZero function definitions for LM tasks specific to the project
Document the new collection and its purpose in this ADR's collection table

Consequences

Positive

LM contributions are transparent and auditable — every output traces back to a TensorZero episode with known inputs
Semantic search across the entire Terp surface eliminates stale-context errors
Trailmark security analysis catches architectural issues before they reach review
Local-first design means no central trust bottleneck — anyone can reproduce any query
The pipeline composes: QMD ↔ Trailmark ↔ TensorZero can be used independently or end-to-end

Negative

Local QMD + Trailmark setup requires initial effort per contributor (venv, collections, ingestion)
TensorZero prompt template maintenance is ongoing work
Federated model means collection schemas must be documented and followed consistently

Neutral / Trade-offs

QMD's markdown-only requirement means non-markdown files always need wrapping — this is a deliberate trade-off for index simplicity
Trailmark analysis adds latency to the onboarding of new repositories, but pays off in reduced review cycles
The inference server layer architecture may evolve as provider hook configurations stabilize

Backwards Compatibility

Fully additive. No existing code, APIs, or on-chain behavior changes. This ADR defines a workflow and toolchain specification — it does not modify terp-core, CosmWasm contracts, or any deployed software. Existing contributors not using LM tooling are unaffected.

Test Cases

QMD: verify oline-qmd search 'store migrations' -c abstract returns ADR-041 and related documents
QMD: verify oline-qmd collection list returns all documented collections with expected document counts
Trailmark: verify trailmark parse on o-line source produces a graph with expected node/edge counts
Trailmark: verify blast radius analysis identifies the upgrade handler as a high-centrality node
TensorZero: verify a function definition with MiniJinja template renders correctly with variable substitution
Integration: verify QMD query results can be passed as context to a TensorZero function and produce a valid LM output
Bulk ingest: verify qmd-bulk-ingest.py --dry-run lists expected .rs files without writing to DB

Further Discussions / Open Questions

Should QMD collections be version-locked to git commits/tags for full reproducibility?
What is the minimum TensorZero episode metadata that must accompany an LM-assisted PR?
Should inference server session logs be ingested into QMD for after-action review?
How to handle QMD collection schema migrations when the ingestion format changes?
Community contribution guidelines: what disclosure is required when using LM tooling?

References

ADR-1: Standard Template and Design Guidelines
ADR-2: Expected Testing Library Design
~/.oline/scripts/oline-qmd — QMD wrapper script
~/.oline/scripts/qmd-bulk-ingest.py — bulk ingestion script
~/.oline/config.toml — o-line configuration
TensorZero documentation: https://www.tensorzero.com/docs
Trailmark skill documentation (Hermes skill: trailmark)
QMD setup notes: Homebrew Python 3.13 venv at ~/.oline/venv/, global DB at ~/.oline/semantic/qmd-global.db

ADR-9: LM-Augmented Development and Semantic Tooling