Terp Network Docs

ADR-9: LM-Augmented Development and Semantic Tooling

How Terp Network uses language models, semantic search, and code graph analysis to contribute to the protocol — transparently, verifiably, and with human accountability

Changelog

  • 2026-05-11: Initial draft

Status

DRAFT

Dependencies

  • ADR-1: Standard Template and Design Guidelines — this ADR follows the format defined in ADR-1
  • ADR-2: Expected Testing Library Design — TensorZero prompt templates encode testing patterns from ADR-2; LM-generated code must pass ADR-2 test gates
  • ADR-5: HashMerchant — QMD collections include HashMerchant contract source and documentation; Trailmark blast radius analysis covers HashMerchant's privilege boundaries

Abstract

This ADR defines how Terp Network integrates language models (LMs), semantic search, and code graph analysis into its development workflow. It establishes the toolchain, data flow, and accountability requirements for any LM-assisted contribution, whether from internal agents, community participants, or automated pipelines. The goal is transparency: anyone interacting with Terp Network through LM tooling should understand what surfaces exist, how semantic data is curated, and how contributions are verified before merging.

Context and Problem Statement

Terp Network operates across a complex software surface: a Cosmos SDK chain (terp-core), CosmWasm contracts (cw-infuser, cw-shitstrap), deployment orchestration (o-line), frontend applications (terp-gui, terp-web-ui), and documentation. This surface exceeds what any single contributor can hold in working memory. Language models can accelerate development, but only if they have accurate, current context about the codebase and protocol decisions. Without a defined pipeline, LM contributions risk being untraceable, based on stale information, or inconsistent with settled architectural decisions.

The core problem: how do we make LM-assisted development transparent, auditable, and effective — so that the community can see what tooling is in use, what data feeds it, and how outputs are verified?

Decision Drivers

  • Transparency: LM usage in development must be discoverable and auditable by the community
  • Accuracy: LMs must operate on current, verified context — not stale or hallucinated state
  • Reproducibility: Any semantic query or code graph analysis must be reproducible by another party
  • Minimal friction: The pipeline must not add bureaucratic overhead that discourages adoption
  • Composability: Each tool in the pipeline must be usable independently and in combination
  • Security: LM-generated code and documentation must pass the same review gates as human-generated content
  • Multi-repo awareness: The pipeline must span terp-core, CosmWasm contracts, o-line, and documentation simultaneously

Considered Options / Alternatives

  • Ad-hoc LM usage per contributor: no shared context, no audit trail. Fast but unaccountable and inconsistent.
  • Centralized RAG service: single hosted vector DB that all contributors query. Consistent but creates a trust bottleneck and single point of failure.
  • Federated semantic pipeline with local-first tooling (selected): each contributor runs QMD locally against shared collection schemas; Trailmark builds reproducible code graphs; TensorZero orchestrates LM inference with verifiable prompt templates. Transparent, auditable, and no central trust requirement.

Decision Outcome

We adopt the federated semantic pipeline as the standard for LM-augmented development on Terp Network. This consists of three layers:

Layer 1: QMD — Semantic Search and Data Curation

QMD (Query Markdown Documents) is the primary semantic search engine for all Terp Network knowledge. It indexes:

  • Source code — Rust (.rs), Go (.go), and other source files wrapped in markdown code fences before ingestion (QMD is markdown-native; non-markdown files require wrapping)
  • Documentation — ADRs, module specs, guides, brain notes
  • Configuration — TOML, YAML, JSON config files (wrapped in markdown fences)
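
As a rough illustration of the wrapping step, the sketch below turns one source file into a markdown document. It is not the code of qmd-bulk-ingest.py: the function names (wrap_as_markdown, longest_backtick_run), the path heading, and the extension-to-language map are assumptions made for this example.

```python
from pathlib import Path

# Hypothetical extension-to-language map; the real ingest script may differ.
FENCE_LANG = {
    ".rs": "rust",
    ".go": "go",
    ".toml": "toml",
    ".yaml": "yaml",
    ".yml": "yaml",
    ".json": "json",
}


def longest_backtick_run(text: str) -> int:
    """Return the length of the longest run of consecutive backticks."""
    longest = current = 0
    for ch in text:
        current = current + 1 if ch == "`" else 0
        longest = max(longest, current)
    return longest


def wrap_as_markdown(rel_path: str, source: str) -> str:
    """Wrap raw source text in a fenced block under a path heading."""
    lang = FENCE_LANG.get(Path(rel_path).suffix.lower(), "")
    # The fence must be longer than any backtick run inside the content,
    # otherwise the content could close the wrapper early.
    fence = "`" * max(3, longest_backtick_run(source) + 1)
    body = source if source.endswith("\n") else source + "\n"
    return f"# {rel_path}\n\n{fence}{lang}\n{body}{fence}\n"
```

Lengthening the fence matters for Rust doc comments and markdown-in-strings, which often contain backtick runs of their own.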

Collections and their purposes:

| Collection | Content | Document Count | Purpose |
| --- | --- | --- | --- |
| oline-rust | O-line Rust source (wrapped) | 74 | Deployment orchestration codebase |
| oline-docs | O-line documentation | 9 | Operational procedures |
| oline-configs | Configuration files (wrapped) | 5 | Deployment and node configuration |
| oline-sdls | SDL templates (wrapped) | 19 | Akash deployment manifests |
| abstract | Cosmos SDK ADRs, IBC-go ADRs, ziavl ADRs, terp-core module specs | 90+ | Upstream decision landscape |

Data curation workflow:

  1. Source files are wrapped as markdown via qmd-bulk-ingest.py (handles .rs, TOML, YAML)
  2. Documents are ingested into named collections via oline-qmd wrapper script
  3. Collections are queryable via BM25 keyword search (oline-qmd search) or hybrid semantic search (oline-qmd query)
  4. All collections follow a shared schema convention — document IDs are path-based, enabling cross-referencing between code and documentation
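
One way to read the path-based ID convention in step 4 is sketched below. The exact ID format used by QMD collections is not specified here, so document_id and find_references are hypothetical helpers, and the file paths in the usage note are invented for illustration.

```python
from pathlib import PurePosixPath


def document_id(repo_root: str, file_path: str) -> str:
    """Derive a stable ID from a file's path relative to its repository root.

    Raises ValueError if file_path is not inside repo_root.
    """
    rel = PurePosixPath(file_path).relative_to(PurePosixPath(repo_root))
    return rel.as_posix()


def find_references(text: str, known_ids: set[str]) -> list[str]:
    """Return the known document IDs mentioned verbatim in a piece of text."""
    return [doc_id for doc_id in sorted(known_ids) if doc_id in text]
```

Because the ID is just the relative path, a documentation page that mentions, say, src/deploy/akash.rs can be linked to the indexed source file with a plain substring match: find_references(page_text, ids) returns every indexed path the page names.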

Local-first, reproducible: Every contributor runs QMD against the same collection definitions. The global DB lives at ~/.oline/semantic/qmd-global.db. Collection contents can be rebuilt from source at any time using the bulk ingest scripts.

Layer 2: Trailmark — Code Graph Analysis

Trailmark builds multi-language source code graphs from Terp Network repositories. It provides:

  • Structural analysis: call graphs, class hierarchies, module dependency maps, complexity heatmaps
  • Security analysis: blast radius calculation, taint propagation, privilege boundary mapping, entry point enumeration
  • Evolution tracking: graph comparison between commits/tags to surface security-relevant structural changes
  • Audit augmentation: overlays SARIF findings and weAudit annotations onto code graph nodes
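
To make "blast radius" concrete, the sketch below uses one common definition: the set of functions that can transitively reach a changed function in the call graph. This is a generic illustration, not Trailmark's algorithm or API. The edge-list input and the name blast_radius are assumptions.

```python
from collections import defaultdict, deque


def blast_radius(call_edges, changed):
    """Return every function that can transitively call `changed`.

    call_edges is an iterable of (caller, callee) pairs.
    """
    # Reverse the graph: map each callee to its direct callers.
    callers = defaultdict(set)
    for caller, callee in call_edges:
        callers[callee].add(caller)

    # Breadth-first search over the reversed edges; `seen` guards cycles.
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for parent in callers[node]:
            if parent not in seen and parent != changed:
                seen.add(parent)
                queue.append(parent)
    return seen
```

A node whose result set is large relative to the graph is a natural candidate for closer review, which is the intuition behind treating high-centrality nodes as hotspots.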

Integration with QMD: Trailmark output (structural summaries, taint paths, blast radii) can be ingested into QMD collections, making graph-derived context available to semantic search alongside raw source code.
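
A graph-derived summary has to become markdown before QMD can index it. The sketch below shows one possible shape for such a document. The layout, the function name summary_to_markdown, and its arguments are assumptions; Trailmark's actual output format is not described in this ADR.

```python
def summary_to_markdown(function: str, source_path: str, radius) -> str:
    """Render a per-function graph summary as a small markdown document."""
    lines = [
        f"# Graph summary: {function}",
        "",
        f"Source: {source_path}",
        "",
        f"Blast radius ({len(radius)} callers):",
        "",
    ]
    # Sort for deterministic output so rebuilt collections are comparable.
    lines += [f"- {name}" for name in sorted(radius)]
    return "\n".join(lines) + "\n"
```

Keeping the source path in the document body lets a semantic query that lands on the summary lead back to the raw source file.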

Pipeline for new projects:

  1. Run Trailmark parse on the repository (auto-detects languages)
  2. Run structural analysis to identify hotspots, taint, and blast radius
  3. Ingest Trailmark summaries + raw source into QMD collections
  4. Query QMD to surface architectural context during development

Layer 3: TensorZero — LM Inference Orchestration

TensorZero provides structured, verifiable LM inference through:

  • Prompt templates (MiniJinja): version-controlled prompt definitions that separate system instructions from variable context
  • Functions and episodes: defined inference workflows with explicit input/output schemas
  • Provider hooks: pluggable LM backends (local, remote, distributed) that can be swapped without changing prompt logic
  • Episode tracking: each LM interaction is logged with its template, inputs, model, and output — creating an auditable trail
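
The audit trail described in the last bullet amounts to recording four things per interaction: template, inputs, model, and output. The sketch below shows a minimal record of that kind. It is not TensorZero's storage schema; EpisodeRecord, make_record, and to_json_line are hypothetical names, and hashing the template text is an assumption about how a template version could be pinned.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EpisodeRecord:
    function_name: str
    template_sha256: str  # fingerprint of the exact template text used
    inputs: dict
    model: str
    output: str


def make_record(function_name, template_text, inputs, model, output):
    """Build a record, pinning the template by content hash."""
    digest = hashlib.sha256(template_text.encode("utf-8")).hexdigest()
    return EpisodeRecord(function_name, digest, inputs, model, output)


def to_json_line(record: EpisodeRecord) -> str:
    """Serialize deterministically so identical records compare equal."""
    return json.dumps(asdict(record), sort_keys=True)
```

Hashing the template rather than storing its name means a reviewer can check that the version-controlled template at a given commit is the one that produced the output.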

Integration pattern:

  1. TensorZero function definitions reference QMD query results as context variables
  2. Prompt templates encode Terp Network conventions (ADR format, module spec structure, testing patterns from ADR-2)
  3. Provider hooks route to the TensorZero inference server layer, which abstracts LM backend selection (local, remote API, distributed) from prompt logic
  4. Every LM-assisted contribution includes a reference to the TensorZero function/episode that generated it
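
Step 1 of this pattern implies some glue that turns search hits into template variables. The sketch below is one way that glue might look. The (doc_id, score, text) result shape, the character budget, and the name build_context are assumptions, since neither the QMD result format nor the TensorZero input schema is fixed in this ADR.

```python
def build_context(results, max_chars=4000):
    """Pack the highest-scoring search hits into a bounded context string.

    results is a list of (doc_id, score, text) tuples (hypothetical shape).
    Returns the context plus the list of document IDs actually included.
    """
    chunks, sources, used = [], [], 0
    ranked = sorted(results, key=lambda hit: hit[1], reverse=True)
    for doc_id, _score, text in ranked:
        block = f"## {doc_id}\n{text.strip()}\n\n"
        if used + len(block) > max_chars:
            continue  # too large for the remaining budget; try smaller hits
        chunks.append(block)
        sources.append(doc_id)
        used += len(block)
    return {"context": "".join(chunks), "sources": sources}
```

Returning the included document IDs alongside the context gives step 4 something concrete to cite: the episode reference can list exactly which documents the model saw.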

Inference server layer: TensorZero's provider hook architecture separates prompt definitions from inference routing. The inference server layer handles:

  • Backend selection and failover across LM providers
  • Request routing based on function type and model capabilities
  • Session-aware context management across multiple LM interactions within a workflow
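
Backend selection with failover can be pictured as trying providers in priority order until one succeeds. The sketch below illustrates that behaviour only. It does not use TensorZero's configuration or client API, and infer_with_failover, AllProvidersFailed, and the provider callables are hypothetical.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider returned an error."""


def infer_with_failover(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) provider in order; return (name, output)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider error triggers failover
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name with the output keeps the audit trail honest: the logged model is the one that actually answered, not the one that was tried first.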

Developer Tooling Specification

The following tools constitute the Terp Network developer workspace for LM-augmented development:

| Tool | Path | Purpose |
| --- | --- | --- |
| oline CLI | ~/.cargo/bin/oline | Deployment orchestration, node management, IBC relayer ops |
| QMD CLI | ~/.oline/venv/bin/qmd | Semantic search and document indexing |
| oline-qmd wrapper | ~/.oline/scripts/oline-qmd | QMD with o-line venv + global DB preconfigured |
| qmd-bulk-ingest.py | ~/.oline/scripts/qmd-bulk-ingest.py | Bulk source file ingestion (Rust, TOML, YAML) |
| Trailmark | pip install | Code graph construction and security analysis |
| TensorZero | config/tensorzero.toml | LM inference orchestration and prompt management |
| terp-brain | Obsidian vault | Operational knowledge graph linking all domains |
| oline config | ~/.oline/config.toml | Chain, deployment, and node configuration |

Adding a new project to the pipeline:

  1. Run trailmark parse on the new repository
  2. Run qmd-bulk-ingest.py --src-dir <repo>/src --collection <name> to index source
  3. Index any documentation into the appropriate QMD collection
  4. Create TensorZero function definitions for LM tasks specific to the project
  5. Document the new collection and its purpose in this ADR's collection table
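
Steps 1 and 2 can be scripted so that every onboarding runs the same commands. The sketch below only builds and prints the command lines and does not execute anything. The flags --src-dir and --collection come from step 2; passing the repository as a positional argument to trailmark parse and launching the ingest script through python3 are assumptions, as are the helper names.

```python
import os
import shlex

# Script location taken from the tooling table above.
INGEST_SCRIPT = "~/.oline/scripts/qmd-bulk-ingest.py"


def onboarding_commands(repo: str, collection: str) -> list[list[str]]:
    """Return argv lists for the parse and ingest steps, in order."""
    ingest = os.path.expanduser(INGEST_SCRIPT)
    src_dir = os.path.join(repo, "src")
    return [
        # Assumption: trailmark parse takes the repository path positionally.
        ["trailmark", "parse", repo],
        # Assumption: the ingest script is run with the python3 interpreter.
        ["python3", ingest, "--src-dir", src_dir, "--collection", collection],
    ]


def render(commands: list[list[str]]) -> str:
    """Format argv lists as shell-safe lines for review before running."""
    return "\n".join(shlex.join(cmd) for cmd in commands)
```

Printing the plan with render(onboarding_commands(repo, name)) before running it gives a reviewable record of how a collection was built, which supports the rebuild-from-source guarantee.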

Consequences

Positive

  • LM contributions are transparent and auditable — every output traces back to a TensorZero episode with known inputs
  • Semantic search across the entire Terp surface reduces stale-context errors, provided collections are re-ingested when sources change
  • Trailmark security analysis can surface architectural issues before they reach review
  • Local-first design means no central trust bottleneck: anyone holding the same collection contents can reproduce a query
  • The pipeline composes: QMD ↔ Trailmark ↔ TensorZero can be used independently or end-to-end

Negative

  • Local QMD + Trailmark setup requires initial effort per contributor (venv, collections, ingestion)
  • TensorZero prompt template maintenance is ongoing work
  • Federated model means collection schemas must be documented and followed consistently

Neutral / Trade-offs

  • QMD's markdown-only requirement means non-markdown files always need wrapping — this is a deliberate trade-off for index simplicity
  • Trailmark analysis adds latency to the onboarding of new repositories, but pays off in reduced review cycles
  • The inference server layer architecture may evolve as provider hook configurations stabilize

Backwards Compatibility

Fully additive. No existing code, APIs, or on-chain behavior changes. This ADR defines a workflow and toolchain specification — it does not modify terp-core, CosmWasm contracts, or any deployed software. Existing contributors not using LM tooling are unaffected.

Test Cases

  • QMD: verify oline-qmd search 'store migrations' -c abstract returns ADR-041 and related documents
  • QMD: verify oline-qmd collection list returns all documented collections with expected document counts
  • Trailmark: verify trailmark parse on o-line source produces a graph with expected node/edge counts
  • Trailmark: verify blast radius analysis identifies the upgrade handler as a high-centrality node
  • TensorZero: verify a function definition with MiniJinja template renders correctly with variable substitution
  • Integration: verify QMD query results can be passed as context to a TensorZero function and produce a valid LM output
  • Bulk ingest: verify qmd-bulk-ingest.py --dry-run lists expected .rs files without writing to DB
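
The collection-count check can be automated once the counts are available as a mapping. The sketch below compares observed counts with the documented ones, treating "90+" as a minimum. Parsing the output of oline-qmd collection list is deliberately left out because its format is not specified here, and check_collection_counts is a hypothetical helper.

```python
# Documented counts from the collection table: (count, is_minimum).
EXPECTED = {
    "oline-rust": (74, False),
    "oline-docs": (9, False),
    "oline-configs": (5, False),
    "oline-sdls": (19, False),
    "abstract": (90, True),  # documented as "90+"
}


def check_collection_counts(expected, actual):
    """Return a list of human-readable mismatches; empty means all good."""
    problems = []
    for name, (count, is_minimum) in expected.items():
        found = actual.get(name)
        if found is None:
            problems.append(f"{name}: collection missing")
        elif is_minimum and found < count:
            problems.append(f"{name}: expected at least {count}, found {found}")
        elif not is_minimum and found != count:
            problems.append(f"{name}: expected {count}, found {found}")
    return problems
```

Exact counts will drift as sources change, so in practice the expected values would need to be updated together with the collection table.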

Further Discussions / Open Questions

  • Should QMD collections be version-locked to git commits/tags for full reproducibility?
  • What is the minimum TensorZero episode metadata that must accompany an LM-assisted PR?
  • Should inference server session logs be ingested into QMD for after-action review?
  • How to handle QMD collection schema migrations when the ingestion format changes?
  • Community contribution guidelines: what disclosure is required when using LM tooling?

References

  • ADR-1: Standard Template and Design Guidelines
  • ADR-2: Expected Testing Library Design
  • ~/.oline/scripts/oline-qmd — QMD wrapper script
  • ~/.oline/scripts/qmd-bulk-ingest.py — bulk ingestion script
  • ~/.oline/config.toml — o-line configuration
  • TensorZero documentation: https://www.tensorzero.com/docs
  • Trailmark skill documentation (Hermes skill: trailmark)
  • QMD setup notes: Homebrew Python 3.13 venv at ~/.oline/venv/, global DB at ~/.oline/semantic/qmd-global.db
