Multi-Agent Orchestration: The Compliance Harness Pattern
1. The pattern
A single spec. N implementations. One access log to audit.
The orchestrator agent decomposes a compliance spec into per-language tasks, delegates to builder subagents, and verifies round-trip via server-side access logs. The spec operator observes from a separate machine.
                 ┌─────────────┐
                 │    Spec     │   wal.sh/research/bots/compliance-spec
                 │ (org-mode)  │   Authoritative. Language-agnostic.
                 └──────┬──────┘
                        │
              ┌─────────┴─────────┐
              │   Orchestrator    │   claude --dangerously-skip-permissions
              │   (mini, macOS)   │   on walsh/compliance-harness
              └─┬───┬───┬───┬───┬─┘
                │   │   │   │   │
         ┌──────┘   │   │   │   └──────┐
         ▼          ▼   ▼   ▼          ▼
      python        go rust java     guile
    typescript   ruby swift clojure  elisp
         │          │   │   │          │
         └──────────┴───┴───┴──────────┘
                        │
                        ▼
               ┌─────────────────┐
               │   wal.sh VPS    │   Access logs
               │  (production)   │   ?impl={org/repo}/{lang}&sha=&spec=
               └────────┬────────┘
                        │
               ┌────────┴────────┐
               │    Observer     │   nexus (FreeBSD)
               │   gmake logs    │   ao auditor
               │    aq gossip    │   Validates R13 tags landed
               └─────────────────┘
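The query-string tag shown beside the VPS box could be built roughly as follows. This is a minimal sketch rather than the harness's own code: the function name `tagged_url` and the example repository, SHA, and spec values are hypothetical, and only the parameter names `impl`, `sha`, and `spec` come from the diagram.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def tagged_url(base_url: str, org_repo: str, lang: str, sha: str, spec: str) -> str:
    """Append impl/sha/spec tags so the request is identifiable in access logs."""
    parts = urlsplit(base_url)
    tags = urlencode(
        {"impl": f"{org_repo}/{lang}", "sha": sha, "spec": spec},
        safe="/",  # keep the slashes in org/repo/lang readable in the log
    )
    # Preserve any query string the URL already carries.
    query = f"{parts.query}&{tags}" if parts.query else tags
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


# Hypothetical values, for illustration only.
print(tagged_url("https://wal.sh/", "walsh/compliance-harness", "python", "abc1234", "1.3"))
```

Building the tag with `urlencode` rather than string concatenation keeps the log entry parseable even if a value contains reserved characters.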
2. CPRR cycle per language
For each language implementation:
- Conjecture: claim the task (`bd update --claim`). For REPL languages, open a REPL and explore the HTTP library interactively.
- Proof: write the implementation (<200 lines). Commit immediately: `feat({lang}): gate pipeline passes`.
- Refinement: run against production. Verify the R13 tag in the access log. If rate limiting, robots parsing, or backoff deviates from the spec, fix and recommit.
- Refutation: if it fails, document why (`git notes add`). If the spec is ambiguous, note it. The spec is the contract; the implementation must conform or the spec must clarify.
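The Refinement step's check for an R13 tag might look like the sketch below. It assumes the server writes the request as a quoted `"METHOD target PROTOCOL"` string, as in the common log format. The function name `r13_tags` is hypothetical, and the actual log layout on the VPS is not stated in this document.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Assumption: the request line appears quoted, e.g. "GET /path?query HTTP/1.1".
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (?P<target>\S+) HTTP/[\d.]+"')


def r13_tags(log_line: str) -> Optional[dict]:
    """Return the impl/sha/spec tags from one access-log line, or None if any is absent."""
    match = REQUEST_RE.search(log_line)
    if match is None:
        return None
    # parse_qs drops blank values by default, so an empty "sha=" counts as missing.
    params = parse_qs(urlsplit(match.group("target")).query)
    try:
        return {key: params[key][0] for key in ("impl", "sha", "spec")}
    except KeyError:
        return None
```

A line whose tags are blank or missing returns `None`, which is the case the Refinement step would need to fix and recommit.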
3. Dot product decomposition
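The work is the product of two dimensions (strictly a Cartesian product rather than a dot product, since every requirement is paired with every language):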
- Tasks: R1 (UA), R2 (robots), R3 (blocklist), R4 (throttle), R5 (backoff), R13 (tagging)
- Languages: 10-20 implementations
Each cell is independently testable. The orchestrator parallelizes across languages (columns), not across requirements (rows) — because the gate pipeline order is normative and non-commutative.
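One way to picture this scheduling is the sketch below: languages run concurrently while the gates inside each language run in a fixed order. It assumes the order in which the requirements are listed above is the normative one and that a failed gate stops the pipeline for that language. `run_gate` is a hypothetical placeholder, not part of the harness.

```python
from concurrent.futures import ThreadPoolExecutor

# Gate order is normative, so requirements stay an ordered list.
REQUIREMENTS = ["R1", "R2", "R3", "R4", "R5", "R13"]
LANGUAGES = ["python", "go", "rust"]  # a subset of the languages in the diagram


def run_gate(lang: str, req: str) -> bool:
    """Hypothetical placeholder: a real harness would exercise the implementation here."""
    return True


def run_language(lang: str) -> tuple[str, list[tuple[str, bool]]]:
    """Run every gate for one language in order, stopping at the first failure."""
    results = []
    for req in REQUIREMENTS:
        passed = run_gate(lang, req)
        results.append((req, passed))
        if not passed:
            break  # assumption: later gates are not meaningful after a failure
    return lang, results


def run_matrix() -> dict[str, list[tuple[str, bool]]]:
    """Languages (columns) run concurrently; gates (rows) run sequentially per language."""
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(run_language, LANGUAGES))
```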
4. Session recovery
Agents lose context on compaction. Recovery protocol:
    cat CLAUDE.md                    # contract and workflow
    bd list --status=in_progress     # what was being worked on
    git log --oneline -20            # what's committed
    git notes list                   # what was tried/failed
    gmake matrix                     # which languages pass
    aq status                        # who else is working
The state is in git (commits + notes), beads (tasks), and access logs (R13 tags). No ephemeral state needed.
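The recovery commands above could be wrapped in a small script so that a fresh session gathers all of the state in one pass. This is only a sketch: the helper names `run_step` and `recover` are hypothetical, and treating a missing tool as `None` instead of an error is an assumption.

```python
import shutil
import subprocess
from typing import Optional

# The six recovery commands from the protocol above, in the same order.
RECOVERY_STEPS = [
    ["cat", "CLAUDE.md"],
    ["bd", "list", "--status=in_progress"],
    ["git", "log", "--oneline", "-20"],
    ["git", "notes", "list"],
    ["gmake", "matrix"],
    ["aq", "status"],
]


def run_step(argv: list[str]) -> Optional[str]:
    """Run one command and return its stdout, or None if the tool is missing or fails."""
    if shutil.which(argv[0]) is None:
        return None
    done = subprocess.run(argv, capture_output=True, text=True)
    return done.stdout if done.returncode == 0 else None


def recover() -> dict[str, Optional[str]]:
    """Collect the output of every recovery step, keyed by its command line."""
    return {" ".join(argv): run_step(argv) for argv in RECOVERY_STEPS}
```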
5. Observer pattern
The observer (nexus) does not participate in implementation. It watches:
- `aq status`: gossip from the orchestrator
- `gmake logs`: R13 tags in access logs
- `git log`: progressive commits on the harness repo
The observer can intervene by updating the spec (the contract is mutable). The orchestrator re-fetches the spec before each language and adapts.
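The re-fetch step can be sketched as a loop that hashes the spec before each language and notes when the contract has changed. The function and parameter names here are hypothetical, and `fetch_spec` and `build` are passed in as callables because the document does not say how the orchestrator actually fetches the spec or builds an implementation.

```python
import hashlib
from typing import Callable, Iterable


def run_with_fresh_spec(
    languages: Iterable[str],
    fetch_spec: Callable[[], bytes],
    build: Callable[[str, bytes], None],
) -> list[str]:
    """Fetch the spec before each language; return the languages that first saw a changed spec."""
    changed_at = []
    last_digest = None
    for lang in languages:
        spec = fetch_spec()
        digest = hashlib.sha256(spec).hexdigest()
        if last_digest is not None and digest != last_digest:
            changed_at.append(lang)  # first language built against the revised contract
        last_digest = digest
        build(lang, spec)
    return changed_at
```

Languages built before a change are candidates for a re-run, since they were validated against the older contract.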
6. Gossip protocol (aq)
    # Orchestrator announces progress
    aq announce -c C-COMPLIANCE-HARNESS --claim "python passes" --phase proof
    aq announce -c C-COMPLIANCE-HARNESS --claim "blocked on rust TLS" --phase conjecture

    # Observer checks
    aq status --json | jq '.[] | select(.cid == "C-COMPLIANCE-HARNESS")'
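For an observer that prefers a script to `jq`, the same filter can be expressed in Python. The sketch assumes `aq status --json` emits a JSON array of objects carrying a `cid` field, which is what the `jq` expression above implies. The `claim` and `phase` fields in the sample mirror the announce flags and are a guess at the output shape, not documented behaviour.

```python
import json


def claims_for(status_json: str, cid: str) -> list[dict]:
    """Python equivalent of: jq '.[] | select(.cid == "<cid>")'."""
    return [entry for entry in json.loads(status_json) if entry.get("cid") == cid]


# Made-up sample output, for illustration only.
sample = """[
  {"cid": "C-COMPLIANCE-HARNESS", "claim": "python passes", "phase": "proof"},
  {"cid": "C-OTHER", "claim": "unrelated", "phase": "conjecture"}
]"""
for entry in claims_for(sample, "C-COMPLIANCE-HARNESS"):
    print(entry["phase"], entry["claim"])
```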
7. When to update the spec
The harness is a spec validator. If 3+ implementations make the same mistake, the fault lies with an ambiguous spec, not with the implementers. The observer updates the spec and the orchestrator re-fetches it on the next CPRR cycle.
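The 3+ rule can be checked mechanically once failures are recorded per language. In the sketch below, "the same mistake" is approximated as "failed the same requirement", which is an assumption. The input format and the function name are hypothetical.

```python
from collections import defaultdict

AMBIGUITY_THRESHOLD = 3  # "3+ implementations" from the rule above


def ambiguous_requirements(failures: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each requirement to the languages that failed it, keeping only those at or over the threshold.

    `failures` is a list of (language, requirement) pairs; repeated pairs count once.
    """
    by_req = defaultdict(set)
    for lang, req in failures:
        by_req[req].add(lang)
    return {
        req: sorted(langs)
        for req, langs in by_req.items()
        if len(langs) >= AMBIGUITY_THRESHOLD
    }
```

A requirement that appears in the result is a candidate for a spec clarification and does not call for three separate implementation fixes.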
8. Tools required
| Tool | Purpose | Required? |
|---|---|---|
| bd (beads) | Task tracking across sessions | Yes |
| aq | Gossip between orchestrator and observer | Nice to have |
| git notes | Experiment/failure log without polluting commits | Yes |
| gmake | Orchestration targets (test-all, logs, matrix) | Yes |
| ao (npm) | Compliance auditor on the observer side | Observer only |
9. Reference
- Walsh-Research Compliance Spec (v1.3)
- REPL-Driven Flight Tracking — the methodology
- walsh/compliance-harness — the implementation