Agentic Systems Research Plan: Q1 2026

Executive Summary

This plan outlines the Q1 2026 research focus: multi-agent coordination infrastructure. Our prior work on REPLs, issue tracking, communication queues, and token economics provides a foundation for addressing gaps in existing frameworks (LangGraph, CrewAI, Microsoft Agent Framework).

Problem Statement

Current multi-agent frameworks focus on orchestration but lack:

| Gap                 | Industry Status        | Our Approach                                   |
|---------------------|------------------------|------------------------------------------------|
| Work decomposition  | Manual or ad-hoc       | Git-native issue tracking with agent handoffs  |
| Agent communication | Framework-specific     | Universal file-based queues                    |
| State recovery      | Lost on crash/timeout  | Checkpoint/restore with git worktrees          |
| Resource allocation | Unlimited or hard caps | Token economy with pooling and rate limits     |
| Cost tracking       | External or missing    | Integrated per-agent metrics                   |

Prior Work Integration

Issue Tracking and Decomposition

Git-native issue tracking that travels with repositories:

  • Issues as contracts between agents
  • Priority-based work distribution
  • Dependency tracking for blocked work
  • Completion verification before handoff

Integration: Work items become first-class entities in multi-agent workflows. Agents claim, execute, and close issues with full audit trail.

Agent Communication Queues

File-based queue system for agent-to-agent communication:

  • JSON request/response protocol
  • Async processing with status tracking
  • Works with any terminal coding agent (TCA)
  • No vendor lock-in

Integration: Universal adapter layer between heterogeneous agents (Claude Code, Amp, Gemini CLI, Copilot). Intended as a simpler alternative to MCP for local multi-agent scenarios.

Session State and Exploration

REPL infrastructure with formal specifications:

  • Session state persistence
  • Token usage tracking
  • TLA+/Alloy specifications for correctness

Integration: Foundation for checkpoint/restore and parallel exploration paths.

Time Travel and Decision Recovery

Git worktree-based exploration branching:

                    ┌─── approach-A (worktree)
                    │
main ───────────────┼─── approach-B (worktree)
                    │
                    └─── approach-C (worktree)

Each path:

  • Isolated git worktree
  • Own agent sessions
  • Checkpoint/restore capability
  • Merge findings back to main

Integration: Agents explore alternatives without losing decision provenance. Failed approaches remain accessible for future reference.
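
As a rough illustration of the branching layout above, the sketch below builds one `git worktree add` command per approach. The sibling-directory naming (`<repo>-<approach>`) and the function names are assumptions made for this example, not part of the existing tooling; by default the commands are only returned, not executed.

```python
import subprocess
from pathlib import Path


def worktree_add_command(repo: Path, approach: str, base: str = "main") -> list[str]:
    """Build the git command that creates a branch plus worktree for one approach."""
    # Hypothetical layout: worktrees sit next to the repo, e.g. ../repo-approach-A
    target = repo.parent / f"{repo.name}-{approach}"
    return ["git", "-C", str(repo), "worktree", "add", "-b", approach, str(target), base]


def create_exploration_paths(
    repo: Path, approaches: list[str], dry_run: bool = True
) -> list[list[str]]:
    """Create one isolated worktree per approach; with dry_run only return the commands."""
    commands = [worktree_add_command(repo, name) for name in approaches]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    names = ["approach-A", "approach-B", "approach-C"]
    for cmd in create_exploration_paths(Path("/tmp/demo-repo"), names):
        print(" ".join(cmd))
```

Keeping each approach on its own branch means a failed exploration can stay in the repository history even after its worktree is removed.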

Token Economy

Mock economy for coordinating agent resource usage:

| Mechanism     | Purpose                              |
|---------------|--------------------------------------|
| Earning       | Commits, issue resolution → tokens   |
| Spending      | LLM inference costs tokens           |
| Pooling       | Team operations for expensive models |
| Rate limiting | Prevent runaway spending             |

Integration: Unified resource model across all agents. Cost-aware model selection (Claude Opus for complex reasoning, Haiku for simple tasks).
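
The four mechanisms can be sketched as a small in-memory wallet. This is a minimal model under stated assumptions, not the Guile Scheme prototype: the class names, the `window_cap` parameter, and the even split used for pooling are all hypothetical choices for illustration.

```python
from dataclasses import dataclass, field


class InsufficientTokens(Exception):
    """Raised when a wallet cannot cover a spend."""


class RateLimited(Exception):
    """Raised when a spend would exceed the per-window cap."""


@dataclass
class Wallet:
    agent: str
    balance: int = 0
    window_cap: int = 100      # hypothetical spending cap per rate-limit window
    spent_in_window: int = 0
    ledger: list = field(default_factory=list)

    def earn(self, amount: int, reason: str) -> None:
        """Credit tokens, e.g. for a commit or a closed issue."""
        self.balance += amount
        self.ledger.append(("earn", amount, reason))

    def check_spend(self, amount: int) -> None:
        """Raise if the spend would break the rate limit or overdraw the wallet."""
        if self.spent_in_window + amount > self.window_cap:
            raise RateLimited(f"{self.agent}: window cap {self.window_cap} exceeded")
        if amount > self.balance:
            raise InsufficientTokens(f"{self.agent}: balance {self.balance} < {amount}")

    def spend(self, amount: int, reason: str) -> None:
        """Debit tokens, e.g. for an inference call."""
        self.check_spend(amount)
        self.balance -= amount
        self.spent_in_window += amount
        self.ledger.append(("spend", amount, reason))


def pool_spend(wallets: list[Wallet], amount: int, reason: str) -> None:
    """Split one expensive operation across wallets as evenly as possible."""
    share, remainder = divmod(amount, len(wallets))
    shares = [share + (1 if i < remainder else 0) for i in range(len(wallets))]
    # Validate every share first so a failure leaves all wallets untouched.
    for wallet, part in zip(wallets, shares):
        wallet.check_spend(part)
    for wallet, part in zip(wallets, shares):
        wallet.spend(part, reason)
```

Checking all shares before debiting any of them keeps a pooled spend all-or-nothing, which is one simple way to avoid partial debits.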

Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                         Q1 2026 INTEGRATION ARCHITECTURE                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│                          ┌──────────────────────┐                           │
│                          │   ISSUE TRACKER      │                           │
│                          │   (Decomposition)    │                           │
│                          └──────────┬───────────┘                           │
│                                     │ creates work items                    │
│                          ┌──────────▼───────────┐                           │
│                          │   TOKEN EXCHANGE     │                           │
│                          │   (Resource Alloc)   │                           │
│                          └──────────┬───────────┘                           │
│                                     │ funds agent work                      │
│          ┌──────────────────────────┼──────────────────────────┐            │
│          ▼                          ▼                          ▼            │
│  ┌───────────────┐       ┌───────────────┐       ┌───────────────┐          │
│  │ Claude Code   │       │ Amp           │       │ Gemini CLI    │          │
│  │ (Primary)     │       │ (Search)      │       │ (Review)      │          │
│  └───────┬───────┘       └───────┬───────┘       └───────┬───────┘          │
│          │                       │                       │                  │
│          └───────────────────────┼───────────────────────┘                  │
│                                  │ via communication queues                 │
│                          ┌───────▼───────┐                                  │
│                          │ QUEUE SYSTEM  │                                  │
│                          │ (Agent IPC)   │                                  │
│                          └───────┬───────┘                                  │
│                                  │ results + state                          │
│                          ┌───────▼───────┐                                  │
│                          │ CHECKPOINTS   │                                  │
│                          │ (Recovery)    │                                  │
│                          └───────────────┘                                  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Q1 2026 Deliverables

P0: Critical Path

| Deliverable               | Description                                       | Dependencies                  |
|---------------------------|---------------------------------------------------|-------------------------------|
| Issue ↔ Token bridge      | Close issue → earn tokens; claim → reserve        | Issue tracker, token exchange |
| Queue adapter             | Universal TCA interface via file queues           | Queue system                  |
| Queue protocol versioning | Semver for JSON schema, backwards compatibility   | Queue adapter                 |
| Queue concurrency         | File locking, atomic writes, conflict resolution  | Queue adapter                 |
| Queue durability          | Append-only logs, checksums, corruption detection | Queue adapter                 |
| Failure recovery protocol | Agent crash recovery, issue state rollback        | Issue tracker, queues         |
| Structured logging        | JSON logs from all components for post-mortem     | All                           |
| Cost metrics              | Per-agent token usage logging                     | Token exchange                |

Queue Protocol Versioning

Semantic versioning for queue message schema:

{
  "protocol_version": "1.0.0",
  "id": "req_001",
  "type": "eval",
  "content": "..."
}

| Version Bump  | When                                   |
|---------------|----------------------------------------|
| Major (2.0.0) | Breaking changes to required fields    |
| Minor (1.1.0) | New optional fields, new message types |
| Patch (1.0.1) | Bug fixes, clarifications              |

Receivers MUST accept any message whose major version matches their own and MUST ignore unknown fields, which keeps older receivers forward compatible with minor-version additions.
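
A receiver-side check for this rule might look like the sketch below. The `SUPPORTED` constant, the `KNOWN_FIELDS` set, and the function names are assumptions for illustration; only the four fields from the sample message are treated as known.

```python
SUPPORTED = "1.0.0"  # protocol version this hypothetical receiver implements
KNOWN_FIELDS = {"protocol_version", "id", "type", "content"}


def parse_version(version: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def accept(message: dict, supported: str = SUPPORTED) -> dict:
    """Return the message restricted to known fields, or raise on a major mismatch."""
    theirs = parse_version(message["protocol_version"])
    ours = parse_version(supported)
    if theirs[0] != ours[0]:
        raise ValueError(
            f"incompatible major version: got {theirs[0]}, support {ours[0]}"
        )
    # Unknown fields are dropped silently rather than rejected.
    return {key: value for key, value in message.items() if key in KNOWN_FIELDS}


if __name__ == "__main__":
    newer = {
        "protocol_version": "1.3.0",
        "id": "req_001",
        "type": "eval",
        "content": "...",
        "priority": 5,  # field from a later minor version, unknown to this receiver
    }
    print(accept(newer))
```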

Queue Concurrency

File-based queues need explicit coordination when multiple agents operate simultaneously:

┌─────────────────────────────────────────────────────────────────────────┐
│                      CONCURRENCY STRATEGY                                │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  WRITE OPERATIONS                                                       │
│  ────────────────                                                       │
│  1. Atomic write via temp file + rename                                 │
│     write(tmp) → fsync(tmp) → rename(tmp, target)                       │
│                                                                         │
│  2. Lock file for multi-step operations                                 │
│     flock(queue.lock) → read → modify → write → unlock                  │
│                                                                         │
│  3. Unique message IDs (ULID or UUID v7, sortable)                      │
│     Enables idempotent processing, dedup on replay                      │
│                                                                         │
│  READ OPERATIONS                                                        │
│  ───────────────                                                        │
│  1. Move-to-processing pattern                                          │
│     requests/ → processing/ → responses/                                │
│                                                                         │
│  2. Claim timeout (default 5min)                                        │
│     processing/ files older than timeout → back to requests/            │
│                                                                         │
│  CONFLICT RESOLUTION                                                    │
│  ───────────────────                                                    │
│  1. First-write-wins for claims (rename fails if exists)                │
│  2. Append-only for logs (no conflicts possible)                        │
│  3. Last-write-wins for status updates (version field)                  │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
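
The write and claim steps above can be sketched with the standard library alone. Function names and the dot-prefixed temp file convention are assumptions for this example. One caveat worth stating: on POSIX, a rename silently replaces an existing destination, so in this sketch first-claim-wins comes from the source file disappearing (the losing agent gets `FileNotFoundError`) rather than from the destination already existing.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def atomic_write_json(target: Path, message: dict) -> None:
    """Write a message via temp file + fsync + rename so readers never see a partial file."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # Temp file lives in the same directory so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(message, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def claim(queue_root: Path, message_id: str) -> Optional[Path]:
    """Move requests/<id>.json to processing/<id>.json; None means another agent won."""
    src = queue_root / "requests" / f"{message_id}.json"
    dst = queue_root / "processing" / f"{message_id}.json"
    dst.parent.mkdir(parents=True, exist_ok=True)
    try:
        os.rename(src, dst)
    except FileNotFoundError:
        return None  # the request was already claimed (or never existed)
    return dst
```

Consumers listing `requests/` would need to skip the dot-prefixed temp files that exist briefly during a write.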

Queue Durability

Protect against corruption and enable recovery:

MESSAGE FORMAT (append-only log)
────────────────────────────────
{
  "id": "01ARZ3NDEKTSV4RRFFQ69G5FAV",
  "protocol_version": "1.0.0",
  "timestamp": "2026-01-15T10:30:00Z",
  "checksum": "sha256:a1b2c3...",
  "type": "request",
  "payload": { ... }
}

INTEGRITY CHECKS
────────────────
1. Per-message SHA256 checksum
2. Append-only: messages never modified, only appended
3. Sequence numbers for gap detection
4. Daily rotation with archived checksums

CORRUPTION RECOVERY
───────────────────
1. Detect: checksum mismatch or parse failure
2. Isolate: move corrupt file to quarantine/
3. Rebuild: replay from git history (queues are committed)
4. Alert: notify operator via webhook/email
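
The detect step can be sketched as follows. The plan does not say which bytes the checksum covers, so this example assumes it is computed over the payload serialized as canonical JSON (sorted keys, compact separators); the function names are likewise hypothetical.

```python
import hashlib
import json


def payload_checksum(payload: dict) -> str:
    """SHA256 over a canonical JSON encoding of the payload (an assumed convention)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(canonical).hexdigest()


def seal(message: dict) -> dict:
    """Return a copy of the message with its checksum field filled in."""
    return {**message, "checksum": payload_checksum(message["payload"])}


def scan_log(lines: list[str]) -> tuple[list[dict], list[int]]:
    """Split log lines into verified messages and 1-based numbers of corrupt lines."""
    good, bad = [], []
    for number, line in enumerate(lines, start=1):
        try:
            message = json.loads(line)
            ok = message["checksum"] == payload_checksum(message["payload"])
        except (json.JSONDecodeError, KeyError, TypeError):
            ok = False  # parse failure or missing fields counts as corruption
        if ok:
            good.append(message)
        else:
            bad.append(number)
    return good, bad
```

A caller could then move any file with a non-empty `bad` list to quarantine/ before attempting a rebuild.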

File structure with durability:

queues/
├── requests/
│   └── 01ARZ3NDEKTSV4RRFFQ69G5FAV.json
├── processing/
├── responses/
├── archive/
│   └── 2026-01-14.jsonl.gz        # Daily rotation
├── quarantine/                     # Corrupt files
├── queue.lock                      # flock target
└── checksums.sha256                # Integrity manifest

Failure Recovery Protocol

What happens when an agent crashes mid-issue:

┌─────────────────────────────────────────────────────────────────────────┐
│                      AGENT FAILURE SCENARIOS                             │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  1. CRASH DURING CLAIM                                                  │
│     Issue: in_progress, no work done                                    │
│     Recovery: Timeout → auto-release after 30min                        │
│     Tokens: Reserved tokens returned to agent wallet                    │
│                                                                         │
│  2. CRASH DURING WORK                                                   │
│     Issue: in_progress, partial commits exist                           │
│     Recovery: Checkpoint restore OR new agent picks up                  │
│     Tokens: Partial credit based on commits/progress                    │
│                                                                         │
│  3. CRASH DURING HANDOFF                                                │
│     Issue: Queue message sent, not ACKd                                 │
│     Recovery: Message replay from queue (at-least-once delivery)        │
│     Tokens: Deferred until receiving agent ACKs                         │
│                                                                         │
│  4. QUEUE CORRUPTION                                                    │
│     Issue: Messages lost or malformed                                   │
│     Recovery: Rebuild from git history (queues are committed)           │
│     Tokens: Reconcile from ledger                                       │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
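
Scenario 3 relies on at-least-once delivery, which means a receiver can see the same message twice after a replay. A minimal sketch of an idempotent receiver is shown below; the class name is hypothetical, and the in-memory `seen` set stands in for whatever durable record a real agent would need.

```python
from typing import Callable


class IdempotentConsumer:
    """Processes each message ID at most once, even if the queue replays it."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self.handler = handler
        self.seen: set[str] = set()  # assumption: would be persisted in practice

    def deliver(self, message: dict) -> bool:
        """Return True if the message was processed (ACK), False if it was a duplicate."""
        if message["id"] in self.seen:
            return False
        self.handler(message)
        # Recorded only after the handler succeeds, so a crash mid-handler
        # leaves the message eligible for replay.
        self.seen.add(message["id"])
        return True


if __name__ == "__main__":
    processed = []
    consumer = IdempotentConsumer(lambda m: processed.append(m["id"]))
    consumer.deliver({"id": "req_001"})
    consumer.deliver({"id": "req_001"})  # replayed after a missing ACK
    print(processed)
```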

Issue state machine with failure transitions:

                    timeout (30min)
         ┌──────────────────────────────┐
         │                              │
         ▼                              │
     ┌───────┐  claim   ┌─────────────┐ │  crash
     │ open  │─────────▶│ in_progress │─┘
     └───────┘          └──────┬──────┘
         ▲                     │
         │     abandon         │ complete
         └─────────────────────┼─────────────┐
                               ▼             │
                          ┌────────┐         │
                          │ closed │◀────────┘
                          └────────┘
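
One reading of the diagram, expressed as a transition table: a crash is not an event the tracker sees directly but surfaces as a timeout once the claim is older than 30 minutes. The event names and helper functions below are assumptions for this sketch.

```python
from datetime import datetime, timedelta, timezone

CLAIM_TIMEOUT = timedelta(minutes=30)

# (current state, event) -> next state
TRANSITIONS = {
    ("open", "claim"): "in_progress",
    ("in_progress", "complete"): "closed",
    ("in_progress", "abandon"): "open",
    ("in_progress", "timeout"): "open",
}


def transition(state: str, event: str) -> str:
    """Apply one event, rejecting anything the table does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} from {state!r}") from None


def release_if_stale(state: str, claimed_at: datetime, now: datetime) -> str:
    """Return the issue to 'open' when an in_progress claim has exceeded the timeout."""
    if state == "in_progress" and now - claimed_at > CLAIM_TIMEOUT:
        return transition(state, "timeout")
    return state


if __name__ == "__main__":
    claimed = datetime(2026, 1, 15, 10, 0, tzinfo=timezone.utc)
    state = transition("open", "claim")
    print(release_if_stale(state, claimed, claimed + timedelta(minutes=31)))
```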

Recovery commands:

# Check for stale in_progress issues (>30min)
bd list --status=in_progress --stale

# Force release abandoned issue
bd release <issue-id> --reason="agent timeout"

# Replay failed queue messages
queue replay --failed --since="1h ago"

# Reconcile token ledger with issue history
token-exchange reconcile --dry-run

Failure Mode Catalog

Comprehensive failure scenarios with detection and mitigation:

| Failure            | Detection                 | Impact               | Mitigation            | Recovery Time |
|--------------------|---------------------------|----------------------|-----------------------|---------------|
| Agent timeout      | Heartbeat miss >30min     | Issue orphaned       | Auto-release to pool  | <1min         |
| Agent crash        | Process exit, no response | Partial work lost    | Checkpoint restore    | <5min         |
| Queue corruption   | Checksum mismatch         | Messages lost        | Rebuild from git      | <10min        |
| Queue deadlock     | Circular wait detected    | All agents blocked   | Forced release oldest | <1min         |
| Git merge conflict | bd sync fails             | State divergence     | Manual resolution     | Variable      |
| Token bankruptcy   | Balance < reserve         | Agent starved        | Emergency pool loan   | <1min         |
| Token double-spend | Ledger inconsistency      | Economic instability | Reconcile + rollback  | <5min         |
| Network partition  | Remote unreachable        | Sync blocked         | Local-only mode       | 0 (degraded)  |
| Disk full          | Write fails               | All ops blocked      | Alert + cleanup       | Variable      |
| Clock skew         | Timestamp anomaly         | Ordering wrong       | NTP sync + reorder    | <1min         |

Severity levels:

CRITICAL  - System unusable, all agents blocked
HIGH      - Major functionality impaired, some agents affected
MEDIUM    - Degraded performance, workarounds available
LOW       - Minor issues, self-healing

| Failure            | Severity | Auto-Recovery         |
|--------------------|----------|-----------------------|
| Agent timeout      | MEDIUM   | Yes                   |
| Agent crash        | MEDIUM   | Yes (with checkpoint) |
| Queue corruption   | HIGH     | Yes (from git)        |
| Queue deadlock     | CRITICAL | Yes (forced release)  |
| Git merge conflict | HIGH     | No (manual)           |
| Token bankruptcy   | MEDIUM   | Yes (pool loan)       |
| Token double-spend | CRITICAL | Yes (reconcile)       |
| Network partition  | LOW      | Yes (local mode)      |
| Disk full          | CRITICAL | No (manual)           |
| Clock skew         | LOW      | Yes (NTP)             |

Structured Logging

All components emit JSON logs for post-mortem analysis:

{
  "timestamp": "2026-01-15T10:30:45.123Z",
  "level": "INFO",
  "component": "queue",
  "event": "message_claimed",
  "trace_id": "tr_01ARZ3NDEK",
  "span_id": "sp_TSVRRF",
  "agent": "claude",
  "message_id": "01ARZ3NDEKTSV4RRFFQ69G5FAV",
  "duration_ms": 45,
  "metadata": {
    "queue_depth": 3,
    "processing_count": 1
  }
}

Log schema by component:

| Component      | Key Events                            | Trace Fields                  |
|----------------|---------------------------------------|-------------------------------|
| Queue          | claimed, completed, failed, timeout   | message_id, agent, duration_ms |
| Issue tracker  | created, claimed, closed, released    | issue_id, agent, reason       |
| Token exchange | earned, spent, pooled, rate_limited   | agent, amount, balance, model |
| Checkpoint     | saved, restored, pruned               | checkpoint_id, size_bytes     |
| Agent          | started, heartbeat, stopped, crashed  | agent, pid, exit_code         |

Log levels:

| Level | When                         | Retention |
|-------|------------------------------|-----------|
| ERROR | Failures requiring attention | 90 days   |
| WARN  | Anomalies, auto-recovered    | 30 days   |
| INFO  | Normal operations            | 7 days    |
| DEBUG | Detailed tracing             | 1 day     |

Correlation:

  • trace_id: Spans entire workflow (issue claim → close)
  • span_id: Individual operation within trace
  • parent_span_id: Links nested operations
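
An emitter that produces records in this shape could be sketched as below. The `log_event` helper, the random ID format, and printing to stdout are assumptions for illustration; the field names follow the sample record.

```python
import json
import secrets
from datetime import datetime, timezone
from typing import Optional


def new_id(prefix: str) -> str:
    """Generate a short random identifier such as 'sp_3fa9c1d2e4b7' (hypothetical format)."""
    return f"{prefix}_{secrets.token_hex(6)}"


def log_event(
    component: str,
    event: str,
    trace_id: str,
    level: str = "INFO",
    parent_span_id: Optional[str] = None,
    **fields,
) -> str:
    """Emit one JSON log line with a fresh span_id and return it."""
    now = datetime.now(timezone.utc)
    record = {
        "timestamp": now.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
        "level": level,
        "component": component,
        "event": event,
        "trace_id": trace_id,
        "span_id": new_id("sp"),
    }
    if parent_span_id is not None:
        record["parent_span_id"] = parent_span_id
    record.update(fields)  # component-specific trace fields, e.g. agent, message_id
    line = json.dumps(record)
    print(line)
    return line


if __name__ == "__main__":
    trace = new_id("tr")  # one trace per workflow, from issue claim to close
    log_event("queue", "message_claimed", trace, agent="claude", duration_ms=45)
```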

Query examples:

# Find all events for a failed workflow
jq 'select(.trace_id == "tr_01ARZ3NDEK")' logs/*.jsonl

# Error count for one agent today
jq -s '[.[] | select(.level == "ERROR" and .agent == "claude")] | length' logs/$(date +%Y-%m-%d).jsonl

# Slowest queue operations
jq -s 'map(select(.component == "queue" and .duration_ms != null)) | sort_by(.duration_ms) | reverse | .[0:10]' logs/*.jsonl

# Token spending by model
jq -s 'group_by(.metadata.model) | map({model: .[0].metadata.model, total: map(.amount) | add})' logs/token-*.jsonl

P1: Core Features

| Deliverable           | Description                              | Dependencies          |
|-----------------------|------------------------------------------|-----------------------|
| Queue introspection   | Debug stalled workflows, message tracing | Queue adapter         |
| Checkpoint protocol   | /checkpoint save/restore commands        | REPL infrastructure   |
| Exploration branching | Parallel worktrees per approach          | Git worktrees         |
| Agent handoff         | Structured work transfer between agents  | Issue tracker, queues |

Queue Introspection

Essential for debugging stalled workflows before the full dashboard exists:

# Queue status overview
queue status
# Output:
#   requests/: 3 pending (oldest: 2min ago)
#   processing/: 1 active (agent: claude, claimed: 45s ago)
#   responses/: 12 today
#   errors/: 0

# Trace a specific message through the system
queue trace <message-id>
# Output:
#   01ARZ3... created    2026-01-15T10:30:00Z
#   01ARZ3... claimed    2026-01-15T10:30:02Z  by claude
#   01ARZ3... completed  2026-01-15T10:30:45Z  duration: 43s

# Find stuck messages
queue stuck --threshold=5m
# Output:
#   processing/01ARZ3... claimed 8m ago by amp (likely stalled)

# Watch queue activity in real-time
queue watch
# Output:
#   [10:30:01] ← claude submitted req_001
#   [10:30:02] → amp claimed req_001
#   [10:30:45] ✓ amp completed req_001 (43s)

# Dump queue state for debugging
queue dump --format=json > queue-state.json
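
The `status` and `stuck` views can be approximated by inspecting the queue directories directly. This sketch assumes the claiming agent touches the file (for example with `os.utime`) when moving it into processing/, because a rename alone preserves the original modification time; the function names are hypothetical.

```python
import os
import time
from pathlib import Path
from typing import Optional

QUEUE_DIRS = ("requests", "processing", "responses")


def queue_status(root: Path) -> dict[str, int]:
    """Count the .json messages in each queue directory (missing directories count as 0)."""
    counts = {}
    for name in QUEUE_DIRS:
        directory = root / name
        counts[name] = len(list(directory.glob("*.json"))) if directory.is_dir() else 0
    return counts


def stuck(
    root: Path, threshold_seconds: float = 300, now: Optional[float] = None
) -> list[tuple[str, float]]:
    """List (filename, age in seconds) for processing/ files older than the threshold."""
    now = time.time() if now is None else now
    directory = root / "processing"
    if not directory.is_dir():
        return []
    result = []
    for path in sorted(directory.glob("*.json")):
        # Assumption: mtime was refreshed at claim time, so it approximates "claimed at".
        age = now - path.stat().st_mtime
        if age > threshold_seconds:
            result.append((path.name, age))
    return result
```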

P2: Integration

| Deliverable           | Description                                    | Dependencies  |
|-----------------------|------------------------------------------------|---------------|
| Formal specifications | TLA+/Alloy specs for queue protocol, published | Queue adapter |
| CLASSic metrics       | Cost, latency, accuracy, stability, security   | Cost metrics  |
| Multi-agent review    | Adversarial pass with different agents         | Queue adapter |
| Dashboard             | Real-time economy and work visualization       | All above     |

Formal Specifications

Publish TLA+/Alloy specs alongside implementation:

specs/
├── queue-protocol.tla      # Queue state machine, message ordering
├── token-exchange.tla      # Token invariants, no negative balances
├── agent-handoff.tla       # Issue state transitions, no orphans
├── agent-handoff.als       # Alloy model: no deadlock in handoff
├── concurrency.tla         # Lock-free operations, no deadlocks
└── README.md               # How to run TLC model checker

Key properties to verify:

| Property                   | Spec               | Tool  |
|----------------------------|--------------------|-------|
| Message ordering preserved | queue-protocol.tla | TLC   |
| No token double-spend      | token-exchange.tla | TLC   |
| Issues never orphaned      | agent-handoff.tla  | TLC   |
| Lock-free queue operations | concurrency.tla    | TLC   |
| No deadlock in handoff     | agent-handoff.als  | Alloy |

Benefits:

  • Catches edge cases before implementation
  • Executable documentation of invariants
  • Confidence in concurrent operations
  • Differentiator vs other frameworks

Research Questions

  1. What decomposition granularity optimizes agent effectiveness?
  2. How should token rewards align with actual value delivered?
  3. What checkpoint frequency balances recovery vs overhead?
  4. How do we measure coordination quality (not just individual performance)?
  5. What security model prevents malicious agent behavior in shared queues?

Success Criteria

| Metric                 | Target                 | Measurement                 |
|------------------------|------------------------|-----------------------------|
| Agent handoffs         | 10+ per day            | Issue tracker logs          |
| Cost per task          | -30% vs baseline       | Token exchange ledger       |
| Recovery time          | <5 min from checkpoint | Manual testing              |
| Parallel explorations  | 3+ concurrent          | Worktree count              |
| Framework independence | 3+ TCA types           | Queue adapter compatibility |

Timeline

| Week  | Focus                        | Deliverables                   |
|-------|------------------------------|--------------------------------|
| 1-2   | Issue ↔ Token bridge         | Integration scripts, hook setup |
| 3-4   | Queue adapter                | Universal TCA interface        |
| 5-6   | Checkpoint protocol          | Save/restore commands          |
| 7-8   | Exploration branching        | Worktree management            |
| 9-10  | Metrics and dashboard        | CLASSic integration            |
| 11-12 | Documentation and refinement | Team onboarding                |

Related External Research

Multi-Agent Orchestration

  • LangGraph: State graphs, supervisor patterns
  • CrewAI: Role-based teams (Manager/Worker/Researcher)
  • Microsoft Agent Framework: AutoGen + Semantic Kernel merger

Protocol Standards

  • MCP (Model Context Protocol): Anthropic's tool integration standard
  • A2A (Agent-to-Agent): Google's inter-agent protocol
  • Agentic AI Foundation: Linux Foundation governance

Evaluation

  • AgentBench: 8 interactive environments
  • GAIA: 466 real-world tasks
  • CLASSic: Enterprise dimensions (ICLR 2025)

Economics

  • ASI Alliance: Fetch.ai + SingularityNET + Ocean ($9.2B)
  • Agent Exchange (AEX): RTB-inspired auction framework

Appendix: Component Status

| Component           | Maturity   | Implementation            |
|---------------------|------------|---------------------------|
| Issue tracker       | Production | Git-native JSONL          |
| Token exchange      | Prototype  | Guile Scheme (~1100 LOC)  |
| Queue system        | Production | File-based JSON           |
| REPL infrastructure | Prototype  | ClojureScript             |
| Checkpoint system   | Conceptual | Git worktrees             |

Contact

Questions or suggestions: j@wal.sh

Author: Jason Walsh

Last Updated: 2026-01-04 23:33:52

build: 2026-01-05 22:52 | sha: 95e3ed2