Agentic Systems Research Plan: Q1 2026

Executive Summary
Problem Statement
Prior Work Integration
Architecture
Q1 2026 Deliverables
Research Questions
Success Criteria
Timeline
Related External Research
Appendix: Component Status
Contact

Executive Summary

This plan outlines the Q1 2026 research focus: multi-agent coordination infrastructure. Our prior work on REPLs, issue tracking, communication queues, and token economics provides a foundation for addressing gaps in existing frameworks (LangGraph, CrewAI, Microsoft Agent Framework).

Problem Statement

Current multi-agent frameworks focus on orchestration but lack:

Gap	Industry Status	Our Approach
Work decomposition	Manual or ad-hoc	Git-native issue tracking with agent handoffs
Agent communication	Framework-specific	Universal file-based queues
State recovery	Lost on crash/timeout	Checkpoint/restore with git worktrees
Resource allocation	Unlimited or hard caps	Token economy with pooling and rate limits
Cost tracking	External or missing	Integrated per-agent metrics

Prior Work Integration

Issue Tracking and Decomposition

Git-native issue tracking that travels with repositories:

Issues as contracts between agents
Priority-based work distribution
Dependency tracking for blocked work
Completion verification before handoff

Integration: Work items become first-class entities in multi-agent workflows. Agents claim, execute, and close issues with full audit trail.
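
As a rough illustration of priority-based distribution with dependency tracking, the sketch below picks the next issue an agent could claim. The `Issue` fields and the `next_claimable` helper are hypothetical names, not the tracker's actual schema, and "lower number means more urgent" is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Issue:
    # Hypothetical fields; the real tracker schema may differ.
    id: str
    priority: int  # assumption: lower number means more urgent
    status: str = "open"  # "open", "in_progress", or "closed"
    depends_on: list = field(default_factory=list)


def next_claimable(issues: list) -> Optional[Issue]:
    """Return the most urgent open issue whose dependencies are all closed."""
    closed = {i.id for i in issues if i.status == "closed"}
    ready = [
        i for i in issues
        if i.status == "open" and all(dep in closed for dep in i.depends_on)
    ]
    return min(ready, key=lambda i: i.priority, default=None)


issues = [
    Issue("bd-1", priority=2, status="closed"),
    Issue("bd-2", priority=0, depends_on=["bd-3"]),  # blocked until bd-3 closes
    Issue("bd-3", priority=1, depends_on=["bd-1"]),  # ready: bd-1 is closed
]
picked = next_claimable(issues)
print(picked.id)
```

Here `bd-2` is the most urgent item but stays blocked, so the ready item `bd-3` is selected instead.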

Agent Communication Queues

File-based queue system for agent-to-agent communication:

JSON request/response protocol
Async processing with status tracking
Works with any terminal coding agent (TCA)
No vendor lock-in

Integration: Universal adapter layer between heterogeneous agents (Claude Code, Amp, Gemini CLI, Copilot). For local multi-agent scenarios this is intended to be simpler than MCP: agents only need read/write access to a shared directory.

Session State and Exploration

REPL infrastructure with formal specifications:

Session state persistence
Token usage tracking
TLA+/Alloy specifications for correctness

Integration: Foundation for checkpoint/restore and parallel exploration paths.

Time Travel and Decision Recovery

Git worktree-based exploration branching:

                    ┌─── approach-A (worktree)
                    │
main ───────────────┼─── approach-B (worktree)
                    │
                    └─── approach-C (worktree)

Each path:
- Isolated git worktree
- Own agent sessions
- Checkpoint/restore capability
- Merge findings back to main

Integration: Agents explore alternatives without losing decision provenance. Failed approaches remain accessible for future reference.
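
One way the branching above might be scripted is to generate a `git worktree add` command per approach. The sketch only builds the argument lists and does not run git; the `explore/` branch prefix and the sibling-directory layout are assumptions made for illustration.

```python
from pathlib import PurePosixPath


def worktree_add_command(repo: PurePosixPath, approach: str, base: str = "main") -> list:
    """Build the `git worktree add` invocation for one exploration path."""
    branch = f"explore/{approach}"                   # hypothetical branch naming
    path = repo.parent / f"{repo.name}-{approach}"   # hypothetical sibling directory
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


repo = PurePosixPath("/work/project")  # placeholder repository location
commands = [
    worktree_add_command(repo, name)
    for name in ("approach-A", "approach-B", "approach-C")
]
for argv in commands:
    print(" ".join(argv))
```

Each command creates a new branch from `main` checked out in its own directory, so agent sessions on different approaches never share a working tree.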

Token Economy

Mock economy for coordinating agent resource usage:

Mechanism	Purpose
Earning	Commits, issue resolution → tokens
Spending	LLM inference costs tokens
Pooling	Team operations for expensive models
Rate limiting	Prevent runaway spending

Integration: Unified resource model across all agents. Cost-aware model selection (Claude Opus for complex reasoning, Haiku for simple tasks).
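
To make the earning, spending, and rate-limiting mechanisms concrete, here is a minimal per-agent wallet sketch. It is not the Guile Scheme token exchange; the `Wallet` class, its field names, and the sliding-window limit are assumptions chosen for illustration, and pooling is left out.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Wallet:
    # Hypothetical per-agent wallet; names and limits are illustrative.
    balance: int = 0
    max_spend_per_window: int = 100  # rate limit: tokens per window
    window_seconds: float = 60.0
    _spends: list = field(default_factory=list)  # (timestamp, amount) pairs

    def earn(self, amount: int) -> None:
        """Credit tokens, e.g. for a commit or a closed issue."""
        self.balance += amount

    def spend(self, amount: int, now: Optional[float] = None) -> bool:
        """Debit tokens for inference unless the balance or rate limit forbids it."""
        now = time.monotonic() if now is None else now
        recent = [(t, a) for t, a in self._spends if now - t < self.window_seconds]
        spent_recently = sum(a for _, a in recent)
        over_limit = spent_recently + amount > self.max_spend_per_window
        if amount > self.balance or over_limit:
            return False
        self._spends = recent + [(now, amount)]
        self.balance -= amount
        return True


wallet = Wallet(max_spend_per_window=50)
wallet.earn(120)
ok_first = wallet.spend(40, now=0.0)    # within balance and limit
ok_second = wallet.spend(20, now=10.0)  # 40 + 20 would exceed 50 in this window
ok_later = wallet.spend(20, now=61.0)   # the first spend has aged out of the window
```

The second call is refused by the rate limit even though the balance would cover it, which is the "prevent runaway spending" behavior from the table.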

Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                         Q1 2026 INTEGRATION ARCHITECTURE                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│                          ┌──────────────────────┐                           │
│                          │   ISSUE TRACKER      │                           │
│                          │   (Decomposition)    │                           │
│                          └──────────┬───────────┘                           │
│                                     │ creates work items                    │
│                          ┌──────────▼───────────┐                           │
│                          │   TOKEN EXCHANGE     │                           │
│                          │   (Resource Alloc)   │                           │
│                          └──────────┬───────────┘                           │
│                                     │ funds agent work                      │
│          ┌──────────────────────────┼──────────────────────────┐            │
│          ▼                          ▼                          ▼            │
│  ┌───────────────┐       ┌───────────────┐       ┌───────────────┐          │
│  │ Claude Code   │       │ Amp           │       │ Gemini CLI    │          │
│  │ (Primary)     │       │ (Search)      │       │ (Review)      │          │
│  └───────┬───────┘       └───────┬───────┘       └───────┬───────┘          │
│          │                       │                       │                  │
│          └───────────────────────┼───────────────────────┘                  │
│                                  │ via communication queues                 │
│                          ┌───────▼───────┐                                  │
│                          │ QUEUE SYSTEM  │                                  │
│                          │ (Agent IPC)   │                                  │
│                          └───────┬───────┘                                  │
│                                  │ results + state                          │
│                          ┌───────▼───────┐                                  │
│                          │ CHECKPOINTS   │                                  │
│                          │ (Recovery)    │                                  │
│                          └───────────────┘                                  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Q1 2026 Deliverables

P0: Critical Path

Deliverable	Description	Dependencies
Issue ↔ Token bridge	Close issue → earn tokens; claim → reserve	Issue tracker, token exchange
Queue adapter	Universal TCA interface via file queues	Queue system
Queue protocol versioning	Semver for JSON schema, backwards compatibility	Queue adapter
Queue concurrency	File locking, atomic writes, conflict resolution	Queue adapter
Queue durability	Append-only logs, checksums, corruption detection	Queue adapter
Failure recovery protocol	Agent crash recovery, issue state rollback	Issue tracker, queues
Structured logging	JSON logs from all components for post-mortem	All
Cost metrics	Per-agent token usage logging	Token exchange

Queue Protocol Versioning

Semantic versioning for queue message schema:

{
  "protocol_version": "1.0.0",
  "id": "req_001",
  "type": "eval",
  "content": "..."
}

Version Bump	When
Major (2.0.0)	Breaking changes to required fields
Minor (1.1.0)	New optional fields, new message types
Patch (1.0.1)	Bug fixes, clarifications

Receivers MUST accept messages with the same major version. Unknown fields are ignored (forward compatibility).
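
The receiver rule can be sketched as follows. `SUPPORTED_VERSION`, `KNOWN_FIELDS`, and `accept` are hypothetical names; the field set is taken from the sample message above, and rejecting other major versions with an exception is one possible policy, not a stated requirement.

```python
SUPPORTED_VERSION = "1.0.0"
KNOWN_FIELDS = {"protocol_version", "id", "type", "content"}


def parse_version(version: str) -> tuple:
    """Split "MAJOR.MINOR.PATCH" into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def accept(message: dict) -> dict:
    """Check the major version, then drop fields this receiver does not know."""
    theirs = parse_version(message["protocol_version"])
    ours = parse_version(SUPPORTED_VERSION)
    if theirs[0] != ours[0]:
        raise ValueError(f"incompatible major version: {message['protocol_version']}")
    return {k: v for k, v in message.items() if k in KNOWN_FIELDS}


newer = {
    "protocol_version": "1.1.0",
    "id": "req_001",
    "type": "eval",
    "content": "...",
    "priority": 3,  # optional field added in a hypothetical 1.1.0
}
accepted = accept(newer)
print(sorted(accepted))
```

A 1.0.0 receiver therefore processes a 1.1.0 message by ignoring `priority`, while a 2.0.0 message is refused outright.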

Queue Concurrency

File-based queues need explicit coordination when multiple agents operate simultaneously:

┌─────────────────────────────────────────────────────────────────────────┐
│                      CONCURRENCY STRATEGY                                │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  WRITE OPERATIONS                                                       │
│  ────────────────                                                       │
│  1. Atomic write via temp file + rename                                 │
│     write(tmp) → fsync(tmp) → rename(tmp, target)                       │
│                                                                         │
│  2. Lock file for multi-step operations                                 │
│     flock(queue.lock) → read → modify → write → unlock                  │
│                                                                         │
│  3. Unique message IDs (ULID or UUID v7 for time-ordering)              │
│     Enables idempotent processing, dedup on replay                      │
│                                                                         │
│  READ OPERATIONS                                                        │
│  ───────────────                                                        │
│  1. Move-to-processing pattern                                          │
│     requests/ → processing/ → responses/                                │
│                                                                         │
│  2. Claim timeout (default 5min)                                        │
│     processing/ files older than timeout → back to requests/            │
│                                                                         │
│  CONFLICT RESOLUTION                                                    │
│  ───────────────────                                                    │
│  1. First-write-wins for claims (second rename fails: source gone)      │
│  2. Append-only for logs (no conflicts possible)                        │
│  3. Last-write-wins for status updates (version field)                  │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
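
A minimal sketch of the atomic write and the claim step, assuming a POSIX filesystem and the `requests/` and `processing/` directories named in the diagram. The function names are hypothetical. Note that a second claimant loses because the source file is already gone when its rename runs.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def atomic_write_json(target: Path, message: dict) -> None:
    """Write to a temp file in the same directory, fsync, then rename into place."""
    target.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(message, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, target)  # atomic on POSIX within one filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def claim(queue_root: Path, message_id: str) -> Optional[Path]:
    """Move a request into processing/; only one caller can win the rename."""
    source = queue_root / "requests" / f"{message_id}.json"
    dest = queue_root / "processing" / f"{message_id}.json"
    dest.parent.mkdir(parents=True, exist_ok=True)
    try:
        os.rename(source, dest)
    except FileNotFoundError:
        return None  # another agent already moved the request
    return dest


root = Path(tempfile.mkdtemp())
atomic_write_json(root / "requests" / "req_001.json", {"id": "req_001", "type": "eval"})
first = claim(root, "req_001")   # wins the claim
second = claim(root, "req_001")  # loses: the request is no longer in requests/
```

Writing the temp file into the target's own directory matters, because a rename across filesystems is not atomic.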

Queue Durability

Protect against corruption and enable recovery:

MESSAGE FORMAT (append-only log)
────────────────────────────────
{
  "id": "01ARZ3NDEKTSV4RRFFQ69G5FAV",
  "protocol_version": "1.0.0",
  "timestamp": "2026-01-15T10:30:00Z",
  "checksum": "sha256:a1b2c3...",
  "type": "request",
  "payload": { ... }
}

INTEGRITY CHECKS
────────────────
1. Per-message SHA256 checksum
2. Append-only: messages never modified, only appended
3. Sequence numbers for gap detection
4. Daily rotation with archived checksums

CORRUPTION RECOVERY
───────────────────
1. Detect: checksum mismatch or parse failure
2. Isolate: move corrupt file to quarantine/
3. Rebuild: replay from git history (queues are committed)
4. Alert: notify operator via webhook/email
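
The detect and isolate steps might look like the sketch below. What exactly the checksum covers is not specified above, so hashing a canonical JSON encoding of `payload` (sorted keys, compact separators) is an assumption, as are the function names.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def payload_checksum(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding (assumption: sorted keys, no spaces)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_intact(path: Path) -> bool:
    """Return False on parse failure or checksum mismatch."""
    try:
        message = json.loads(path.read_text())
        return message["checksum"] == payload_checksum(message["payload"])
    except (ValueError, KeyError, TypeError):
        return False


def quarantine_if_corrupt(path: Path, quarantine_dir: Path) -> bool:
    """Move a corrupt file aside; returns True when the file was quarantined."""
    if is_intact(path):
        return False
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(path), str(quarantine_dir / path.name))
    return True


root = Path(tempfile.mkdtemp())
payload = {"op": "eval", "content": "..."}
good = root / "good.json"
good.write_text(json.dumps({"checksum": payload_checksum(payload), "payload": payload}))
bad = root / "bad.json"
bad.write_text('{"checksum": "sha256:deadbeef", "payload": {"op": "eval"}}')

moved_good = quarantine_if_corrupt(good, root / "quarantine")
moved_bad = quarantine_if_corrupt(bad, root / "quarantine")
```

Rebuild and alert are omitted here since they depend on the repository layout and the operator's notification channel.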

File structure with durability:

queues/
├── requests/
│   └── 01ARZ3NDEKTSV4RRFFQ69G5FAV.json
├── processing/
├── responses/
├── archive/
│   └── 2026-01-14.jsonl.gz        # Daily rotation
├── quarantine/                     # Corrupt files
├── queue.lock                      # flock target
└── checksums.sha256                # Integrity manifest

Failure Recovery Protocol

What happens when an agent crashes mid-issue:

┌─────────────────────────────────────────────────────────────────────────┐
│                      AGENT FAILURE SCENARIOS                             │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  1. CRASH DURING CLAIM                                                  │
│     Issue: in_progress, no work done                                    │
│     Recovery: Timeout → auto-release after 30min                        │
│     Tokens: Reserved tokens returned to agent wallet                    │
│                                                                         │
│  2. CRASH DURING WORK                                                   │
│     Issue: in_progress, partial commits exist                           │
│     Recovery: Checkpoint restore OR new agent picks up                  │
│     Tokens: Partial credit based on commits/progress                    │
│                                                                         │
│  3. CRASH DURING HANDOFF                                                │
│     Issue: Queue message sent, not ACKd                                 │
│     Recovery: Message replay from queue (at-least-once delivery)        │
│     Tokens: Deferred until receiving agent ACKs                         │
│                                                                         │
│  4. QUEUE CORRUPTION                                                    │
│     Issue: Messages lost or malformed                                   │
│     Recovery: Rebuild from git history (queues are committed)           │
│     Tokens: Reconcile from ledger                                       │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
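
Scenario 3 relies on at-least-once delivery, which means a receiver can see the same message twice after a lost ACK. A hedged sketch of the receiving side, keyed on message ID, is shown below; `IdempotentConsumer` is a hypothetical helper and keeps its results in memory, whereas a real receiver would need to persist them.

```python
from typing import Callable


class IdempotentConsumer:
    """Runs a handler at most once per message ID, even when messages are replayed."""

    def __init__(self, handler: Callable[[dict], str]) -> None:
        self._handler = handler
        self._results: dict = {}  # in-memory only; illustrative

    def deliver(self, message: dict) -> str:
        message_id = message["id"]
        if message_id not in self._results:
            self._results[message_id] = self._handler(message)
        return self._results[message_id]  # returned again on replay, as the ACK


calls = []


def handle(message: dict) -> str:
    calls.append(message["id"])
    return f"done:{message['id']}"


consumer = IdempotentConsumer(handle)
msg = {"id": "01ARZ3NDEKTSV4RRFFQ69G5FAV", "type": "request"}
first_ack = consumer.deliver(msg)
replayed_ack = consumer.deliver(msg)  # same message replayed after a lost ACK
```

The handler runs once, and the replay simply returns the stored result, so deferred token credit is not paid twice.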

Issue state machine with failure transitions:

                    timeout (30min)
         ┌──────────────────────────────┐
         │                              │
         ▼                              │
     ┌───────┐  claim   ┌─────────────┐ │  crash
     │ open  │─────────▶│ in_progress │─┘
     └───────┘          └──────┬──────┘
         ▲                     │
         │     abandon         │ complete
         └─────────────────────┼─────────────┐
                               ▼             │
                          ┌────────┐         │
                          │ closed │◀────────┘
                          └────────┘
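
The state machine can be written as a transition table. This is one reading of the diagram: a crashed agent is treated as being released by the 30min timeout, and the event names are illustrative.

```python
# Transition table read off the diagram; event names are illustrative.
TRANSITIONS = {
    ("open", "claim"): "in_progress",
    ("in_progress", "complete"): "closed",
    ("in_progress", "abandon"): "open",
    ("in_progress", "timeout"): "open",  # crashed agents are released by the timeout
}


def step(state: str, event: str) -> str:
    """Apply one event, rejecting transitions the diagram does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} from {state!r}") from None


def replay(events: list, state: str = "open") -> str:
    """Fold a sequence of events over the state machine."""
    for event in events:
        state = step(state, event)
    return state


# An agent claims, crashes and times out; a second agent claims and completes.
final = replay(["claim", "timeout", "claim", "complete"])
print(final)
```

Keeping the table explicit makes it easy to check that `closed` has no outgoing transitions and that every failure path returns the issue to `open`.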

Recovery commands:

# Check for stale in_progress issues (>30min)
bd list --status=in_progress --stale

# Force release abandoned issue
bd release <issue-id> --reason="agent timeout"

# Replay failed queue messages
queue replay --failed --since="1h ago"

# Reconcile token ledger with issue history
token-exchange reconcile --dry-run

Failure Mode Catalog

Comprehensive failure scenarios with detection and mitigation:

Failure	Detection	Impact	Mitigation	Recovery Time
Agent timeout	Heartbeat miss >30min	Issue orphaned	Auto-release to pool	<1min
Agent crash	Process exit, no response	Partial work lost	Checkpoint restore	<5min
Queue corruption	Checksum mismatch	Messages lost	Rebuild from git	<10min
Queue deadlock	Circular wait detected	All agents blocked	Forced release oldest	<1min
Git merge conflict	bd sync fails	State divergence	Manual resolution	Variable
Token bankruptcy	Balance < reserve	Agent starved	Emergency pool loan	<1min
Token double-spend	Ledger inconsistency	Economic instability	Reconcile + rollback	<5min
Network partition	Remote unreachable	Sync blocked	Local-only mode	0 (degraded)
Disk full	Write fails	All ops blocked	Alert + cleanup	Variable
Clock skew	Timestamp anomaly	Ordering wrong	NTP sync + reorder	<1min

Severity levels:

CRITICAL  - System unusable, all agents blocked
HIGH      - Major functionality impaired, some agents affected
MEDIUM    - Degraded performance, workarounds available
LOW       - Minor issues, self-healing

Failure	Severity	Auto-Recovery
Agent timeout	MEDIUM	Yes
Agent crash	MEDIUM	Yes (with checkpoint)
Queue corruption	HIGH	Yes (from git)
Queue deadlock	CRITICAL	Yes (forced release)
Git merge conflict	HIGH	No (manual)
Token bankruptcy	MEDIUM	Yes (pool loan)
Token double-spend	CRITICAL	Yes (reconcile)
Network partition	LOW	Yes (local mode)
Disk full	CRITICAL	No (manual)
Clock skew	LOW	Yes (NTP)

Structured Logging

All components emit JSON logs for post-mortem analysis:

{
  "timestamp": "2026-01-15T10:30:45.123Z",
  "level": "INFO",
  "component": "queue",
  "event": "message_claimed",
  "trace_id": "tr_01ARZ3NDEK",
  "span_id": "sp_TSVRRF",
  "agent": "claude",
  "message_id": "01ARZ3NDEKTSV4RRFFQ69G5FAV",
  "duration_ms": 45,
  "metadata": {
    "queue_depth": 3,
    "processing_count": 1
  }
}

Log schema by component:

Component	Key Events	Trace Fields
Queue	claimed, completed, failed, timeout	message_id, agent, duration_ms
Issue tracker	created, claimed, closed, released	issue_id, agent, reason
Token exchange	earned, spent, pooled, rate_limited	agent, amount, balance, model
Checkpoint	saved, restored, pruned	checkpoint_id, size_bytes
Agent	started, heartbeat, stopped, crashed	agent, pid, exit_code

Log levels:

Level	When	Retention
ERROR	Failures requiring attention	90 days
WARN	Anomalies, auto-recovered	30 days
INFO	Normal operations	7 days
DEBUG	Detailed tracing	1 day

Correlation:

trace_id: Spans entire workflow (issue claim → close)
span_id: Individual operation within trace
parent_span_id: Links nested operations
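
A small emitter shows how the three correlation fields fit together. The `log_event` and `new_id` helpers and the random ID format are assumptions; only the field names come from the log example above.

```python
import json
import secrets
from datetime import datetime, timezone
from typing import Any, Optional


def new_id(prefix: str) -> str:
    """Random ID with a prefix such as "tr" or "sp" (format is an assumption)."""
    return f"{prefix}_{secrets.token_hex(4)}"


def log_event(component: str, event: str, trace_id: str,
              parent_span_id: Optional[str] = None,
              level: str = "INFO", **fields: Any) -> dict:
    """Emit one JSON log line and return the record so callers can nest spans."""
    timestamp = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    record = {
        "timestamp": timestamp.replace("+00:00", "Z"),
        "level": level,
        "component": component,
        "event": event,
        "trace_id": trace_id,
        "span_id": new_id("sp"),
    }
    if parent_span_id is not None:
        record["parent_span_id"] = parent_span_id
    record.update(fields)
    print(json.dumps(record))
    return record


trace_id = new_id("tr")  # one trace for the whole issue workflow
claim_rec = log_event("issue_tracker", "claimed", trace_id, agent="claude")
child_rec = log_event("queue", "message_claimed", trace_id,
                      parent_span_id=claim_rec["span_id"], agent="claude")
```

Both records share the `trace_id`, and the queue record points back at the issue claim through `parent_span_id`, which is what makes a per-workflow `jq` filter possible.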

Query examples:

# Find all events for a failed workflow
jq 'select(.trace_id == "tr_01ARZ3NDEK")' logs/*.jsonl

# Error count for one agent in today's log
jq -s '[.[] | select(.level == "ERROR" and .agent == "claude")] | length' logs/$(date +%Y-%m-%d).jsonl

# Slowest queue operations
jq -s 'map(select(.component == "queue" and .duration_ms != null)) | sort_by(.duration_ms) | reverse | .[0:10]' logs/*.jsonl

# Token spending by model
jq -s 'group_by(.metadata.model) | map({model: .[0].metadata.model, total: map(.amount) | add})' logs/token-*.jsonl

P1: Core Features

Deliverable	Description	Dependencies
Queue introspection	Debug stalled workflows, message tracing	Queue adapter
Checkpoint protocol	`/checkpoint save/restore` commands	REPL infrastructure
Exploration branching	Parallel worktrees per approach	Git worktrees
Agent handoff	Structured work transfer between agents	Issue tracker, queues

Queue Introspection

Essential for debugging stalled workflows before the full dashboard exists:

# Queue status overview
queue status
# Output:
#   requests/: 3 pending (oldest: 2min ago)
#   processing/: 1 active (agent: claude, claimed: 45s ago)
#   responses/: 12 today
#   errors/: 0

# Trace a specific message through the system
queue trace <message-id>
# Output:
#   01ARZ3... created    2026-01-15T10:30:00Z
#   01ARZ3... claimed    2026-01-15T10:30:02Z  by claude
#   01ARZ3... completed  2026-01-15T10:30:45Z  duration: 43s

# Find stuck messages
queue stuck --threshold=5m
# Output:
#   processing/01ARZ3... claimed 8m ago by amp (likely stalled)

# Watch queue activity in real-time
queue watch
# Output:
#   [10:30:01] ← claude submitted req_001
#   [10:30:02] → amp claimed req_001
#   [10:30:45] ✓ amp completed req_001 (43s)

# Dump queue state for debugging
queue dump --format=json > queue-state.json
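
The check behind a command like `queue stuck` could be as simple as the sketch below. It is not the actual CLI implementation. It assumes the claiming agent touches the file at claim time, since a rename on its own leaves the file's modification time unchanged; `find_stuck` and the file names are hypothetical.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Optional


def find_stuck(processing_dir: Path, threshold_seconds: float,
               now: Optional[float] = None) -> list:
    """Return (filename, age in seconds) for claims older than the threshold, oldest first."""
    now = time.time() if now is None else now
    stuck = []
    for path in processing_dir.glob("*.json"):
        age = now - path.stat().st_mtime  # assumption: mtime was set at claim time
        if age > threshold_seconds:
            stuck.append((path.name, age))
    return sorted(stuck, key=lambda item: item[1], reverse=True)


processing = Path(tempfile.mkdtemp())
old = processing / "msg-old.json"
fresh = processing / "msg-fresh.json"
old.write_text("{}")
fresh.write_text("{}")

now = time.time()
os.utime(old, (now - 480, now - 480))  # simulate a claim made 8 minutes ago
os.utime(fresh, (now - 45, now - 45))  # simulate a claim made 45 seconds ago

stuck = find_stuck(processing, threshold_seconds=300, now=now)
```

The same scan, followed by a rename back into `requests/`, would implement the claim timeout from the concurrency strategy.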

P2: Integration

Deliverable	Description	Dependencies
Formal specifications	TLA+/Alloy specs for queue protocol, published	Queue adapter
CLASSic metrics	Cost, latency, accuracy, stability, security	Cost metrics
Multi-agent review	Adversarial pass with different agents	Queue adapter
Dashboard	Real-time economy and work visualization	All above

Formal Specifications

Publish TLA+/Alloy specs alongside implementation:

specs/
├── queue-protocol.tla      # Queue state machine, message ordering
├── token-exchange.tla      # Token invariants, no negative balances
├── agent-handoff.tla       # Issue state transitions, no orphans
├── agent-handoff.als       # Handoff deadlock check (Alloy)
├── concurrency.tla         # Lock-free operations, no deadlocks
└── README.md               # How to run TLC model checker

Key properties to verify:

Property	Spec	Tool
Message ordering preserved	queue-protocol.tla	TLC
No token double-spend	token-exchange.tla	TLC
Issues never orphaned	agent-handoff.tla	TLC
Lock-free queue operations	concurrency.tla	TLC
No deadlock in handoff	agent-handoff.als	Alloy

Benefits:

Catches edge cases before implementation
Executable documentation of invariants
Confidence in concurrent operations
Differentiator vs other frameworks

Research Questions

What decomposition granularity optimizes agent effectiveness?
How should token rewards align with actual value delivered?
What checkpoint frequency best balances recovery time against overhead?
How do we measure coordination quality (not just individual performance)?
What security model prevents malicious agent behavior in shared queues?

Success Criteria

Metric	Target	Measurement
Agent handoffs	10+ per day	Issue tracker logs
Cost per task	-30% vs baseline	Token exchange ledger
Recovery time	<5 min from checkpoint	Manual testing
Parallel explorations	3+ concurrent	Worktree count
Framework independence	3+ TCA types	Queue adapter compatibility

Timeline

Week	Focus	Deliverables
1-2	Issue ↔ Token bridge	Integration scripts, hook setup
3-4	Queue adapter	Universal TCA interface
5-6	Checkpoint protocol	Save/restore commands
7-8	Exploration branching	Worktree management
9-10	Metrics and dashboard	CLASSic integration
11-12	Documentation and refinement	Team onboarding

Related External Research

Multi-Agent Orchestration

LangGraph: State graphs, supervisor patterns
CrewAI: Role-based teams (Manager/Worker/Researcher)
Microsoft Agent Framework: AutoGen + Semantic Kernel merger

Protocol Standards

MCP (Model Context Protocol): Anthropic's tool integration standard
A2A (Agent-to-Agent): Google's inter-agent protocol
Agentic AI Foundation: Linux Foundation governance

Evaluation

AgentBench: 8 interactive environments
GAIA: 466 real-world tasks
CLASSic: Enterprise dimensions (ICLR 2025)

Economics

ASI Alliance: Fetch.ai + SingularityNET + Ocean ($9.2B)
Agent Exchange (AEX): RTB-inspired auction framework

Appendix: Component Status

Component	Maturity	Repository
Issue tracker	Production	git-native JSONL
Token exchange	Prototype	Guile Scheme (~1100 LOC)
Queue system	Production	File-based JSON
REPL infrastructure	Prototype	ClojureScript
Checkpoint system	Conceptual	Git worktrees

Contact

Questions or suggestions: j@wal.sh