CPRR Methodology
Conjecture-Proof-Refutation-Refinement
1. Introduction
CPRR (Conjecture-Proof-Refutation-Refinement) is a research methodology that treats hypotheses as first-class artifacts with explicit lifecycle management, evidence tracking, and refutation gates.
10. Example Workflow
- Conjecture: "Algorithm X has O(n log n) complexity"
- Evidence: Benchmark data, theoretical analysis
- Refutation attempt: Edge case with O(n²) behavior
- Refinement: "Algorithm X has O(n log n) for sorted input"
- New conjecture: Refined hypothesis
- Proof: Formal verification or exhaustive testing
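The workflow above can be sketched as a small state machine. This is an illustrative sketch only: the `Conjecture` class, the `Status` values, and the method names are assumptions made for this example, not part of any CPRR tool.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    OPEN = "open"
    REFUTED = "refuted"
    PROVEN = "proven"


@dataclass
class Conjecture:
    """Hypothetical record for one hypothesis and its evidence trail."""
    statement: str
    status: Status = Status.OPEN
    evidence: List[str] = field(default_factory=list)
    parent: Optional["Conjecture"] = None  # the conjecture this one refines

    def add_evidence(self, note: str) -> None:
        self.evidence.append(note)

    def refute(self, counterexample: str) -> None:
        # A successful refutation attempt closes the conjecture.
        self.evidence.append(f"counterexample: {counterexample}")
        self.status = Status.REFUTED

    def refine(self, new_statement: str) -> "Conjecture":
        # Refinement produces a new conjecture linked to the refuted one.
        if self.status is not Status.REFUTED:
            raise ValueError("only a refuted conjecture can be refined")
        return Conjecture(statement=new_statement, parent=self)

    def prove(self, proof: str) -> None:
        if self.status is not Status.OPEN:
            raise ValueError("only an open conjecture can be proven")
        self.evidence.append(f"proof: {proof}")
        self.status = Status.PROVEN


def run_example() -> Conjecture:
    """Replay the six workflow steps listed above."""
    c1 = Conjecture("Algorithm X has O(n log n) complexity")
    c1.add_evidence("benchmark data")
    c1.add_evidence("theoretical analysis")
    c1.refute("edge case with O(n^2) behavior")
    c2 = c1.refine("Algorithm X has O(n log n) for sorted input")
    c2.prove("formal verification or exhaustive testing")
    return c2
```

Keeping the refuted conjecture reachable through `parent` preserves the history of refinements, so the counterexample that motivated the narrower claim is not lost.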
11. Applications
- Research hypothesis management
- Experimental software development
- Formal methods integration
- Multi-agent coordination
- Knowledge base maintenance
12. Multi-Agent Team Patterns
The four-agent pattern recurs across the literature. Each framework names the roles differently but the structure converges:
| Framework | Role 1 | Role 2 | Role 3 | Role 4 | Year |
|---|---|---|---|---|---|
| CPRR | Conjecture | Proof | Refinement | Refutation | 2026 |
| Debate (Du et al.) | Proposer | Challenger | Judge | Arbiter | 2023 |
| ChatEval | Prosecutor | Defender | Judge | Recorder | 2023 |
| MetaGPT | Architect | Engineer | QA | PM | 2023 |
| CrewAI | Researcher | Writer | Reviewer | Editor | 2024 |
| CAMEL | Instructor | Assistant | Critic | User | 2023 |
| AutoGen | Planner | Coder | Executor | Critic | 2023 |
| DeerFlow | Coordinator | Researcher | Coder | Reporter | 2025 |
The canonical decomposition:
- Architect / Planner — decides what to build
- Builder / Coder — writes the implementation
- Reviewer / Critic — finds problems
- Evaluator / Judge — accepts or rejects
CPRR differs from the others in two ways: the refutation gate can update the spec (the contract is mutable), and the evidence is persisted in git notes rather than ephemeral agent memory.
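The mutable-contract idea can be illustrated with a minimal sketch. Everything here is hypothetical: `Spec`, `run_round`, and the three callables stand in for the builder, reviewer, and refinement roles and are not taken from any of the frameworks in the table.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Spec:
    """Hypothetical spec whose text may change after a refutation."""
    text: str
    history: List[str] = field(default_factory=list)  # earlier versions


def run_round(
    spec: Spec,
    build: Callable[[str], str],
    review: Callable[[str, str], Optional[str]],
    refine: Callable[[str, str], str],
) -> bool:
    """One round: build against the spec, then pass the refutation gate.

    Returns True if the artifact is accepted. On rejection the spec itself
    is rewritten (the mutable contract) and the old text is kept in history.
    """
    artifact = build(spec.text)
    objection = review(spec.text, artifact)  # None means no refutation found
    if objection is None:
        return True
    spec.history.append(spec.text)
    spec.text = refine(spec.text, objection)
    return False
```

In a fixed-contract pipeline a rejection would send only the artifact back to the builder; in this sketch the rejection changes the spec that the next round is built against.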
12.1. Dot-product decomposition
When a spec has N requirements and M implementations (e.g. the compliance harness: 13 requirements × 10 languages), the work is the dot product of two dimensions: one cell per (requirement, implementation) pair, or 130 cells in the harness example. The orchestrator parallelizes across languages (columns) but serializes across requirements (rows), because the gate pipeline order is normative.
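One possible reading of this schedule, sketched with the standard-library thread pool: rows run strictly in order, and the cells within a row run concurrently. The function name `run_grid` and the `check` callback are assumptions for illustration. The sketch also finishes each row for all languages before starting the next, which is stricter than the text requires; independent per-language pipelines that each walk the rows in order would satisfy it too.

```python
from concurrent.futures import ThreadPoolExecutor


def run_grid(requirements, languages, check):
    """Run check(requirement, language) for every cell of the grid.

    Rows (requirements) run serially; within a row, the columns
    (languages) run concurrently.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(languages))) as pool:
        for req in requirements:  # serial: gate order is normative
            # pool.map yields results in input order; list() waits
            # until every language in this row has finished.
            outcomes = list(pool.map(lambda lang: check(req, lang), languages))
            for lang, outcome in zip(languages, outcomes):
                results[(req, lang)] = outcome
    return results
```

With 13 requirements and 10 languages, `run_grid` returns a dictionary of 130 entries keyed by `(requirement, language)`.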
12.2. Persistence strategies
| Strategy | Framework | Survives compaction? |
|---|---|---|
| Ephemeral memory | AutoGen, CrewAI | No |
| JSON files | MetaGPT | Yes (manual) |
| Database | DeerFlow | Yes |
| git commits | CPRR (proof) | Yes |
| git notes | CPRR (refutation) | Yes |
| beads (bd) | Walsh workflow | Yes |
| aq gossip | Walsh workflow | Advisory only |
| PreCompact hook | Claude Code | Yes |
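As a sketch of the git notes row, the helper below builds (but does not run) a `git notes append` command that attaches a refutation record to a commit. The notes ref name `cprr` and the JSON payload shape are assumptions for illustration; the source does not specify how CPRR lays out its notes.

```python
import json


def refutation_note_command(commit, refutation, ref="cprr"):
    """Return the argv for appending a refutation record as a git note.

    `ref` and the JSON schema are hypothetical; only the git notes
    subcommand and its --ref / -m options are standard git.
    """
    payload = json.dumps(refutation, sort_keys=True)
    return ["git", "notes", "--ref", ref, "append", "-m", payload, commit]
```

Inside a repository the returned list can be passed to `subprocess.run(cmd, check=True)`. Using `append` rather than `add` lets several refutation records accumulate on the same commit.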
12.3. Related
- Bot Compliance Orchestrator Pattern — the Walsh-Research 10-language harness
- Agent Memory Systems — how agents persist state across sessions
- REPL-Driven Flight Tracking — the methodology applied to data exploration
- Du et al. Improving Factuality and Reasoning in Language Models through Multiagent Debate. arXiv:2305.14325 (2023).
- Chan et al. ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate. arXiv:2308.07201 (2023).
13. Resources
- Lakatos, "Proofs and Refutations" (1976)
- Popper, "The Logic of Scientific Discovery" (1959)
- Du et al., "Improving Factuality and Reasoning in Language Models through Multiagent Debate" (2023)
- Hong et al., "MetaGPT: Meta Programming for Multi-Agent Collaborative Framework" (2023)