CPRR Methodology
Conjecture-Proof-Refutation-Refinement

1. Introduction
2. Core Workflow
3. State Machine
4. Conjecture Structure
5. Evidence Types
6. Integration with Issue Tracking
7. Refutation Gates
8. CPRR vs Traditional Approaches
9. Tooling Requirements
10. Example Workflow
11. Applications
12. Multi-Agent Team Patterns
13. Resources

1. Introduction

CPRR (Conjecture-Proof-Refutation-Refinement) is a research methodology that treats hypotheses as first-class artifacts with explicit lifecycle management, evidence tracking, and refutation gates.

6. Integration with Issue Tracking

8. CPRR vs Traditional Approaches

10. Example Workflow

Conjecture: "Algorithm X has O(n log n) complexity"
Evidence: Benchmark data, theoretical analysis
Refutation attempt: Edge case with O(n²) behavior
Refinement: "Algorithm X has O(n log n) for sorted input"
New conjecture: The refined hypothesis re-enters the workflow as a conjecture in its own right
Proof: Formal verification or exhaustive testing
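
The steps above can be sketched as a small data model. This is only an illustration, not CPRR's actual schema: the `Conjecture` class, the `Status` names, and the `refute` / `refine` / `prove` helpers are hypothetical, chosen to mirror the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    # Hypothetical state names; the real state machine may differ.
    OPEN = "open"
    REFUTED = "refuted"
    PROVEN = "proven"


@dataclass
class Conjecture:
    statement: str
    evidence: list[str] = field(default_factory=list)
    status: Status = Status.OPEN
    parent: Optional[Conjecture] = None  # the conjecture this one refines

    def refute(self, counterexample: str) -> None:
        """Record a counterexample and mark the conjecture refuted."""
        self.evidence.append(f"counterexample: {counterexample}")
        self.status = Status.REFUTED

    def refine(self, narrower_statement: str) -> Conjecture:
        """Create a new, narrower conjecture linked to the refuted one."""
        if self.status is not Status.REFUTED:
            raise ValueError("only a refuted conjecture is refined")
        return Conjecture(statement=narrower_statement, parent=self)

    def prove(self, proof: str) -> None:
        """Record the proof as evidence and mark the conjecture proven."""
        self.evidence.append(f"proof: {proof}")
        self.status = Status.PROVEN


original = Conjecture(
    "Algorithm X has O(n log n) complexity",
    evidence=["benchmark data", "theoretical analysis"],
)
original.refute("edge case with O(n^2) behavior")
refined = original.refine("Algorithm X has O(n log n) for sorted input")
refined.prove("formal verification")
```

Keeping the `parent` link means the refuted conjecture and its counterexample stay on record instead of being overwritten by the refinement.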

11. Applications

Research hypothesis management
Experimental software development
Formal methods integration
Multi-agent coordination
Knowledge base maintenance

12. Multi-Agent Team Patterns

The four-agent pattern recurs across the literature. Each framework names the roles differently, but the structure converges:

Framework	Role 1	Role 2	Role 3	Role 4	Year
CPRR	Conjecture	Proof	Refinement	Refutation	2026
Debate (Du et al.)	Proposer	Challenger	Judge	Arbiter	2023
ChatEval	Prosecutor	Defender	Judge	Recorder	2023
MetaGPT	Architect	Engineer	QA	PM	2023
CrewAI	Researcher	Writer	Reviewer	Editor	2024
CAMEL	Instructor	Assistant	Critic	User	2023
AutoGen	Planner	Coder	Executor	Critic	2023
DeerFlow	Coordinator	Researcher	Coder	Reporter	2025

The canonical decomposition:

Architect / Planner – decides what to build
Builder / Coder – writes the implementation
Reviewer / Critic – finds problems
Evaluator / Judge – accepts or rejects

CPRR differs from the others in two ways: the refutation gate can update the spec (the contract is mutable), and the evidence is persisted in git notes rather than ephemeral agent memory.
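
The first difference, a gate that may rewrite the spec, can be sketched as a build-review-gate loop. This is a sketch under stated assumptions: `Spec`, `run_rounds`, and the three toy role functions are hypothetical names, and a real gate would judge recorded evidence rather than a string match.

```python
from typing import Callable, Optional

Spec = dict[str, str]  # hypothetical: requirement id -> requirement text


def run_rounds(
    spec: Spec,
    build: Callable[[Spec], str],
    review: Callable[[Spec, str], list[str]],
    gate: Callable[[Spec, list[str]], tuple[bool, Optional[Spec]]],
    max_rounds: int = 3,
) -> tuple[bool, Spec]:
    """Loop build -> review -> gate until accepted or rounds run out."""
    for _ in range(max_rounds):
        artifact = build(spec)
        problems = review(spec, artifact)
        accepted, revised = gate(spec, problems)
        if accepted:
            return True, spec
        if revised is not None:
            spec = revised  # the contract is mutable: the gate rewrote it
    return False, spec


# Toy roles, for illustration only.
def build(spec: Spec) -> str:
    return f"implementation of {sorted(spec)}"


def review(spec: Spec, artifact: str) -> list[str]:
    return [] if "sorted input" in spec["R1"] else ["O(n^2) edge case"]


def gate(spec: Spec, problems: list[str]) -> tuple[bool, Optional[Spec]]:
    if not problems:
        return True, None
    return False, {**spec, "R1": spec["R1"] + " for sorted input"}


accepted, final_spec = run_rounds({"R1": "O(n log n)"}, build, review, gate)
```

In a fixed-contract pipeline the gate could only accept or reject; here a rejection can also return a narrower spec that the next round is built against.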

12.1. Dot-product decomposition

When a spec has N requirements and M implementations (e.g. the compliance harness: 13 requirements × 10 languages), the work is the product of two dimensions: an N × M grid with one cell per requirement-language pair (130 cells in the harness; strictly a Cartesian product rather than a dot product). The orchestrator parallelizes across languages (columns) but serializes across requirements (rows) because the gate pipeline order is normative.
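
A minimal sketch of that scheduling rule, assuming a thread pool and a hypothetical `check(requirement, language)` callable; the requirement and language names below are placeholders, not the harness's real identifiers.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_grid(
    requirements: list[str],
    languages: list[str],
    check: Callable[[str, str], bool],
) -> dict[str, dict[str, bool]]:
    """Run rows in order; run the cells of each row concurrently."""
    results: dict[str, dict[str, bool]] = {}
    with ThreadPoolExecutor(max_workers=len(languages)) as pool:
        for req in requirements:  # serial: the gate order is normative
            # Parallel across languages; map() blocks until the row is done,
            # so the next requirement never starts early.
            outcomes = list(pool.map(lambda lang: check(req, lang), languages))
            results[req] = dict(zip(languages, outcomes))
    return results


requirements = [f"R{i}" for i in range(1, 14)]  # 13 placeholder requirements
languages = [f"lang{j}" for j in range(1, 11)]  # 10 placeholder languages
results = run_grid(requirements, languages, lambda req, lang: True)
cells = sum(len(row) for row in results.values())
```

Each row acts as a barrier: all ten language cells for one requirement finish before any cell of the next requirement is scheduled.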

12.2. Persistence strategies

Strategy	Framework	Survives compaction?
Ephemeral memory	AutoGen, CrewAI	No
JSON files	MetaGPT	Yes (manual)
Database	DeerFlow	Yes
git commits	CPRR (proof)	Yes
git notes	CPRR (refutation)	Yes
beads (`bd`)	Walsh workflow	Yes
aq gossip	Walsh workflow	Advisory only
PreCompact hook	Claude Code	Yes
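
As an illustration of the git notes row, the sketch below builds a `git notes` command that attaches a refutation record to a commit. The notes ref name `cprr-refutations` and the record fields are assumptions, not part of CPRR as described here; only the `git notes --ref=... add -f -m ...` invocation is standard git.

```python
import json
import subprocess

NOTES_REF = "cprr-refutations"  # hypothetical notes namespace


def note_command(commit: str, record: dict) -> list[str]:
    """Build the git argv that attaches `record` to `commit` as a note."""
    payload = json.dumps(record, sort_keys=True)
    # -f overwrites any existing note on the same commit in this namespace.
    return ["git", "notes", f"--ref={NOTES_REF}", "add", "-f", "-m", payload, commit]


def attach_refutation(commit: str, record: dict) -> None:
    """Run the command; must be called inside a git repository."""
    subprocess.run(note_command(commit, record), check=True)


# Hypothetical record fields, shown without touching a repository.
cmd = note_command("HEAD", {"conjecture": "C-1", "verdict": "refuted"})
```

The note can be read back with `git notes --ref=cprr-refutations show <commit>`. Because it lives in the repository rather than in an agent's context window, it is still there after the context is compacted.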

12.3. Related

Bot Compliance Orchestrator Pattern – the Walsh-Research 10-language harness
Agent Memory Systems – how agents persist state across sessions
REPL-Driven Flight Tracking – the methodology applied to data exploration
Du et al. Improving Factuality and Reasoning in Language Models through Multiagent Debate. arXiv:2305.14325 (2023).
Chan et al. ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate. arXiv:2308.07201 (2023).

13. Resources

Lakatos, "Proofs and Refutations" (1976)
Popper, "The Logic of Scientific Discovery" (1959)
Du et al., "Improving Factuality and Reasoning in Language Models through Multiagent Debate" (2023)
Hong et al., "MetaGPT: Meta Programming for Multi-Agent Collaborative Framework" (2023)
