Time-Travel Chat: Forking Conversation Trees from a Checkpoint
Table of Contents
- 1. Overview
- 2. What a session actually is
- 3. The conversation is a tree, not a line
- 4. The two hard invariants
- 5. The mutation contract
- 6. What time-travel looks like
- 7. Speculation: a fork protocol
- 8. The REPL as first-class citizen
- 9. Test bed: driving a real session over tmux
- 10. Why this didn't stick
- 11. Open questions
- 12. References
These are working notes. The artifact is the REPL session: every claim below is a
form evaluated against real transcripts, with the output kept as evidence. The
tooling lives in src/wal_sh/site/chat.clj; the exploration ran in the babashka
www-repl. Part of the REPL-driven series — see also
REPL-driven compliance and
REPL-driven flight tracking.
1. Overview
The central idea is time-travel chat. The context a conversation accumulates up to some point is valuable — but from that point you often want to deviate and explore a different continuation: "what if, at turn 40, I had gone the other way." So: jump back to a past checkpoint, keep everything up to it, and branch forward into an alternate conversation. The original line is untouched; you have grown a sibling future.
This is append-only, which makes it the easy case. You never rewrite the past —
you reuse a prefix and add a new branch. That is git's checkout <old-sha> &&
branch, and it inherits git's safety: immutable history, cheap branches via
structural sharing. (Rewriting or deleting the past — a more granular compaction —
is the harder cousin; it gets its own section at the end.)
The intuition is familiar from interactive fiction: Choose Your Own Adventure books, branching screenplays, Netflix's Bandersnatch — go back to a decision point and take the other path. Or, less tidily, getting to replay a conversation with a friend and steer it somewhere new. The catch — and it is the through-line of these notes — is that those formats are notoriously hard to author and hard to follow, which is exactly why this idea resists becoming a usable tool.
A caveat up front, because it shapes the scope. This is unlikely to become a tool people drive by hand: the harness TUI does not expose the tree, and once sub-agents enter the picture the branches are hard to track. So the value is the data structure and the contract, not a product. We treat the REPL as the interface and the spec as the deliverable.
1.1. Framing: rehydration, not rewind
Mechanically this is rehydration / resurrection — the conversational form of the
snapshot-and-restore pattern from the computer-use work, where an agent's world is
serialized and brought back to life to resume. Here the transcript prefix is the
serialized state, the resumption point is a :uuid, and "forking forward"
rehydrates that state into a live session that then goes somewhere new. The same
shape recurs across the site: resurrecting Unix V4 from a saved disk image, and the
reversible pipeline's "load a URL, the state hydrates." The novelty is only the
medium — the state being rehydrated is a context window, and there can be many
live resurrections of the same past, diverging.
2. What a session actually is
First, stop assuming it is a list of messages. Parse one and count types.
(require '[wal-sh.site.chat :as chat] :reload) (def es (chat/read-session (first (chat/list-sessions)))) (frequencies (map :type es))
{"assistant" 2547, "user" 1105, "attachment" 586,
"mode" 368, "permission-mode" 368, "last-prompt" 367, "ai-title" 367,
"queue-operation" 360, "system" 209, "file-history-snapshot" 200}
;; 6477 entries; only user+assistant (3652) are messages.
Most lines are sidecars: UI mode toggles, queue operations, attachments,
file-history snapshots, AI-generated titles. They are out-of-band — they annotate
the session but are not part of the conversation the model sees. First contract
rule: the message corpus is #{user assistant}; everything else is metadata.
3. The conversation is a tree, not a line
Each message carries :uuid (a content id — the "sha" you would fork from),
:parentUuid (the edge), and :isSidechain (sub-agent threads). Rewinds and
edits create siblings, so the shape is a tree — the same shape git uses
(content-addressed nodes + parent pointers).
(chat/tree-shape es) ;; => {:nodes 4447, :roots 3, :leaves 14, :branch-points 11, :sidechains 0} (mapv :type (chat/roots es)) ;; => ["user" "system" "system"]
Eleven forks and fourteen live tips in a single session — the abandoned branches
of every rewind. The conversation root is the lone user root; the two system
roots are sidecar fragments.
3.1. The fragmentation trap
The obvious move — filter to messages, then build the tree — is wrong. It shatters the graph:
(def msgs (filterv #(#{"user" "assistant"} (:type %)) es)) (chat/tree-shape msgs) ;; => {:nodes 3652, :roots 1, :leaves 608, :branch-points 11, :sidechains 0}
608 leaves with only 11 branch-points is impossible for a connected tree. The reason:
(let [muids (set (keep :uuid msgs)) alluids (set (keep :uuid es))] {:msg-parent->msg (count (filter #(muids (:parentUuid %)) msgs)) :msg-parent->sidecar (count (filter #(and (:parentUuid %) (not (muids (:parentUuid %))) (alluids (:parentUuid %))) msgs)) :msg-parent->missing (count (filter #(and (:parentUuid %) (not (alluids (:parentUuid %)))) msgs)) :msg-parent->nil (count (filter #(nil? (:parentUuid %)) msgs))}) ;; => {:msg-parent->msg 3055, :msg-parent->sidecar 596, :msg-parent->missing 0, :msg-parent->nil 1}
596 messages have a parent that is a sidecar, not a message. The parent chain
weaves through non-message entries. So: build the DAG over the full uuid set;
the message sequence is a projection of a root→leaf path. No edge ever dangles
(missing is 0), which is itself a contract invariant worth asserting.
4. The two hard invariants
4.1. Tool calls must pair exactly
The API rejects an assistant tool_use block without a matching user
tool_result, and vice versa. In a complete session the books balance to zero:
(let [blocks (fn [t k] (set (for [e msgs c (get-in e [:message :content]) :when (and (map? c) (= t (:type c)))] (k c)))) uses (blocks "tool_use" :id) results (blocks "tool_result" :tool_use_id)] {:tool_use (count uses) :tool_result (count results) :orphan-uses (count (clojure.set/difference uses results)) :orphan-results (count (clojure.set/difference results uses))}) ;; => {:tool_use 962, :tool_result 962, :orphan-uses 0, :orphan-results 0}
This is the constraint that makes granular mutation hard. You cannot delete the
assistant turn that emitted a tool_use without also deleting the user turn that
carried its tool_result. The atomic unit of mutation is the tool-use/result
pair — and, more conservatively, the turn — not the individual block. Compaction
sidesteps this by always cutting at a turn boundary and replacing the prefix
wholesale. Granular mutation has to honor the pairing explicitly.
4.2. Sub-agents back-reference their caller
Sidechain (sub-agent) results link back to the assistant that spawned them:
(frequencies (map #(boolean (:sourceToolAssistantUUID %)) es)) ;; => {false 5515, true 962} ; exactly the 962 tool results
So a tool result is reachable two ways — by parentUuid and by
sourceToolAssistantUUID. Both must stay consistent under mutation, which is the
concrete form of "sub-agents make it hard to track."
5. The mutation contract
Three complementary expressions, REPL-first.
5.1. clojure.spec — structural validity
Specs describe a well-formed transcript and a valid projected path. They are the precondition and postcondition of any mutation.
(require '[clojure.spec.alpha :as s]) (s/def ::uuid (s/nilable string?)) (s/def ::parentUuid (s/nilable string?)) (s/def ::type #{"user" "assistant" "system" "summary" "attachment" "mode" "queue-operation" "permission-mode" "last-prompt" "ai-title" "file-history-snapshot"}) (s/def ::entry (s/keys :req-un [::type] :opt-un [::uuid ::parentUuid])) ;; corpus invariants --- hold before AND after a mutation (s/def ::no-dangling-parents (fn [es] (let [ids (set (keep :uuid es))] (every? #(or (nil? (:parentUuid %)) (ids (:parentUuid %))) es)))) (s/def ::unique-uuids (fn [es] (apply distinct? (keep :uuid es)))) ;; a projected root→leaf path is API-valid (s/def ::tools-balanced (fn [path] (= (count (tool-use-ids path)) (count (tool-result-ids path))))) (s/def ::roles-alternate (fn [path] (->> path (map :type) dedupe (partition 2 1) (every? (fn [[a b]] (not= a b))))))
5.2. Property-based tests — operations preserve the contract
Specs say what is valid; PBT says the operations keep it valid. Generate a
random tree, apply an operation, assert the invariants. (Run under JVM Clojure
via deps.edn's org.clojure/test.check; bb stays the exploration REPL.)
(require '[clojure.test.check.clojure-test :refer [defspec]] '[clojure.test.check.properties :as prop]) ;; forking off any uuid leaves the parent's ancestry untouched (defspec fork-preserves-ancestry 200 (prop/for-all [t gen-transcript, u gen-existing-uuid] (let [t' (fork t u)] (= (ancestry t u) (ancestry t' u))))) ;; granular drop + reparent: no dangling edges, uuids stay unique, ;; and the projected path stays tool-balanced (defspec drop-span-keeps-contract 200 (prop/for-all [t gen-transcript, span gen-turn-span] ; span = whole turns (let [t' (drop-span t span)] ; children reparented (and (s/valid? ::no-dangling-parents t') (s/valid? ::unique-uuids t') (s/valid? ::tools-balanced (project-current t')))))) ;; compaction is the degenerate case: drop-span where the span is a prefix ;; and the replacement is a single summary node. Same property, narrower input. (defspec compaction-is-a-prefix-drop 100 (prop/for-all [t gen-transcript, k gen-prefix-length] (s/valid? ::tools-balanced (project-current (compact t k)))))
The generators encode the domain: gen-turn-span only ever selects whole
turns (so the tool pairing can be preserved), which is the property test
discovering the atomic-unit rule for itself.
5.3. json-schema — dispatch on version drift
Every entry carries :version. Today it is uniform —
(frequencies (keep :version es)) ;; => {"2.1.161" 4447} (->> (chat/list-sessions) (take 60) (mapcat #(keep :version (chat/read-session %))) frequencies) ;; => {"2.1.161" 5565}
— but the schema will drift across Claude Code releases (fields appear,
summary entries change shape). So the reader should dispatch on :version
against a registry of json-schemas, rather than assume one shape. The :version
field is the dispatch key; a per-version schema validates each line and selects
the right normalizer:
(def schema-registry {"2.1.x" "schemas/claude-jsonl-2.1.json" ;; "2.2.x" "schemas/claude-jsonl-2.2.json" ; when it lands }) (defn normalize [entry] (let [v (:version entry) schema (lookup-schema schema-registry v)] (when-not (valid-against? schema entry) (throw (ex-info "unknown transcript shape" {:version v :entry entry}))) (migrate-to-canonical entry v)))
This keeps the spec/PBT layer working on one canonical shape while json-schema absorbs the version churn at the boundary.
6. What time-travel looks like
The central operation is a single append-only primitive:
fork(t, uuid)— resume fromuuid: keep its ancestry, add a new continuation. Everything up touuidis shared (structural sharing; cf. purely functional data structures and faster, simpler red-black trees); only the new branch is fresh. O(1) — git's cheap branch. The past is immutable — you never touch the original line, you grow a sibling future beside it.
The resumption point is constrained: you may only fork from a uuid whose prefix
is a complete, tool-balanced path — you cannot rehydrate from inside a
tool_use=/=tool_result pair. So the legal fork points are exactly the turn
boundaries, the same atomic unit the pairing analysis forced.
A toy tree — time-travel to B, then go a different way:
A(user) ── B(assistant) ── C(user/tool_result) ── D(assistant) [main, untouched]
└─────────── C'(user) ── D'(assistant) [alt: a different turn 40]
fork point = B.uuid ; ancestry(D') = [D' C' B A] ; the [.. D] line is unchanged
Many such forks off the same prefix = a tree of alternate explorations that all share their history and diverge at chosen checkpoints. That is the whole feature.
6.1. The harder cousin: rewriting the past
Append-only time-travel needs nothing destructive. If you instead want to edit history — a more granular compaction — you need two destructive primitives, and they carry the heavier contract (reparenting, tool-rebalancing) from the contract section above:
drop-span(t, span)— remove whole turns; reparent survivors' children to the nearest surviving ancestor; the projected path must stay tool-balanced.compact(t, k)— the special case: drop a prefix, replace it with onesummarynode anchored byleafUuid. This is what the harness already does, coarsely.
These are strictly harder than time-travel, which is why time-travel is the place to start: it is pure addition over an immutable log.
6.2. Related art: aygp's gemini-repl
aygp-dr's gemini-repl-007 roadmap sketched a "git for conversations" CLI —
related, though its emphasis is fork-and-*rebuild*/compare rather than
reuse-prefix-then-diverge:
gemini-repl fork main-session --name "alternative-approach" gemini-repl diff session1 session2 --format markdown gemini-repl merge session2 --into session1 --strategy insights-only gemini-repl tree my-session --format mermaid # + auto_checkpoint: 5m
fork is exactly the time-travel primitive; tree is the read view from
chat.clj. merge --strategy insights-only is the open research problem — a
CRDT-shaped question (how do two divergent branches recombine without a human
picking lines), and it only matters if branches ever rejoin. For pure time-travel
they need not.
7. Speculation: a fork protocol
A concrete sketch for forking the current live chat, in git/DAG terms:
- Find HEAD. The current chat's HEAD is its live leaf —
S = (current-sha es). - Mint a ref. Generate a fresh random GUID
G. In git this is a new branch name; here it is the fork's id. - Point the ref at the commit. Write a fork node copying
SintoGand marking it:{:uuid G :forkOf S :type "fork" :origin "wal-sh-repl" :created <ts>}. That isgit branch G S— a ref at an existing commit, no content duplicated. - Back-link from HEAD. At node
S(or in a HEAD ref record) add{:forks [G …]}so the fork is discoverable from the point it departs — the reflog/notes entry, "a branch leaves here." - Resume. New turns append with
:parentUuidchaining fromSunder refG— the rehydrated, diverging future.
The mapping is exact: commit = message node (:uuid is the sha), branch ref = the
GUID, HEAD = the current leaf, checkout S && branch G = the fork. Nothing is
copied but a pointer; structural sharing does the rest. This is why it "uses git
semantics" — it is a content-addressed DAG with refs.
7.1. Don't poison the stats
These fork markers are our experiments, not Claude Code's real history. Left in,
they inflate every count — branch-points, leaves, forks. So the analysis
layer excludes anything we minted; the marker is the discriminator
(:type "fork" / :origin "wal-sh-repl"), and chat.clj filters it before any
breakdown:
(defn test-fork? [e] (or (= "fork" (:type e)) (= "wal-sh-repl" (:origin e)))) (chat/breakdown) ; real harness history only (default) (chat/breakdown chat/projects-root true) ; include our experimental forks
Two record sets, kept apart: the history the harness wrote, and the speculative branches we graft on.
8. The REPL as first-class citizen
The whole exploration is reproducible from chat.clj. The scope defaults to the
current repo's transcripts and expands on demand:
(chat/breakdown) ; this repo's chats (chat/breakdown chat/projects-root) ; every project on the host ;; => {:sessions 674, :totals {:nodes 14715 :tool-calls 3523 ;; :branch-points 86 :sidechains 9119} ...}
86 forks and 9,119 sidechains across 674 chats — the tree is already there, in bulk. Reading it is solved; the contract above is what editing it would need.
9. Test bed: driving a real session over tmux
To study real transcripts rather than mocks, the orchestrator drives a Claude Code session in a throwaway private repo and reads its JSONL from here. The loop:
gh repo create <lab> --private— a disposable repo.tmux new-session -d -s lab -c <labdir>, then sendclaude --dangerously-skip-permissions(bypass keeps the drive unblocked; the repo is empty, so the blast radius is nil). One quirk: a shell-startup prompt can eat the first keystroke, so send a clearing newline first.tmux send-keysdrives the conversation;tmux capture-panereads the TUI.- From this repo's REPL, point
chat.cljat the lab's transcript dir and read it:
(def lab "~/.claude/projects/-Users-…-chat-fork-lab") ; the encoded lab path (def es (chat/read-session (first (chat/list-sessions lab)))) (chat/tree-shape es) ;; => {:nodes 11, :roots 1, :leaves 1, :branch-points 0, :sidechains 0} ; purely linear
Two driven sessions, both kept deliberately linear (no esc esc rewind), confirm
the read path end to end: the orchestrator's prompts and the model's replies come
back verbatim through chat.clj, contract trivially satisfied (0 tool-calls →
balanced; one root, one leaf → no forks yet). Forking is the v2 move that turns
branch-points positive.
One driven turn asked the lab to model the running succ=/=add example in Racket
with a contract boundary — the artifact form again:
(define zero 'z) (define (succ n) (cons 's n)) (define (add a b) (cond [(eq? a 'z) b] ; add-zero: 0 + b = b [else (succ (add (cdr a) b))])) ; add-succ: (succ a) + b = succ (a + b) ;; (add (succ (succ zero)) (succ zero)) => '(s s s . z) = succ(succ(succ 0)) ;; a contract-out boundary keeps non-Peano values out of add from outside the module
Contract-as-module-boundary is Racket's signature; the lineage runs through the RacketCon talks catalogued here — RacketCon 2024 (UW) and RacketCon 2022 (Brown). This tmux-plus-live-agent drive is itself a form of REPL-driven development, but the durable interface stays the Clojure REPL reading the transcript.
9.1. v1: an implicit fork that resurrects
The fork is not hypothetical. chat.clj/fork-session writes a new session whose
history is the file prefix up to a chosen :uuid, under a fresh id; the source is
untouched (append-only). Forking the Lodestar session at its codename turn
(8cc36543) produced a 6-node prefix; claude -r <new-id> resumed it — the
harness loaded the forged branch as its own session, rehydrated the pre-fork
context (it recalled the codename "Lodestar"), and continued a new direction (a
second codename, "Sextant", for the branch). Reading both transcripts back:
original path : 19 nodes root .. Lodestar
fork path : 20 nodes root .. Sextant
shared prefix : 6 nodes ending at the fork point 8cc36543
(then 62a7353c on the original, 1d9b708d on the fork)
So you can implicitly fork Claude's history and step back to a prior point: forge
a transcript that ends at the fork, give it a new id, claude -r it. The harness
need not know it is a fork — it loads any valid <id>.jsonl. The two sessions
share the prefix uuids (structural sharing on disk) and diverge after the fork
point: a version DAG, built by the REPL. This is the v1 of the capability —
read, fork, resurrect — with the REPL as the whole interface.
10. Why this didn't stick
The contract holds; the product does not. The data structure is real — 86 forks and 9,119 sidechains sit in the transcript log already, append-only, content-addressed, free. What is missing is a reason for a human to drive it. Below is the case for shelving, from the harness side.
10.1. Forks are a fear, not a feature
Branching is the model developers already avoid. Git branches are the canonical
example: cheap, safe, structurally shared — and still the thing most engineers
misuse, abandon stale, or merge in a panic. A conversation tree inherits every one
of those failure modes and adds none of git's tooling. If git branch needed a
decade of porcelain to become bearable, a hand-driven chat fork starts from zero.
10.2. A tree does not fit a terminal
The harness is a TUI, and a TUI renders a tree poorly. Linear scrollback is the medium. Put eleven fork points and fourteen live tips in a single session — the real numbers from one transcript — and you lose your place on the first jump back. Context-switching between siblings carries the full cost of rereading a divergent prefix, and sidechains multiply the branch count past what anyone holds in working memory. The guile-sage work taught the inverse lesson: the REPL is bearable precisely because it is one line of evaluated state, not a graph the operator must navigate.
10.3. The Bandersnatch trap
The analogy that sells the idea also condemns it. Choose Your Own Adventure, branching screenplays, Bandersnatch — interactive fiction never went mainstream because branching narrative explodes on both ends: authoring cost and consumption cost compound with every node. Netflix stepped back. A branching chat is the same combinatorial trap with no author and a reader who is also the writer.
10.4. The infra is real and the payoff is not
To make it hand-usable you would need ref management, a genuine tree view, context-window diffing between siblings, and resumption-point validation (only tool-balanced turn boundaries are legal fork points). That is a second product bolted onto the harness. The payoff is "what if turn 40 went the other way" — answered more cheaply by starting a new session. The harness exposes none of this, and that is the correct decision. The deliverable is the spec and the REPL. There is no tool here.
10.5. The harness already ships the minimal version
A full tree explorer never shipped, but a partial one is already in the harness.
esc esc rewinds "up a few messages" — time-travel to an earlier checkpoint you
then continue differently. Ctrl+O expands collapsed lines: the partial-
persistence read view. And the boundary shows in the wild — /compact can fail
with Conversation too long … press esc twice to go up a few messages and try
again, which is the partial→full edge surfacing, its suggested fix literally
"travel back and diverge." The implicit UI exists; what is missing is only the
explicit tree — which, per the section above, is the part that does not pay for
itself.
There is even a CLI surface for the primitives. claude -r <uuid> --fork-session
resumes a named session and branches it ("fork instead of mutate") — the shipped
form of fork(t, uuid); claude --session-id $(uuidgen) mints a chosen GUID (the
fork-protocol's "new GUID" step); claude -c / claude -r <id> resume the current
or a named conversation (partial persistence); and /compact plus the PreCompact
hook are the coarse-mutation cousin. The primitives ship; only the tree view does
not. Catalogued in the Claude Code workshop and the cross-CLI
coding-agents review (whose non-Claude flag claims are point-in-time and need
dated re-verification — see annotation systems).
11. Open questions
- Merge semantics.
insights-onlymerge across branches is undefined. The CRDT framing is promising but the "insight" unit is fuzzy. - Summary fidelity. Granular
drop-spanloses information the model had; when must a dropped span leave a summary stub (like compaction) versus vanish? - Sub-agent re-attachment. If a turn that spawned a sub-agent is dropped, what happens to the sidechain — orphan, cascade-delete, or re-root?
- Live vs archival. Mutating
~/.claudeunder a running harness is unsafe; mutations should target a copy, and only a stopped session is fair game.
11.1. Speculation: Forking Conversations as a Persistence Ladder
Speculative — under team review. The sections above rest on REPL evidence; this is a conjecture about where confluence becomes tractable.
11.1.1. Thesis: the harness gets confluence
Every shipped chat-branching feature (ChatGPT, AI Studio "Branch from here", Gemini app rollout, Claude edit-earlier-turn) is full persistence and stops there: fork from any node, parent stays intact, future diverges. None is confluent — there is no merge. The reason is not product timidity; prose has no merge function that preserves anything falsifiable.
The claim: confluence is tractable exactly when the branches emit typed, refutable artifacts instead of prose. That is the difference between a chat UI and a grind harness. The harness is the confluent tier the chat UIs structurally cannot reach.
11.1.2. The persistence ladder
Okasaki (Okasaki 1998) / Driscoll–Sarnak–Sleator–Tarjan (Driscoll et al. 1989) vocabulary, applied to a thread (confluent persistence is the later addition (Fiat and Kaplan 2003); persistence taxonomy per (Demaine 2021); prior art: MIT ADS chat, 2025-08):
| Tier | Read past | Write past (branch) | Join branches | Where it lives |
|---|---|---|---|---|
| Ephemeral | no | no | no | a stateless completion |
| Partial | yes | no | no | scrollback; immutable history |
| Full | yes | yes | no | every fork feature shipping in 2026 |
| Confluent | yes | yes | yes | the harness, under a merge contract |
The fork point is a version node. The carried prefix is path-copying: you do not mutate the shared spine, you copy the path from fork to root and grow a new child. Gemini's "context only up to the fork point" is the path-copy invariant stated in product language.
11.1.3. Mapping the products onto the ladder
| Feature | Tier | Carries | Merge | Hermetic fork? |
|---|---|---|---|---|
| ChatGPT branching | full | prefix | no | yes (text prefix only) |
| AI Studio "Branch from here" | full | prefix + Drive | no | leaks (see R2) |
| Gemini app (20% rollout) | full | prefix | no | leaks (past-chat reference) |
| Claude edit-earlier-turn | full | prefix | no | yes |
All four sit on the same rung. The interesting variable is not the tier but
whether the fork is a pure function of (parent_version, new_prompt).
ChatGPT/Claude forks are (modulo sampling) hermetic; the Gemini family is not, and
the tell is in the implementation: AI Studio branching requires Drive
persistence, and the consumer app separately does implicit past-chat referencing.
A branch that can pull context the prefix does not name is no longer path-copying;
it is a fork with an ambient read of mutable global state — a different algebra.
11.1.4. The version DAG
Chat UIs render everything up to and including v4a/v4b. The dashed edges and M
do not exist for them. They exist here because the leaves are artifacts, not turns.
11.1.5. The harness inversion: merge the evidence, not the conversation
In a chat the leaf is prose; merge(prose_A, prose_B) is undefined, so confluence
is "science fiction" (the MIT chat's own phrase). In a grind harness the leaf is an
artifact under a contract:
- a Lean
example :by …= that compiles, or does not - a
test.checkproperty that holds at N=1000, or shrinks to a counter-example - a registry field
:lean :statuspopulated byexact? - a TLA+ refinement that model-checks, or surfaces a trace
These do have a merge function: union under the refutation predicate.
lemmata.elenchus is the join node. Two branches that each pass their local
contract are merged iff their union violates no global invariant; the refuted
branch is pruned, not reconciled by hand. Confluent persistence is not exotic
here — it is the CPRR cycle with the fork made explicit: Conjecture forks the
version DAG, Proof/Refutation labels each leaf, the surviving leaves join.
The slogan: you never merge the conversation; you merge the evidence. The prose is the partial-persistence shell around a confluent artifact DAG.
11.1.6. Fork as a governed event, not "the agent branched"
A fork with no provenance is anthropomorphism: "the model explored option B." A fork in the harness is a ledger event carrying the governance tuple — the same append-only event shape as petclinic-lab:
{:event :fork
:ts #inst "2026-06-13T..."
:parent "v2" ; version node forked from
:branch "v3b"
:tuple "[aygp-dr:worker-B:coordinator@env(forklab:workspace)]"
:prefix-sha "…" ; hash of carried context; the path-copy witness
:note "alternate refutation path for conjecture F3"}
{:event :merge
:ts #inst "..."
:join "v5"
:inputs ["v4a" "v4b"]
:gate :elenchus
:verdict :survives ; | :refuted
:reviewer "coordinator"}
The branch point = a git worktree. The merge authority = the coordinator. The
concurrency cap (4 in petclinic-lab) is the same cap here: past ~6 parallel
forks, the rebase-after-merge fan-out stops being followable and the join verdict
stops being auditable. Branching chats and multi-agent worktree coordination are
the same structure at two scales: the chat UI exposes full persistence to one
human; the harness exposes confluent persistence to a coordinator holding a merge
contract.
11.1.7. Immutable past = soundness
Every product respects it (Gemini: context only to the fork point, the rest "left
behind"). It is the path-copying invariant, and the harness states it as: the
ledger is append-only, :seed is once-per-workspace-lifetime, no event is edited
in place. The conversational grandfather paradox — rewrite v2 and invalidate
v3..vN — is impossible by construction. Retroactivity is not forbidden by policy;
it is unrepresentable.
11.1.8. Refutation conditions
11.1.8.1. R1 — "confluence is tractable in a harness"
Refuted if the artifact merge function is not total: exhibit two branch leaves
that each satisfy their local contract but whose union violates a global invariant
with no mechanical resolution. Concretely: two Lean proofs that each compile but
against divergent axiom sets, both green, jointly inconsistent. Then elenchus
cannot decide and you are back to manual reconciliation — partial persistence
wearing a confluent mask. Mitigation: pin a single axiom environment as a merge
precondition; divergent-environment branches fail the gate, not the join.
11.1.8.2. R2 — "a chat fork is a pure function of (parent, prompt)"
Refuted for the Gemini family by its own implementation: branching needs Drive-durable state, and the app separately injects implicit past-chat context. If a branch reads context the prefix does not name, the fork is not hermetic and the worktree analogy leaks ambient state. Test: fork the same parent twice with the identical next prompt at different wall-clock times; if the two children diverge on retrieved context alone, the fork is impure. ChatGPT/Claude edit-forks should pass; Gemini app forks may not.
11.1.8.3. R3 — "you merge evidence, not prose"
Refuted if the decisive artifact of a branch is irreducibly prose (a design rationale, a threat-model narrative) with no falsifiable surface. Then the harness has no join function and degrades to full persistence. This bounds the claim: confluence is available only over the falsifiable substrate. The prose layer stays full-persistence forever; do not pretend otherwise.
11.1.9. Open questions (speculative thesis)
- Is
elenchusas a merge gate associative and commutative over the leaf set? If join order changes the verdict, "confluent" is the wrong word and it is really a sequential fold (which petclinic-lab's serial-merge coordinator already assumes — so maybe the honest model is foldable, not confluent, and the DAG is a picture we draw for intuition). - What is the right witness for
:prefix-sha? Hash of the rendered prefix is fragile to whitespace; hash of the message-id chain is stable but does not capture the sampling seed. The hermeticity claim (R2) is only as strong as this witness.
See also: 11.1.5
11.2. A tiny artifact: succ/add under contract
The smallest leaf that obeys both contracts at once — the conversation contract
(a valid transcript) and the artifact contract (a Racket -> contract on the
succ=/=add primitives). It checks that (0+1)+40+(0+1) = 42 no matter how you
group it.
11.2.1. The mock chat (contract-valid, inlinable)
Four nodes: roles alternate, one tool_use balanced by one tool_result, parent
chain unbroken, one root and one leaf.
[{:uuid "U1" :parentUuid nil :type "user" :text "succ/add: is (0+1)+40+(0+1) = 42?"}
{:uuid "A1" :parentUuid "U1" :type "assistant" :tool_use {:id "T" :name "racket" :file "succ-add.rkt"}}
{:uuid "U2" :parentUuid "A1" :type "user" :tool_result {:tool_use_id "T" :text "racket+contracts: ok"}}
{:uuid "A2" :parentUuid "U2" :type "assistant" :text "Yes --- 42, both groupings."}]
;; tool_use ids #{"T"} = tool_result ids → balanced ; root U1 ; leaf A2
11.2.2. The leaf (succ-add.rkt)
#lang racket (define (succ n) (add1 n)) (define (add* m n) (if (zero? m) n (succ (add* (sub1 m) n)))) ; Peano: recurse on m (provide (contract-out [succ (-> exact-nonnegative-integer? exact-nonnegative-integer?)] [add* (-> exact-nonnegative-integer? exact-nonnegative-integer? exact-nonnegative-integer?)])) (require rackunit) (check-equal? (add* 1 40) 41) (check-equal? (add* (add* (add* 0 1) 40) (add* 0 1)) 42) ; ((0+1)+40)+(0+1) (check-equal? (add* (add* (add* 0 1) 40) (add* 0 1)) ; = (1+40)+1 (add* (add* 1 40) 1)) (displayln "racket+contracts: ok")
Verified by the literate flow — mktemp -d, copy this org file in,
org-babel-tangle the block above to succ-add.rkt, run racket succ-add.rkt,
confirm, then discard the temp dir. No code is written under site/; the block is
the source of truth, and the captured result (2026-06-13) is:
racket+contracts: ok
The leaf either runs green or it does not — that pass/fail is the mergeable
surface the prose around it lacks. (Cross-checked: Lean4 proves the same identity
by rfl, Guile computes 42; only the Racket leaf is inlined here.)