Agent Memory Architectures: JITIR Against the Field
A decomposition exercise: Cloudflare Sessions, MemGPT/Letta, Zep/Graphiti, Mem0, A-MEM, LangMem, and what the CLI coding agents already do
1. What "memory" names
"Agent memory" is one word for at least four distinct things.
1. Conversation log: durable record of messages and tool calls.
2. Working scratchpad: mutable state the agent rewrites mid-task.
3. Learned facts: extracted assertions, indexed for retrieval.
4. Reference material: large documents loaded on demand.
The four have different write disciplines, different read patterns, and different cost profiles in the context window. Most frameworks that present a single "memory" API are quietly conflating two or more of them. The interesting question is not "which framework has the best memory" but "where does each one draw the seams, and what gets complected as a result?"
The doc below uses Cloudflare's Sessions API as the reference decomposition, characterizes the other systems against it, and then locates JITIR on a separate axis the reactive-memory field has left mostly unbuilt. The companion CLI-coding-agent survey (CLI Coding Agents Q2 2026) covers the same problem one level down, at the harness boundary, and is referenced where the parallels are direct.
2. Reference decomposition: Cloudflare Sessions
Cloudflare's Sessions API decomposes memory into four block types, each characterized by a provider contract. The contract is a set of methods on a JavaScript object; the Session detects which methods exist and synthesizes tools to match.
| Block type | Provider methods | Prompt presence | Generated tool |
|---|---|---|---|
| readonly | get() | full content | (none) |
| writable | get() + set() | full content + budget | set_context |
| searchable | get() + search() + set() | summary count | search_context |
| loadable | get() + load() + set() | metadata listing | load_context / unload |
Three properties define this decomposition:
- The provider is the contract. Capability is structural, not declared. A provider with a search() method becomes a searchable block; the search-tool generation is mechanical from that fact.
- The system prompt is the schema. Each block declares its type to the model inline, every turn, via the tags [readonly], [writable], [searchable], [loadable]. The prompt is the schema declaration the model reads.
- Every read is agent-initiated. search_context and load_context are tools the model calls. The substrate (Durable Object, R2, SQLite FTS5) does nothing on its own.
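The first two properties can be sketched in a few lines. This is a minimal Python illustration of structural capability detection, not the Sessions API itself (which the text describes as JavaScript objects). The names classify_block, render_header, Notes, and Docs are hypothetical, and the precedence among search(), load(), and set() is an assumption made for the sketch.

```python
from typing import Any


def classify_block(provider: Any) -> str:
    """Infer the block type from which methods the provider exposes."""
    def has(name: str) -> bool:
        return callable(getattr(provider, name, None))

    if not has("get"):
        raise TypeError("every provider needs get()")
    if has("search"):
        return "searchable"
    if has("load"):
        return "loadable"
    if has("set"):
        return "writable"
    return "readonly"


# Tool names follow the table above; the unload counterpart is left out
# because the sketch does not model unloading.
TOOLS = {
    "readonly": [],
    "writable": ["set_context"],
    "searchable": ["search_context"],
    "loadable": ["load_context"],
}


class Notes:
    """Hypothetical provider: get() + set() makes it a writable block."""

    def __init__(self) -> None:
        self.text = ""

    def get(self) -> str:
        return self.text

    def set(self, value: str) -> None:
        self.text = value


class Docs(Notes):
    """Hypothetical provider: adding search() makes it searchable."""

    def search(self, query: str) -> list[str]:
        return [line for line in self.text.splitlines() if query in line]


def render_header(name: str, provider: Any) -> str:
    """Tag the block inline, as the system prompt would on every turn."""
    return f"[{classify_block(provider)}] {name}"
```

Nothing in Docs declares "I am searchable"; the classification falls out of the method set, which is the sense in which capability is structural rather than declared.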
The third property is the one the rest of this document organizes around. It is the design choice that nearly every production memory system has made.
Two further moves in the Sessions API are worth naming because they separate things other systems complect:
- Compaction overlays. Older messages are summarized into overlays stored in a separate table, applied at read time. The original messages are not deleted. Identity (the conversation) is preserved; a derived value (the summary) is layered over it.
- Frozen system prompt. freezeSystemPrompt + withCachedPrompt decouple the rendered prompt from the underlying state. A set_context call writes through to the provider but does not change the cached rendered prompt until an explicit refreshSystemPrompt at a turn boundary. The write event and the read event are unbraided.
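Both moves can be shown in one small in-memory model. The sketch below is an assumption-laden Python stand-in, not Cloudflare's implementation: the Session class, its field names, and the snake_case method names are hypothetical, overlays are keyed by a (start, end) message span, and spans are assumed not to overlap.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Session:
    messages: list[str] = field(default_factory=list)      # originals, never deleted
    overlays: dict[tuple[int, int], str] = field(default_factory=dict)
    context: dict[str, str] = field(default_factory=dict)  # stands in for provider state
    _frozen: Optional[str] = None                           # cached rendered prompt

    def compact(self, start: int, end: int, summary: str) -> None:
        """Record a summary over messages[start:end]; the messages stay put."""
        self.overlays[(start, end)] = summary

    def read(self) -> list[str]:
        """Apply overlays at read time (assumes spans do not overlap)."""
        out: list[str] = []
        cursor = 0
        for (start, end), summary in sorted(self.overlays.items()):
            out.extend(self.messages[cursor:start])
            out.append(f"[summary] {summary}")
            cursor = end
        out.extend(self.messages[cursor:])
        return out

    def set_context(self, key: str, value: str) -> None:
        """Write through to state; the frozen prompt is not touched."""
        self.context[key] = value

    def refresh_system_prompt(self) -> str:
        """Re-render from current state; meant to be called at a turn boundary."""
        self._frozen = "\n".join(f"{k}: {v}" for k, v in sorted(self.context.items()))
        return self._frozen

    def system_prompt(self) -> str:
        """Return the frozen render, rendering once if nothing is cached yet."""
        if self._frozen is None:
            return self.refresh_system_prompt()
        return self._frozen
```

A write between two system_prompt() calls changes context but not the returned string; only refresh_system_prompt() makes the write visible. Likewise compact() changes what read() returns while messages stays byte-for-byte intact.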
3. The field, decomposed
3.1. MemGPT / Letta – virtual memory for context windows
The conversation log and the working scratchpad are decomposed (recall vs main context). Learned facts and reference material are both placed in archival memory, where the agent must distinguish them by its own conventions. The OS-paging metaphor describes the contract, not the storage shape.
Cost of the metaphor: page-out is voluntary. There is no kernel that swaps an unused page when memory pressure is high. The model has to have the discipline to write to archival memory before the relevant fact leaves the FIFO window. Failure mode is silent forgetting that reads as fluency.
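The failure mode is easy to reproduce in miniature. The sketch below is a toy model under stated assumptions, not MemGPT or Letta code: PagedContext and its methods are hypothetical names, only main context and archival memory are modeled (recall storage is left out), and a bounded deque stands in for the FIFO window.

```python
from collections import deque


class PagedContext:
    def __init__(self, window: int) -> None:
        self.main: deque[str] = deque(maxlen=window)  # FIFO main context
        self.archival: list[str] = []                 # survives eviction

    def observe(self, message: str) -> None:
        """Append to main context; when full, the oldest entry drops silently."""
        self.main.append(message)

    def archive(self, fact: str) -> None:
        """Voluntary page-out: happens only if the model chooses to call it."""
        self.archival.append(fact)

    def recallable(self, needle: str) -> bool:
        """True if the needle is still in main context or archival memory."""
        return any(needle in m for m in list(self.main) + self.archival)


# Without the discipline to archive, the fact is gone and nothing signals it.
forgetful = PagedContext(window=2)
forgetful.observe("deploy key is K1")
forgetful.observe("unrelated chatter")
forgetful.observe("more chatter")

# With an explicit archive() call before eviction, the fact survives.
careful = PagedContext(window=2)
careful.observe("deploy key is K1")
careful.archive("deploy key is K1")
careful.observe("unrelated chatter")
careful.observe("more chatter")
```

No exception or warning is raised in the forgetful case; observe() succeeds every time, which is why the loss reads as fluency rather than as an error.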
3.2. Zep / Graphiti – bi-temporal knowledge graph
Each edge in the semantic graph carries four timestamps: t_created, t_expired (transaction time) and t_valid, t_invalid (event time). New episodes can invalidate existing edges by setting t_invalid; the old edge is not deleted.
Zep/Graphiti adopts a value-oriented treatment: facts accrue and are invalidated rather than mutated in place.
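A minimal sketch of the bi-temporal edge, assuming a flat list of edges rather than a graph database. The four timestamp field names come from the text; Edge, invalidate, and valid_at are hypothetical helpers, and this is not Graphiti's actual data model or API.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Edge:
    subject: str
    predicate: str
    obj: str
    t_created: datetime                   # transaction time: when it was recorded
    t_valid: datetime                     # event time: when the fact became true
    t_expired: Optional[datetime] = None  # transaction time: when it was superseded
    t_invalid: Optional[datetime] = None  # event time: when it stopped being true


def invalidate(edges: list[Edge], old: Edge, at: datetime) -> list[Edge]:
    """Return a new list in which `old` carries t_invalid; nothing is removed."""
    return [replace(e, t_invalid=at) if e == old else e for e in edges]


def valid_at(edges: list[Edge], when: datetime) -> list[Edge]:
    """Edges whose event-time interval [t_valid, t_invalid) contains `when`."""
    return [
        e for e in edges
        if e.t_valid <= when and (e.t_invalid is None or when < e.t_invalid)
    ]


def day(d: int) -> datetime:
    return datetime(2025, 1, d)


acme = Edge("alice", "works_at", "Acme", t_created=day(1), t_valid=day(1))
edges = invalidate([acme], acme, at=day(10))
edges.append(Edge("alice", "works_at", "Beta", t_created=day(10), t_valid=day(10)))
```

Because Edge is frozen and invalidate returns a new list, a query for day 5 still answers "Acme" after the job change has been recorded. That is what the value-oriented treatment buys: the past remains queryable.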
Cost: entity extraction and edge invalidation are LLM judgments. The graph is the product of an extractor that is itself fallible.
3.3. Mem0 – managed hybrid store with scopes
Scope is reified as a first-class axis (user / session / agent). The four block types from Cloudflare are collapsed into a single store, with the LLM extractor deciding what gets written. What is complected: the extractor's judgment about what is a fact is fused with the store's behavior.
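The fusion can be made concrete with a toy store. This is a hypothetical illustration, not Mem0's API: ScopedStore, naive_extract, and the (kind, scope_id) keying are assumptions, and the extractor is a trivial string rule standing in for an LLM call.

```python
from typing import Callable

Extractor = Callable[[str], list[str]]


def naive_extract(text: str) -> list[str]:
    """Stand-in for the LLM extractor: keep sentences containing ' is '."""
    return [s.strip() for s in text.split(".") if " is " in s]


class ScopedStore:
    def __init__(self, extract: Extractor) -> None:
        self._extract = extract                      # judgment about what is a fact
        self._rows: list[tuple[str, str, str]] = []  # (scope kind, scope id, fact)

    def add(self, text: str, kind: str, scope_id: str) -> list[str]:
        """The only write path: whatever the extractor drops is never stored."""
        facts = self._extract(text)
        self._rows.extend((kind, scope_id, fact) for fact in facts)
        return facts

    def get(self, kind: str, scope_id: str) -> list[str]:
        """Read back the facts recorded under one scope."""
        return [f for k, s, f in self._rows if (k, s) == (kind, scope_id)]


store = ScopedStore(naive_extract)
written = store.add("My name is Ada. Thanks!", kind="user", scope_id="u1")
```

Scope is an explicit argument on every call, which is the decomposed part. The extractor sits inside add() with no bypass, which is the complected part: a caller cannot store a fact the extractor does not recognize.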
3.4. A-MEM – Zettelkasten with mutating notes
Each interaction becomes an atomic note with bidirectional links generated at insertion time based on semantic similarity. The memory-evolution step is where the design becomes place-oriented: historical notes are mutated. From a value-oriented lens this is exactly the thing to avoid for any setting that needs an audit trail.
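The contrast between the two treatments fits in a short sketch. Everything here is illustrative and not taken from the A-MEM code: Note, insert, evolve_in_place, and evolve_as_value are hypothetical names, and token-overlap (Jaccard) similarity with an arbitrary threshold stands in for embedding similarity.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    id: int
    text: str
    links: set[int] = field(default_factory=set)


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a stand-in for embedding similarity."""
    x, y = set(a.lower().split()), set(b.lower().split())
    return len(x & y) / len(x | y) if x | y else 0.0


def insert(notes: dict[int, Note], text: str, threshold: float = 0.3) -> Note:
    """Create a note and link it bidirectionally to similar existing notes."""
    note = Note(len(notes), text)
    for other in notes.values():
        if jaccard(text, other.text) >= threshold:
            note.links.add(other.id)
            other.links.add(note.id)
    notes[note.id] = note
    return note


def evolve_in_place(note: Note, new_text: str) -> None:
    """Place-oriented: the historical text is overwritten and unrecoverable."""
    note.text = new_text


def evolve_as_value(history: list[Note], new_text: str) -> list[Note]:
    """Value-oriented alternative: append a new version, keep the old ones."""
    latest = history[-1]
    return history + [Note(latest.id, new_text, set(latest.links))]
```

After evolve_in_place there is no way to ask what the note said before; evolve_as_value answers that question by keeping every version, at the cost of storage and a "latest version" lookup on read.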
3.5. LangMem – semantic, episodic, procedural in LangGraph
The procedural-memory category is what nobody else factors out explicitly: "when summarizing email, the first sentence should name the action item" is a procedural rule, not a semantic fact. LangMem stores it as memory rather than as prompt text.
Cost: tight coupling to LangGraph state machines. The reported p95 search latency is high enough to make it impractical for interactive use.
4. CLI coding agents – convention as schema
The CLI-agent comparison (CLI Coding Agents Q2 — Memory and persistence) shows the same decomposition played out one level down, at the harness boundary, where the schema is filenames rather than provider methods:
| Sense | CLI-agent convention |
|---|---|
| Project-level instructions | CLAUDE.md / AGENTS.md / GEMINI.md |
| Per-category long-term memory | ~/.claude/.../memory/ (typed dirs) |
| Loadable reference material | .claude/skills/SKILL.md + AGENTS.md |
| Auto-extracted project context | Kiro: product.md tech.md structure.md |
The schema is convention, not type. Convention scales to
cross-harness portability — SKILL.md is now read by Claude Code,
Copilot CLI, and OpenCode — but it does not give the model an
in-prompt declaration of what the contract is for each file.
5. JITIR – a different cut
JITIR is on an axis the systems above mostly do not occupy. The storage shape is conventional; the contract is what differs.
5.1. Two contracts named
- Reactive contract: search(intent) -> candidates. An agent or human authors the intent. The substrate responds to queries.
- Proactive contract: observe(context) -> candidates pushed. The substrate authors the candidate set. The agent or human is the reader.
The two contracts are distinguished by who decides to look. Every system above adopts the reactive contract. The cost of the reactive contract is the failure mode it makes invisible: the agent did not search because it did not know there was anything to find.
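The two contracts can sit on the same store, which is the point: only the initiator differs. The sketch below is a minimal illustration, not the Remembrance Agent or any JITIR implementation. Store, Reactive, and Proactive are hypothetical names, and word overlap stands in for whatever relevance measure a real system would use.

```python
from typing import Callable


class Store:
    """Conventional storage shape shared by both contracts."""

    def __init__(self, notes: list[str]) -> None:
        self.notes = notes

    def match(self, text: str) -> list[str]:
        """Return notes sharing at least one word with the given text."""
        words = set(text.lower().split())
        return [n for n in self.notes if words & set(n.lower().split())]


class Reactive:
    """search(intent) -> candidates: the caller decides to look."""

    def __init__(self, store: Store) -> None:
        self.store = store

    def search(self, intent: str) -> list[str]:
        return self.store.match(intent)


class Proactive:
    """observe(context) -> candidates pushed: the substrate decides to look."""

    def __init__(self, store: Store, push: Callable[[list[str]], None]) -> None:
        self.store = store
        self.push = push

    def observe(self, context: str) -> None:
        candidates = self.store.match(context)
        if candidates:  # stay silent when nothing is relevant
            self.push(candidates)


store = Store(["rotate deploy key monthly", "lunch menu"])
inbox: list[list[str]] = []
watcher = Proactive(store, inbox.append)
watcher.observe("editing deploy script")  # nobody asked; a candidate arrives anyway
watcher.observe("reading email")          # nothing relevant, nothing pushed
```

Under the reactive contract the key-rotation note surfaces only if someone thinks to call search("deploy"). Under the proactive contract it surfaces because the agent's current context mentions deploy, whether or not the agent knew there was anything to find.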
Prior art: Rhodes, B. "The Wearable Remembrance Agent." Personal Technologies 1.4 (1997). MIT Media Lab.
6. The empty cell
Cloudflare's blocks are characterized by two binary axes: where the content lives in the prompt (always vs on-demand), and who writes it (agent vs code). A third row, substrate-initiated, is empty in every production memory framework:
| Where | Reads pushed to agent | Writes initiated by substrate |
|---|---|---|
| Substrate-initiated | JITIR-shape | (no production system) |
The bottom-right cell, where the substrate decides to write, has been
approached but is occupied by no production system. Kiro's auto-generated
product.md / tech.md / structure.md is the closest: the
harness observes the repository on first run and writes a derived
understanding. It is one-shot, not continuous.
7. What composes, what complects
| System | Decomposed | Complected |
|---|---|---|
| Cloudflare | all four (typed by block) | writes and reads at render time |
| MemGPT/Letta | log vs scratchpad vs archive | facts and reference in archival |
| Zep/Graphiti | identity vs state (bi-temporal) | semantic and episodic share graph |
| Mem0 | scope (user/session/agent) | extraction fused with storage |
| A-MEM | notes linked by similarity | identity mutates in place |
| LangMem | semantic vs episodic vs procedural | memory ops complected with graph |
| JITIR | trigger separated from store | consumer is Emacs; needs port |
The systems differ less in their storage shape than in which distinctions they hold.
8. Related
- CLI Coding Agents Q2 2026 — section 9 covers memory and persistence
- Cloudflare Agents Week 2026 — Sessions API, Durable Objects as agent substrate
- Agentic Systems 2026 — Seven Concerns framework (memory spans L2–L4)
- Agentic Systems Q4 2024 — MCE pattern, memory-centric architecture
- Qwen3.6 KV Cache Constraint — the binding constraint that makes memory retrieval expensive
9. Anchors
- Cloudflare Agents. Memory. https://developers.cloudflare.com/agents/concepts/memory/
- Packer, C. et al. MemGPT: Towards LLMs as Operating Systems. arXiv:2310.08560 (2023).
- Rasmussen, P. et al. Zep: A Temporal Knowledge Graph Architecture for Agent Memory. arXiv:2501.13956 (2025).
- Xu, W. et al. A-MEM: Agentic Memory for LLM Agents. arXiv:2502.12110 (2025).
- Rhodes, B. The Wearable Remembrance Agent. Personal Technologies 1.4 (1997).