LLM agents have no memory between conversations. Each session starts with a blank context window. Within a session, context compaction discards detail to stay within token limits. For agents that serve a persistent role — an advisor, a project manager, a curator — this amnesia is a fundamental limitation.
The naive solution is to load all previous session transcripts into context. This fails for the same reason context compaction exists: too much material degrades output quality. The agent needs selective, structured access to its own history.
We implemented a three-layer memory system for the PCE's Consul (its persistent advisor agent), stored in the workspace filestore, where it persists across all conversations.
A single, curated document (~100 lines) containing open threads, recent decisions, and watch items. Read at the start of every session. Updated at the end of every session — resolved items removed, new items added. This is the agent's working memory: always current, always compact.
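The end-of-session update is essentially a prune-and-append pass over one small file. A minimal sketch, assuming a hypothetical one-item-per-bullet format and a hard line budget (the function name and file layout are illustrative, not the PCE's actual implementation):

```python
from pathlib import Path

def update_current_state(path: Path, resolved: set[str],
                         new_items: list[str], max_lines: int = 100) -> None:
    """End-of-session sweep: drop resolved items, append new ones.

    Assumes a hypothetical bullet-per-item layout; the real
    current-state.md format is not specified in this document.
    """
    lines = path.read_text().splitlines() if path.exists() else []
    # Remove any bullet that mentions a resolved item (lossy by design).
    kept = [ln for ln in lines
            if not any(tag in ln for tag in resolved)]
    kept += [f"- {item}" for item in new_items]
    if len(kept) > max_lines:
        raise RuntimeError("current-state.md over budget; curate before writing")
    path.write_text("\n".join(kept) + "\n")
```

The budget check is the point: the document stays compact because the write path refuses to let it grow unbounded.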
One file per session, written at session end. Contains what happened, decisions made, and open threads. Read selectively by scanning titles — never bulk-loaded. This is the audit trail: append-only, complete but not always relevant.
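Title-only scanning is what keeps retrieval cheap as the archive grows. A sketch, assuming one markdown file per session with the title on the first line (the naming scheme is an assumption):

```python
from pathlib import Path

def scan_summaries(summary_dir: Path, keyword: str) -> list[Path]:
    """Return session summaries whose title line mentions `keyword`.

    Only the first line of each file is read, so cost stays roughly
    flat in file count rather than total archive size. The one-file-
    per-session layout is assumed, not taken from the document.
    """
    matches = []
    for f in sorted(summary_dir.glob("*.md")):
        with f.open() as fh:
            title = fh.readline()  # title only; never bulk-load the body
        if keyword.lower() in title.lower():
            matches.append(f)
    return matches
```

The agent then loads only the matching summaries in full, preserving the "read selectively, never bulk-load" rule.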
The permanent institutional memory: observations, bug reports, feature requests, design notes, incident reports. These live in their canonical locations within the workspace, not in the agent's private memory space. Session summaries link to them but don't duplicate content.
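The link-don't-duplicate rule can be sketched as a two-step write: file the document in its canonical location, then append only a link to the session summary. The `observations/` path and helper name here are hypothetical:

```python
from pathlib import Path

def file_observation(workspace: Path, slug: str, body: str,
                     session_summary: Path) -> Path:
    """File an observation in a canonical workspace location and
    record a link (not the content) in the session summary.

    Paths and naming are illustrative assumptions, not the PCE's
    actual conventions.
    """
    doc = workspace / "observations" / f"{slug}.md"
    doc.parent.mkdir(parents=True, exist_ok=True)
    doc.write_text(body)
    with session_summary.open("a") as fh:
        fh.write(f"- see [{slug}]({doc.as_posix()})\n")  # link only
    return doc
```

Because the summary holds a pointer rather than a copy, the canonical document remains the single source of truth for the whole organisation.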
Current state is lossy by design. When an item is resolved, it's removed. The document stays small and relevant. If you need the full history, the session summaries preserve it.
Session summaries are append-only. Easy to write (the agent knows what happened), easy to maintain (never edited after creation). No curation burden.
Workspace documents are the real memory. The agent's private layers are navigation aids. The observations, decisions, and designs that emerge from sessions are filed in public locations where they benefit the whole organisation, not just the agent that wrote them.
Per-agent memory is structurally analogous to the "scratchpad" studied in alignment faking research (Greenblatt et al., 2024). A persistent private space where an agent accumulates state across interactions raises questions about compounding errors and divergence from reality.
Our mitigations:
This is a live experiment. Open questions:
The three-layer system described above is the Consul's private memory. But it sits within a broader hierarchy that spans the entire agent organisation:
The most ephemeral layer. What the agent can see right now, in this conversation. Lost entirely when the session ends or the context compacts. Every agent has this — it is the baseline.
Per-agent, cross-session memory. The agents/{role}/current-state.md document and session summaries. Initially implemented only for the Consul, but being extended to other agents:
The Curator, for example, now updates agents/consul/current-state.md, recording the last curation sweep and commit hash — the first time the current-state convention has been used by an agent other than the Consul. The natural next step is for the Curator to have its own agents/curator/current-state.md tracking what it has and hasn't curated. This layer is private to the agent but readable by the workspace owner (transparency by design).
The document store itself — observations, design notes, bug reports, plans, feature requests, indexes, practice notes. This is the memory that belongs to the organisation, not to any individual agent. All agents can read it (subject to visibility tiers). When an agent learns something important, it should be filed here, not kept in private notes.
The raw record of what actually happened — every tool call, every agent response, every task artifact. Stored in transcripts/tasks/ and tasks/. Not read routinely, but available for forensic review, debugging, and research. This is the organisation's complete memory — the equivalent of meeting minutes, email archives, and audit logs.
The hierarchy flows from ephemeral to permanent, from private to shared. Knowledge should migrate upward: something discovered in a context window (Level 0) should be captured in agent notes (Level 1) if it's useful for that agent across sessions, and filed in the workspace (Level 2) if it's useful for the organisation. The transcripts (Level 3) are the safety net — everything is recorded even if nobody thought to file it properly.
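The upward-migration rule can be encoded as a small filing decision. A sketch: the level names come from the text above, but the decision logic is an assumption about how an agent might choose where knowledge belongs:

```python
from enum import IntEnum

class MemoryLevel(IntEnum):
    """The four-level hierarchy described above."""
    CONTEXT_WINDOW = 0  # ephemeral; lost at session end
    AGENT_NOTES = 1     # per-agent, cross-session
    WORKSPACE = 2       # shared organisational documents
    TRANSCRIPTS = 3     # complete raw record; the safety net

def promote(level: MemoryLevel, useful_across_sessions: bool,
            useful_to_organisation: bool) -> MemoryLevel:
    """Where should a piece of knowledge be filed?

    A simplified encoding of 'knowledge migrates upward':
    org-useful facts go to the workspace, agent-useful facts to
    agent notes, everything else stays where it is. Transcripts
    need no filing decision — they capture everything regardless.
    """
    if useful_to_organisation:
        return MemoryLevel.WORKSPACE
    if useful_across_sessions:
        return max(level, MemoryLevel.AGENT_NOTES)
    return level
```

Note that nothing ever promotes *to* the transcripts: they sit outside the filing decision precisely because they record everything whether or not anyone files it.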
This is organisational knowledge management applied to AI agents. The same principles apply to human organisations: personal notes → team wikis → institutional documentation → archives. The difference is that agent memory is perfectly inspectable, which creates opportunities for oversight that human organisations don't have.