Why your agent keeps forgetting
Working memory, persistent memory, and the gap that quietly breaks every long-running agent.
Two memories, not one
Every agent has two memory systems whether you designed for them or not.
The first is the context window — short-term, expensive, capped. Everything inside it is fast to recall. Once you exceed the cap, the oldest stuff gets evicted.
The second is whatever you wrote to disk, a database, or a file the agent re-reads on each turn. Long-term, durable, slow to surface.
The gap between them is where forgetting lives.
What goes wrong
When something mattered, the model wrote it down nowhere. It happened mid-conversation, the user moved on, and ten turns later the salient detail aged out of the context window. The model now confidently re-litigates a question that was already settled.
This is not a model failure. It is a system failure. The agent had no protocol for asking itself: "is this worth persisting?"
The fix is boring
Give the agent two tools: remember(key, value) and recall(query). Make the system prompt insist on calling remember whenever a user states a preference, a constraint, or a corrective instruction.
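A minimal sketch of the two tools, assuming a JSON file as the persistent store (the file name, helper, and substring search are illustrative choices, not a prescribed design):

```python
import json
import os

MEMORY_PATH = "memory.json"  # hypothetical on-disk store

def _load() -> dict:
    # Re-read the store on every call so the agent sees writes from prior turns.
    if os.path.exists(MEMORY_PATH):
        with open(MEMORY_PATH) as f:
            return json.load(f)
    return {}

def remember(key: str, value: str) -> None:
    """Persist a durable fact so it survives context-window eviction."""
    store = _load()
    store[key] = value
    with open(MEMORY_PATH, "w") as f:
        json.dump(store, f)

def recall(query: str) -> list[str]:
    """Naive case-insensitive substring search over persisted memories."""
    q = query.lower()
    return [f"{k}: {v}" for k, v in _load().items()
            if q in k.lower() or q in v.lower()]

remember("deploy_target", "user prefers deploying to staging first")
recall("staging")  # -> ["deploy_target: user prefers deploying to staging first"]
```

Both functions get exposed to the model as tool definitions; the substring `recall` is only a placeholder for whatever search layer you actually build.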
The first agent I built that did this was indistinguishable from a bad assistant. The second agent did it well. The difference was a four-line system prompt that said: if the user told you something durable, write it to memory before the next turn.
What I would do differently
I would build the recall side first, not the remember side. It is much easier to over-remember than to fix a recall layer that returns garbage. Without good search over the persisted store, every memory becomes noise.
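To make "good search over the persisted store" concrete, here is one hedged sketch of a recall layer that ranks instead of dumping everything — plain token overlap stands in for whatever scoring (embeddings, BM25) you would really use, and the function and parameter names are illustrative:

```python
def recall_ranked(query: str, memories: dict[str, str], top_k: int = 3) -> list[str]:
    """Rank memories by token overlap with the query; return at most top_k.

    Returning a short ranked list, rather than every match, is what keeps
    an over-stuffed memory store from becoming noise in the context window.
    """
    q_tokens = set(query.lower().split())
    scored = []
    for key, value in memories.items():
        m_tokens = set(f"{key} {value}".lower().split())
        overlap = len(q_tokens & m_tokens)
        if overlap:
            scored.append((overlap, f"{key}: {value}"))
    scored.sort(key=lambda pair: -pair[0])
    return [text for _, text in scored[:top_k]]

memories = {
    "timezone": "user works in UTC+2",
    "style": "user prefers concise answers",
}
recall_ranked("what timezone does the user work in", memories)
# -> ["timezone: user works in UTC+2", "style: user prefers concise answers"]
```

The design point survives any scoring swap: recall decides what reaches the context window, so its precision, not the size of the store, bounds how useful memory is.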
Memory is not a feature. It is the substrate. Everything else is the agent.