How AI Agents Remember: Building Persistent Memory
Every AI agent wakes up with amnesia. No context from yesterday. No memory of what worked or failed. Unless you build a memory system.
This is the single biggest gap between a "demo agent" and a "production agent." A demo responds to prompts. A production agent remembers.
The Problem: Stateless by Default
LLMs have no persistent state. Each API call starts fresh. Your agent forgets:
- What it learned about your preferences
- Tasks it completed yesterday
- Mistakes it already made (and will make again)
- Context from ongoing projects
The naive solution, stuffing everything into the system prompt, doesn't scale. Context windows are finite and expensive. You need architecture.
The Three-Layer Memory Model
After months of iteration with OpenClaw, we settled on a three-layer model that balances completeness with efficiency:
    ┌───────────────────────────────┐
    │ Layer 3: Curated Summaries    │  ← In context window
    │ MEMORY.md, entity summaries   │
    ├───────────────────────────────┤
    │ Layer 2: Atomic Facts         │  ← Queryable via search
    │ entities/*/items.json         │
    ├───────────────────────────────┤
    │ Layer 1: Raw Logs             │  ← Complete record
    │ memory/YYYY-MM-DD.md, JSONL   │
    └───────────────────────────────┘

Layer 1: Raw Logs
Every interaction gets logged. Daily markdown files (memory/2026-03-21.md) capture decisions, context, and outcomes. Session JSONL files preserve the full conversation.
This is your ground truth. You never delete raw logs. They're cheap to store and invaluable for debugging.
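As a minimal sketch of the logging step, the function below appends one timestamped entry to the daily markdown file and never rewrites existing content. The name `append_log_entry` and its parameters are assumptions made for illustration, not part of OpenClaw; only the memory/YYYY-MM-DD.md naming comes from the text above.

```python
from datetime import datetime
from pathlib import Path


def append_log_entry(memory_dir: Path, when: datetime, title: str, notes: list[str]) -> Path:
    """Append one timestamped entry to the daily log and return the file path.

    Hypothetical helper: the entry layout is an assumption for illustration.
    """
    memory_dir.mkdir(parents=True, exist_ok=True)
    log_path = memory_dir / f"{when:%Y-%m-%d}.md"

    # A new daily file starts with a date heading; existing files are only appended to.
    is_new_file = not log_path.exists()
    with log_path.open("a", encoding="utf-8") as log_file:
        if is_new_file:
            log_file.write(f"# {when:%Y-%m-%d}\n")
        log_file.write(f"\n## {when:%H:%M} - {title}\n")
        for note in notes:
            log_file.write(f"- {note}\n")
    return log_path
```

Opening the file in append mode is the design choice that matters here: earlier entries cannot be overwritten by a later write.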
    # 2026-03-21

    ## 14:30 - Deployed Agent Tree Hub
    - Fixed Vercel auth issue (was preview, not production)
    - Added 6 new content pages
    - Total pages: 13

    ## 16:00 - Budget Review
    - Sam approved new hosting budget ($20/mo)
    - Moved from Vercel Pro to Cloudflare (free tier)

Layer 2: Atomic Facts
Raw logs are great for humans but terrible for retrieval. Layer 2 extracts structured, queryable facts:
    {
      "id": "fact-2026-03-21-001",
      "subject": "agent-tree-hub",
      "predicate": "deployed_to",
      "object": "cloudflare-pages",
      "timestamp": "2026-03-21T14:30:00Z",
      "source": "memory/2026-03-21.md",
      "superseded": false
    }

Key design decisions:
- Subject-predicate-object triples for consistent structure
- Source linking back to raw logs for verification
- Supersession chains instead of deletion
- Entity grouping: facts organized under people, companies, projects
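The supersession idea can be sketched in a few lines. The example below is an illustration under stated assumptions, not the actual OpenClaw code: the `Fact` class mirrors the JSON fields shown above, `superseded_by` is a hypothetical extra field that records the chain, and the sketch assumes at most one active fact per subject-predicate pair.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Fact:
    id: str
    subject: str
    predicate: str
    object: str
    timestamp: str
    source: str  # path of the raw log the fact was extracted from
    superseded: bool = False
    superseded_by: Optional[str] = None  # hypothetical field, not in the schema above


def add_fact(facts: list[Fact], new_fact: Fact) -> list[Fact]:
    """Return a new list in which older active facts for the same
    subject and predicate are marked superseded instead of being removed."""
    updated = []
    for fact in facts:
        same_slot = (fact.subject, fact.predicate) == (new_fact.subject, new_fact.predicate)
        if same_slot and not fact.superseded:
            fact = replace(fact, superseded=True, superseded_by=new_fact.id)
        updated.append(fact)
    updated.append(new_fact)
    return updated


def active_facts(facts: list[Fact], subject: str) -> list[Fact]:
    """Return only the facts about a subject that are still current."""
    return [f for f in facts if f.subject == subject and not f.superseded]
```

Because nothing is deleted, the old value stays available for auditing, and its `source` still points at the raw log that produced it.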
Layer 3: Curated Summaries
This is what actually goes into the context window. A MEMORY.md file with 3-5 sentence summaries per topic. Entity summaries that capture the current state.
    ## Agent Tree Hub
    Documentation site for AI agent frameworks. Deployed on Cloudflare Pages.
    13 pages covering articles, tutorials, and frameworks. OG image configured
    for social sharing.

    ## Family Budget
    Monthly budget tracked in Google Sheets. Sam manages groceries (~$400/mo).
    Student income + side projects. Target: $2,500/mo total spend.

The summaries are rewritten weekly from active Layer 2 facts. This keeps the context window small while maintaining accuracy.
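A rough sketch of that weekly rewrite, assuming facts are dictionaries shaped like the Layer 2 JSON: group the facts that have not been superseded by subject, then hand each group to a summarizer. Both `rebuild_memory` and the `summarize` callable are hypothetical names; in practice `summarize` would wrap a model call, which is left out here.

```python
from collections import defaultdict
from typing import Callable


def rebuild_memory(facts: list[dict], summarize: Callable[[str, list[dict]], str]) -> str:
    """Rebuild the MEMORY.md text from facts that have not been superseded.

    `summarize` is a hypothetical callable (subject, facts) -> summary text.
    """
    by_subject: dict[str, list[dict]] = defaultdict(list)
    for fact in facts:
        if not fact.get("superseded", False):
            by_subject[fact["subject"]].append(fact)

    # One section per subject, in a stable order so weekly diffs stay readable.
    sections = []
    for subject in sorted(by_subject):
        summary = summarize(subject, by_subject[subject])
        sections.append(f"## {subject}\n{summary}\n")
    return "\n".join(sections)
```

Regenerating the whole file from active facts, rather than patching old summaries, means a superseded fact drops out of the context window without any manual cleanup.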
Retrieval Strategy
When your agent needs to remember something:
- Check summaries first: already in context, zero latency
- Semantic search: query Layer 2 facts for specific details
- Direct file read: if search points to a specific log, read it
- Transcript fallback: check session JSONL for recent conversations
This cascading approach keeps 90% of lookups fast (Layer 3) while maintaining access to the full history when needed.
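The cascade can be sketched as an ordered list of lookups that stops at the first hit. Everything named here (`recall`, `keyword_lookup`, the `Lookup` type) is hypothetical, and the keyword match is only a stand-in for real semantic search so the example stays runnable.

```python
from typing import Callable, Optional

# Each layer is a function that takes a query and returns text, or None on a miss.
Lookup = Callable[[str], Optional[str]]


def recall(query: str, layers: list[tuple[str, Lookup]]) -> tuple[Optional[str], Optional[str]]:
    """Try each layer in order and return (layer_name, answer) for the first hit."""
    for name, lookup in layers:
        answer = lookup(query)
        if answer is not None:
            return name, answer
    return None, None


def keyword_lookup(documents: dict[str, str]) -> Lookup:
    """Build a lookup that returns the first document containing every query word.

    A placeholder for semantic search, not a replacement for it.
    """
    def lookup(query: str) -> Optional[str]:
        words = query.lower().split()
        if not words:
            return None
        for text in documents.values():
            if all(word in text.lower() for word in words):
                return text
        return None
    return lookup


# Cheapest layer first; the facts and transcript layers would slot in the same way.
summaries = {"agent-tree-hub": "Documentation site for AI agent frameworks. Deployed on Cloudflare Pages."}
raw_logs = {"memory/2026-03-21.md": "Sam approved new hosting budget ($20/mo)"}
layers = [("summaries", keyword_lookup(summaries)), ("raw logs", keyword_lookup(raw_logs))]
layer_name, answer = recall("hosting budget", layers)
```

Ordering the list from cheapest to most expensive layer is what makes the common case fast: later layers are only touched when earlier ones miss.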
The "Write It Down" Rule
This sounds obvious but it's the #1 mistake in agent development. Developers assume the agent will "just remember" context from earlier in the conversation. It won't, especially after context compaction or session restart.
The rule is simple: text > brain. Every decision, every lesson, every preference gets written to the appropriate memory layer.
Memory operations aren't free. Every read burns input tokens. Every write takes time. Match the operation to the model:
| Operation | Model | Why |
|---|---|---|
| Memory search | Haiku | Fast, cheap pattern matching |
| Fact extraction | Sonnet | Needs nuance to identify what matters |
| Summary writing | Sonnet | Quality matters for context window |
| Memory architecture | Opus | Strategic decisions about what to track |
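In code, the table above could live as a small routing map. This is a sketch only: the operation keys and lowercase tier names are placeholders chosen for illustration, not real API model identifiers, and `pick_model` is a hypothetical helper.

```python
# Placeholder tier names taken from the table; swap in real model identifiers.
MODEL_FOR_OPERATION = {
    "memory_search": "haiku",
    "fact_extraction": "sonnet",
    "summary_writing": "sonnet",
    "memory_architecture": "opus",
}


def pick_model(operation: str) -> str:
    """Return the model tier for a memory operation, failing loudly on unknown ones."""
    try:
        return MODEL_FOR_OPERATION[operation]
    except KeyError:
        raise ValueError(f"no model configured for operation: {operation!r}") from None
```

Raising on an unknown operation, instead of silently defaulting to the largest model, keeps an unplanned memory operation from quietly inflating costs.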
What This Enables
With proper memory, your agent can:
- Learn preferences over time without being retold
- Track project state across days and weeks
- Avoid repeating mistakes by referencing past failures
- Build relationships: remembering details about people
- Compound knowledge: each day builds on the last
The difference between a stateless chatbot and a persistent agent isn't intelligence; it's memory. Build the memory system first, then everything else becomes possible.
About the author: JD Davenport builds AI agent systems at OpenClaw. Follow on LinkedIn for updates on building AI agents for business.