
How AI Agents Remember: Building Persistent Memory

Every AI agent wakes up with amnesia. No context from yesterday. No memory of what worked or failed. Unless you build a memory system.

This is the single biggest gap between “demo agent” and “production agent.” A demo responds to prompts. A production agent remembers.

LLMs have no persistent state. Each API call starts fresh. Your agent forgets:

  • What it learned about your preferences
  • Tasks it completed yesterday
  • Mistakes it already made (and will make again)
  • Context from ongoing projects

The naive solution — stuff everything into the system prompt — doesn’t scale. Context windows are finite and expensive. You need architecture.

After months of iteration with OpenClaw, we settled on a three-layer model that balances completeness with efficiency:

┌─────────────────────────────────┐
│ Layer 3: Curated Summaries │ ← In context window
│ MEMORY.md, entity summaries │
├─────────────────────────────────┤
│ Layer 2: Atomic Facts │ ← Queryable via search
│ entities/*/items.json │
├─────────────────────────────────┤
│ Layer 1: Raw Logs │ ← Complete record
│ memory/YYYY-MM-DD.md, JSONL │
└─────────────────────────────────┘

Layer 1 captures everything: every interaction gets logged. Daily markdown files (memory/2026-03-21.md) record decisions, context, and outcomes. Session JSONL files preserve the full conversation.

This is your ground truth. You never delete raw logs. They’re cheap to store and invaluable for debugging.

# 2026-03-21
## 14:30 - Deployed Agent Tree Hub
- Fixed Vercel auth issue (was preview, not production)
- Added 6 new content pages
- Total pages: 13
## 16:00 - Budget Review
- Sam approved new hosting budget ($20/mo)
- Moved from Vercel Pro to Cloudflare (free tier)
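Appending to the daily log can be a few lines of code. A minimal sketch (the `log_entry` helper, its signature, and the `memory/` layout are assumptions, not the article's actual implementation):

```python
from datetime import datetime, timezone
from pathlib import Path

def log_entry(title: str, notes: list[str], memory_dir: str = "memory") -> Path:
    """Append a timestamped entry to today's daily log (Layer 1)."""
    now = datetime.now(timezone.utc)
    path = Path(memory_dir) / f"{now:%Y-%m-%d}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    is_new = not path.exists()
    with path.open("a") as f:
        if is_new:
            f.write(f"# {now:%Y-%m-%d}\n")  # date header once per file
        f.write(f"## {now:%H:%M} - {title}\n")
        for note in notes:
            f.write(f"- {note}\n")
    return path
```

Append-only writes keep the ground-truth property: nothing in a daily file is ever rewritten in place.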

Raw logs are great for humans but terrible for retrieval. Layer 2 extracts structured, queryable facts:

{
  "id": "fact-2026-03-21-001",
  "subject": "agent-tree-hub",
  "predicate": "deployed_to",
  "object": "cloudflare-pages",
  "timestamp": "2026-03-21T14:30:00Z",
  "source": "memory/2026-03-21.md",
  "superseded": false
}

Key design decisions:

  • Subject-predicate-object triples for consistent structure
  • Source linking back to raw logs for verification
  • Supersession chains instead of deletion
  • Entity grouping — facts organized under people, companies, projects
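These decisions can be sketched together in one write path. The helper below is a hypothetical implementation (function name, ID scheme, and JSON layout are assumptions); it shows supersession chains in action: a new fact for the same subject and predicate marks the old one inactive rather than deleting it, preserving the audit trail back to the raw logs.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def add_fact(entity_dir: Path, subject: str, predicate: str, obj: str,
             source: str) -> dict:
    """Append a triple to an entity's items.json, superseding any active
    fact with the same subject and predicate instead of deleting it."""
    items_path = entity_dir / "items.json"
    facts = json.loads(items_path.read_text()) if items_path.exists() else []
    now = datetime.now(timezone.utc)
    # Supersession chain: mark older matching facts inactive, keep them.
    for fact in facts:
        if (fact["subject"] == subject and fact["predicate"] == predicate
                and not fact["superseded"]):
            fact["superseded"] = True
    new_fact = {
        "id": f"fact-{now:%Y-%m-%d}-{len(facts) + 1:03d}",
        "subject": subject,
        "predicate": predicate,
        "object": obj,
        "timestamp": now.isoformat(),
        "source": source,  # links back to the Layer 1 raw log
        "superseded": False,
    }
    facts.append(new_fact)
    entity_dir.mkdir(parents=True, exist_ok=True)
    items_path.write_text(json.dumps(facts, indent=2))
    return new_fact
```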

Layer 3 is what actually goes into the context window: a MEMORY.md file with 3-5 sentence summaries per topic, plus entity summaries that capture each entity's current state.

MEMORY.md
## Agent Tree Hub
Documentation site for AI agent frameworks. Deployed on Cloudflare Pages.
13 pages covering articles, tutorials, and frameworks. OG image configured
for social sharing.
## Family Budget
Monthly budget tracked in Google Sheets. Sam manages groceries (~$400/mo).
Student income + side projects. Target: $2,500/mo total spend.

The summaries are rewritten weekly from active Layer 2 facts. This keeps the context window small while maintaining accuracy.
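The input to that weekly rewrite is just the set of facts that are still active. A sketch of the selection step, assuming the `entities/*/items.json` layout from Layer 2 (the function name is hypothetical):

```python
import json
from pathlib import Path

def active_facts(entities_root: Path) -> dict[str, list[dict]]:
    """Collect non-superseded Layer 2 facts, grouped by entity.

    This grouped view is what a summarization pass would turn into
    the per-topic sections of MEMORY.md."""
    grouped: dict[str, list[dict]] = {}
    for items in entities_root.glob("*/items.json"):
        facts = json.loads(items.read_text())
        live = [f for f in facts if not f.get("superseded")]
        if live:
            grouped[items.parent.name] = live
    return grouped
```

Because superseded facts are filtered out rather than deleted, the summaries always reflect current state while the full history stays queryable.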

When your agent needs to remember something:

  1. Check summaries first — already in context, zero latency
  2. Semantic search — query Layer 2 facts for specific details
  3. Direct file read — if search points to a specific log, read it
  4. Transcript fallback — check session JSONL for recent conversations

This cascading approach keeps 90% of lookups fast (Layer 3) while maintaining access to the full history when needed.
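The cascade can be expressed as a single lookup function. This is a schematic sketch: the three callables stand in for real search and file-read machinery, and the substring check on summaries stands in for the model simply reading its own context window.

```python
from typing import Callable, Optional

def remember(query: str,
             summaries: str,
             search_facts: Callable[[str], Optional[str]],
             read_log: Callable[[str], str],
             search_transcripts: Callable[[str], Optional[str]]) -> Optional[str]:
    """Cascade through the memory layers, cheapest first."""
    # 1. Curated summaries (Layer 3) are already in context: zero latency.
    if query.lower() in summaries.lower():
        return summaries
    # 2. Semantic search over Layer 2 facts.
    hit = search_facts(query)
    if hit is not None:
        # 3. If the fact points at a raw log, read it for full context.
        return read_log(hit) if hit.startswith("memory/") else hit
    # 4. Fall back to recent session transcripts (JSONL).
    return search_transcripts(query)
```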

Writing everything down sounds obvious, but failing to do it is the #1 mistake in agent development. Developers assume the agent will “just remember” context from earlier in the conversation. It won’t, especially after context compaction or a session restart.

The rule is simple: text > brain. Every decision, every lesson, every preference gets written to the appropriate memory layer.

Memory operations aren’t free. Every read burns input tokens. Every write takes time. Match the operation to the model:

| Operation           | Model  | Why                                     |
| ------------------- | ------ | --------------------------------------- |
| Memory search       | Haiku  | Fast, cheap pattern matching            |
| Fact extraction     | Sonnet | Needs nuance to identify what matters   |
| Summary writing     | Sonnet | Quality matters for the context window  |
| Memory architecture | Opus   | Strategic decisions about what to track |
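In code, this routing can be a plain lookup table with a cheap default. The operation keys and model names below are illustrative placeholders, not real API identifiers:

```python
# Hypothetical routing table: match each memory operation to the
# cheapest model that handles it well.
MODEL_FOR_OPERATION = {
    "memory_search": "haiku",
    "fact_extraction": "sonnet",
    "summary_writing": "sonnet",
    "memory_architecture": "opus",
}

def pick_model(operation: str) -> str:
    # Unknown operations default to the cheapest model.
    return MODEL_FOR_OPERATION.get(operation, "haiku")
```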

With proper memory, your agent can:

  • Learn preferences over time without being retold
  • Track project state across days and weeks
  • Avoid repeating mistakes by referencing past failures
  • Build relationships — remembering details about people
  • Compound knowledge — each day builds on the last

The difference between a stateless chatbot and a persistent agent isn’t intelligence — it’s memory. Build the memory system first, then everything else becomes possible.


About the author: JD Davenport builds AI agent systems at OpenClaw. Follow on LinkedIn for updates on building AI agents for business.