Logging & Instrumentation

Tier 1 · Fundamentals · 7 min read

When you run one agent on one task, you know what happened — you watched it. When you run dozens of agents across dozens of projects, some on schedules you defined weeks ago, you need a different answer to “what just happened?” The answer is: look at the log files.

Instrumentation isn’t a nice-to-have at scale. It’s the mechanism by which you see the pattern in the noise, catch regressions, and know whether the thing you shipped last Tuesday is still running.


Every meaningful action an agent takes should produce a log entry. Not a system log (/var/log/) — a human-readable changelog entry, timestamped, in plain English.

The convention on this system is a single command:

bash ~/agent-system/scripts/clawd-log.sh <project-slug> "what shipped"

That command appends a timestamped bullet under today’s heading in ~/clawd/projects/<slug>/CHANGELOG.md:

## 2026-06-09
- 13:34 — Shipped fulfillment drainer. Three bugs found and fixed on first live order.
- 09:16 — Store went live for test. MERCH_LIVE_ORDERS=1 set in .env.
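The internals of clawd-log.sh are not shown here, so the following is only a minimal sketch of how such a helper could work. The function name `log_entry` is hypothetical, and the sketch assumes new bullets go at the end of the file under a heading for the current day; the real script may order entries differently.

```bash
# Hypothetical sketch, not the real clawd-log.sh.
# log_entry <changelog-file> <message>
# Adds a timestamped bullet for the current time, creating the file and
# today's "## YYYY-MM-DD" heading if they do not exist yet.
log_entry() {
  local file="$1" message="$2"
  local today now
  today="$(date +%Y-%m-%d)"
  now="$(date +%H:%M)"

  mkdir -p "$(dirname "$file")"
  touch "$file"

  # Write today's heading only once per day.
  if ! grep -qxF "## $today" "$file"; then
    # Separate days with a blank line when the file already has content.
    if [ -s "$file" ]; then
      printf '\n' >> "$file"
    fi
    printf '## %s\n' "$today" >> "$file"
  fi

  # Only ever append, so earlier entries are never rewritten.
  printf -- '- %s — %s\n' "$now" "$message" >> "$file"
}
```

Under these assumptions, `log_entry ~/clawd/projects/<slug>/CHANGELOG.md "what shipped"` would behave like the command above for a single project.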

Agents run this at the end of every material task. “Material” means anything that changes system state: a script written, a file edited, a migration run, a bug patched, a decision made. Pure research and reading don’t need a log entry.

The CHANGELOG.md is append-only. No line is ever edited after the fact. The log reflects what actually happened — including failures, rethinks, and fixes after the first attempt. If you shipped something broken and then fixed it, the log shows:

- 10:00 — Shipped order_webhook. Tested manually.
- 14:15 — Fixed order_webhook: was using billing address not shipping address. Printful was rejecting all orders.

Both entries stay. The system doesn’t rewrite history to look cleaner. This makes the log honest and makes patterns visible: if you see three “fixed X” entries for the same module, you know that module needs deeper attention.
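Spotting repeated fixes can be automated with a few lines of shell. This is a rough sketch, not part of the system described here: `count_fixes` is a hypothetical helper, and it assumes entries follow the "Fixed <module>: ..." wording from the example, with the module name written as a single token.

```bash
# Hypothetical helper: count "Fixed <module>" entries per module.
# count_fixes <changelog-file>
# Prints "<count> <module>" lines, most-fixed module first.
count_fixes() {
  # -o prints only the matched text, e.g. "Fixed order_webhook".
  grep -oE 'Fixed [A-Za-z0-9_-]+' "$1" \
    | awk '{ count[$2]++ } END { for (m in count) print count[m], m }' \
    | sort -rn
}
```

A module that shows up with a count of three or more is a candidate for the deeper attention mentioned above.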

Per-project changelogs are useful in isolation. But the most valuable view is the aggregate. Every night (and on demand), a rollup script combines every per-project CHANGELOG.md into one master file:

bash ~/agent-system/scripts/daily-rollup.sh --master

The master ~/clawd/CHANGELOG.md is reverse-chronological, grouped by project per day. As of mid-2026 it covers 69 active projects in 1,182 lines, spanning April to June 2026. When a new session starts, a 60-second read of the last few days tells you what shipped, what broke, and what is still in progress across every project at once.
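The rollup logic can be approximated in a short pipeline. The sketch below is an illustration under assumptions, not the actual daily-rollup.sh: the function name `rollup` and the "### <slug>" sub-heading used to group each project are both invented for this example, and it assumes bullets contain no tab characters.

```bash
# Hypothetical sketch of a rollup; not the real daily-rollup.sh.
# rollup <projects-dir>
# Prints a master changelog to stdout: newest day first, and under each day
# one "### <slug>" group per project that logged something that day.
rollup() {
  local root="$1" file slug
  for file in "$root"/*/CHANGELOG.md; do
    [ -f "$file" ] || continue
    slug="$(basename "$(dirname "$file")")"
    # Flatten each bullet to "day<TAB>slug<TAB>bullet" so it can be sorted.
    awk -v slug="$slug" '
      /^## [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ { day = $2; next }
      /^- / && day != "" { print day "\t" slug "\t" $0 }
    ' "$file"
  done \
    | sort -s -t $'\t' -k1,1r -k2,2 \
    | awk -F '\t' '
        # New day: blank line between days, then the day heading.
        $1 != day  { if (day != "") print ""; print "## " $1; day = $1; slug = "" }
        # New project within the day: emit its group heading.
        $2 != slug { print "### " $2; slug = $2 }
        { print $3 }
      '
}
```

The stable sort (`-s`) keeps each project's bullets in the order they were logged. Redirecting the output, for example `rollup ~/clawd/projects > ~/clawd/CHANGELOG.md`, would regenerate the master file from the per-project logs on every run.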

The failure mode without logging is invisible work. An agent runs for 3 hours, ships 5 changes across 3 files, and reports “done.” The next day, a different session starts. What was done? What files changed? Is there anything still open? Without a log entry, the next session has to reconstruct the state from the filesystem, which is slow, error-prone, and likely to miss something.

The CLAUDE.md rule on this system is direct: “If I wrote code or edited state, I log it. If I’m unsure which project slug to use, I use the closest match or second-brain-infra as the catch-all.” The point is that skipping the log makes the work invisible to the next session and invisible to JD.

For agents running on schedules (covered in Tier 2), every run should log to a dedicated log file per agent:

~/agent-system/logs/domains/<domain>.log
~/agent-system/logs/cron.log
~/agent-system/logs/mcp-health.log

Each run appends a line: timestamp, status, any errors or anomalies. A stale-cron alerter reads these files and pings Telegram if any critical job hasn’t logged a successful run within 2× its expected interval. That’s how you know a cron silently stopped running — not from observing the absence of its effects, but from the log going stale.
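The staleness rule is simple arithmetic, and a sketch makes the 2× threshold concrete. Everything named here is an assumption for illustration: the log line format `<epoch-seconds> <STATUS> <message>`, the `OK` status token, and the functions `log_run`, `last_success_epoch`, and `is_stale`. The Telegram notification itself is left out.

```bash
# Hypothetical sketch of run logging and a staleness check.
# Assumed log line format: "<epoch-seconds> <STATUS> <message>".

# log_run <log-file> <status> <message>
log_run() {
  printf '%s %s %s\n' "$(date +%s)" "$2" "$3" >> "$1"
}

# last_success_epoch <log-file>
# Prints the timestamp of the most recent OK line, or nothing if none exists.
last_success_epoch() {
  awk '$2 == "OK" { ts = $1 } END { if (ts != "") print ts }' "$1"
}

# is_stale <log-file> <expected-interval-seconds> [now-epoch]
# Exit status 0 means stale: the last OK run is older than twice the
# expected interval, or the log holds no OK run at all.
is_stale() {
  local file="$1" interval="$2" now="${3:-$(date +%s)}" last
  last="$(last_success_epoch "$file")"
  if [ -z "$last" ]; then
    return 0
  fi
  [ $(( now - last )) -gt $(( 2 * interval )) ]
}
```

With these helpers, an hourly job could be checked with something like `is_stale ~/agent-system/logs/cron.log 3600 && echo "cron looks stale"`, with the echo swapped for whatever alerting command the system uses.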

Logging feels like overhead until the moment you actually need it. That moment usually looks like: a scheduled job started failing two days ago; you want to know if it was gradual or sudden; you want to know what changed near that timestamp. With logs, you read the file and know in 30 seconds. Without logs, you search diffs, interview yourself about what you touched, and often never find the root cause.
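Answering "what changed near that timestamp" usually means pulling a few days out of a changelog. A minimal sketch, assuming the "## YYYY-MM-DD" day headings shown earlier; `entries_between` is a hypothetical helper, not an existing script on this system.

```bash
# Hypothetical helper: show changelog sections inside a date range.
# entries_between <changelog-file> <from-date> <to-date>
# Prints every day section whose heading falls in the inclusive range.
# ISO dates (YYYY-MM-DD) compare correctly as plain strings.
entries_between() {
  awk -v from="$2" -v to="$3" '
    /^## [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ {
      show = ($2 >= from && $2 <= to)
    }
    show
  ' "$1"
}
```

For a job that started failing two days ago, running it over the master changelog for the two or three days around the first failure narrows the search to the entries logged in that window.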

The discipline is: if you built it and it matters, instrument it. The 90 seconds to write a log entry is trivial compared to the time it saves when something goes wrong.


Next: Cost & Model Routing Fundamentals — local Ollama → cheap cloud → Opus; reading the cost picture; Max quota vs API dollars.