Tracker: CEO Accountability

Tier 3 · Everything Built

The CHANGELOG tells you what shipped. Tracker tells you what was promised. The gap between those two things is accountability.

The Problem It Solves

In a high-volume system with many agents and a single operator, commitments accumulate fast. A Telegram message says “I’ll have that draft to you by Friday.” An email says “the migration will run tonight.” A build agent reports “branch ready for review” and then the session rotates and nobody merges it.

The CHANGELOG doesn’t track these. It records what agents logged. But an agent that promised something in a message and then didn’t deliver doesn’t self-report the gap. Tracker is the agent that reads the channels and surfaces the gaps.

What Tracker Does

agents/tracker/ is a Haiku-tier specialist (Tier 3, reports to the CEO) that does three things:

Scan. Reads Telegram messages and Gmail outbound emails from the last 7–14 days (configurable). For each message, runs a regex pre-filter that catches commitment-shaped language: “I’ll”, “will have”, “by [date]”, “sending you”, “branch ready”, “deploying tonight.” Messages that don’t match the pre-filter are dropped without an LLM call.
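A minimal sketch of what such a pre-filter could look like. The pattern list is hypothetical, built only from the phrases quoted above; the real patterns in agents/tracker/ are not shown here and may differ. The function names `looks_like_commitment` and `prefilter` are also illustrative.

```python
import re

# Hypothetical pattern list derived from the phrases quoted in the text;
# the actual pre-filter may use different or additional patterns.
COMMITMENT_PATTERNS = [
    r"\bI['’]ll\b",
    r"\bwill have\b",
    r"\bby (?:mon|tue|wed|thu|fri|sat|sun)[a-z]*\b",
    r"\bby (?:tomorrow|tonight|end of (?:day|week))\b",
    r"\bsending you\b",
    r"\bbranch ready\b",
    r"\bdeploying tonight\b",
]
COMMITMENT_RE = re.compile("|".join(COMMITMENT_PATTERNS), re.IGNORECASE)


def looks_like_commitment(text: str) -> bool:
    """Cheap check that runs before any LLM call."""
    return COMMITMENT_RE.search(text) is not None


def prefilter(messages: list[str]) -> list[str]:
    """Keep only the messages worth sending to the classifier."""
    return [m for m in messages if looks_like_commitment(m)]
```

The regex is deliberately loose: false positives only cost one cheap classification, while a false negative means a commitment is never seen at all.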

Classify. For messages that pass the pre-filter, calls Haiku to extract the commitment:

counterparty — who the commitment was made to
due_at — when (date-parsed, or null if no deadline was stated)
source_quote — the verbatim text
confidence — 0.0–1.0. Items below 0.5 are dropped.
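The four fields above map naturally onto a small record type. The sketch below assumes the classifier returns a JSON-like dict with those keys and an ISO date string for due_at; that payload shape, the `Commitment` class, and `parse_classification` are illustrative assumptions, not the module's actual code.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

CONFIDENCE_FLOOR = 0.5  # items below this are dropped, per the text


@dataclass(frozen=True)
class Commitment:
    counterparty: str        # who the commitment was made to
    due_at: Optional[date]   # None when no deadline was stated
    source_quote: str        # verbatim text from the message
    confidence: float        # 0.0-1.0, as reported by the classifier


def parse_classification(raw: dict) -> Optional[Commitment]:
    """Turn an assumed classifier payload into a Commitment, or None if rejected."""
    confidence = float(raw.get("confidence", 0.0))
    if confidence < CONFIDENCE_FLOOR:
        return None
    due_raw = raw.get("due_at")
    return Commitment(
        counterparty=raw["counterparty"],
        due_at=date.fromisoformat(due_raw) if due_raw else None,
        source_quote=raw["source_quote"],
        confidence=confidence,
    )
```

Keeping the floor as a single named constant makes it easy to tune without touching the parsing logic.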

UPSERT and ping. Writes accepted commitments to Supabase, then pings Telegram with new items only (commitments that were not already in the ledger from an earlier scan). Items already in the ledger don’t re-ping, so repeated scans are idempotent. If a counterparty matches a CRM entry, the commitment is cross-linked to the CRM record.
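The idempotence property is easiest to see with a stable key per commitment. The sketch below uses an in-memory dict as a stand-in for the Supabase table; the key scheme (a hash of counterparty plus normalized quote) and both function names are assumptions made for illustration, not the real schema.

```python
import hashlib


def commitment_key(counterparty: str, source_quote: str) -> str:
    """Stable key so the same commitment maps to the same ledger row on every scan."""
    quote = " ".join(source_quote.split()).lower()
    normalized = f"{counterparty.strip().lower()}|{quote}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def upsert_and_collect_new(ledger: dict[str, dict], scanned: list[dict]) -> list[dict]:
    """Upsert every scanned item; return only those not already in the ledger."""
    new_items = []
    for item in scanned:
        key = commitment_key(item["counterparty"], item["source_quote"])
        if key not in ledger:
            new_items.append(item)  # only these would trigger a ping
        # Insert or update in place; an existing row is never duplicated.
        ledger[key] = {**ledger.get(key, {}), **item}
    return new_items
```

Running the same scan twice returns the new items the first time and an empty list the second, which is the behavior described as "don't re-ping".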

The Two Sources

python3.12 -m agents.tracker --source telegram --lookback-days 7
python3.12 -m agents.tracker --source gmail --lookback-days 14
python3.12 -m agents.tracker --source all

Telegram — inbound router logs. The tracker reads the agent’s own outbound messages (what it said to JD) as well as JD’s statements (what JD committed to others via the agent).

Gmail outbound — GmailOutboundExtractor reads the Sent folder via the gog CLI. Every email the agent sent as JD is a potential commitment. “I’ll follow up next week” in a sent email is a tracked commitment.

The lookback defaults differ because email commitments tend to have longer horizons (14 days vs 7 for chat).
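One way to express per-source defaults is to leave the flag unset and resolve it per source afterwards. The sketch below uses only the flags shown in the commands above; treating "all" as the default for --source, and the helper name `resolve_lookbacks`, are assumptions.

```python
import argparse
from typing import Optional

# Defaults taken from the text: 7 days for chat, 14 for email.
DEFAULT_LOOKBACK_DAYS = {"telegram": 7, "gmail": 14}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="agents.tracker")
    # Assumption: "all" is the default when --source is omitted.
    parser.add_argument("--source", choices=["telegram", "gmail", "all"], default="all")
    parser.add_argument("--lookback-days", type=int, default=None)
    return parser


def resolve_lookbacks(source: str, lookback_days: Optional[int]) -> dict[str, int]:
    """Map each selected source to its window; an explicit flag overrides the default."""
    sources = list(DEFAULT_LOOKBACK_DAYS) if source == "all" else [source]
    return {
        s: lookback_days if lookback_days is not None else DEFAULT_LOOKBACK_DAYS[s]
        for s in sources
    }
```

With this shape, `--source all` with no explicit window scans Telegram over 7 days and Gmail over 14, while an explicit `--lookback-days` applies to every selected source.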

CRM Linking

When a commitment has a recognizable counterparty — a name that matches a CRM slug in ~/clawd/memory/entities/people/ — Tracker links them:

counterparty: "Felix Vivanco" [crm:felix-vivanco]
due: 2026-06-09
quote: "I'll have the Slopes roadmap to you Tuesday at 9am"

This is how commitments become part of the relationship record. The CRM entry for Felix shows not just contact info and interaction history but also what the system committed to him and whether it delivered. The crm_linker.py module handles the matching: it fuzzy-matches the counterparty name against CRM slugs and display names.
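The internals of crm_linker.py are not shown, so the following is only a sketch of one plausible approach, using the standard library's difflib. The `crm` mapping (slug to display name), the 0.8 cutoff, and both function names are assumptions.

```python
import difflib
import re
from typing import Optional


def slugify(name: str) -> str:
    """Lowercase the name and join its alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def link_counterparty(name: str, crm: dict[str, str], cutoff: float = 0.8) -> Optional[str]:
    """Return the best-matching CRM slug, or None if nothing is close enough.

    `crm` maps slug -> display name. The 0.8 cutoff is an assumed value.
    """
    candidate = slugify(name)
    if candidate in crm:
        return candidate  # exact slug hit, no fuzzy matching needed
    # Compare against both slugs and slugified display names.
    lookup = {slug: slug for slug in crm}
    lookup.update({slugify(display): slug for slug, display in crm.items()})
    matches = difflib.get_close_matches(candidate, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None
```

A high cutoff is the safer default here: a missed link leaves a commitment unlinked, but a wrong link attaches a promise to the wrong person's record.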

The `--dry-run` Mode

python3.12 -m agents.tracker --source all --dry-run

Dry-run prints what would be extracted — no Supabase writes, no Telegram pings. Useful for tuning the confidence floor or the lookback window without affecting the live ledger. The scan output shows:

=== Telegram ===
  Messages scanned (lookback window): 847
  Pre-filtered out (regex):           631
  Sent to Haiku for classification:   216
  Accepted as commitments (>= conf):  14

The pre-filter is aggressive by design. 631 of 847 messages got dropped before an LLM call. At Haiku pricing, 216 classifications cost less than a cent. The pre-filter keeps the cost negligible even at high message volumes.
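The funnel in that output can be checked with a few lines of arithmetic. The `ScanStats` class is a hypothetical container for the numbers shown in the sample scan, not part of the tracker's code.

```python
from dataclasses import dataclass


@dataclass
class ScanStats:
    scanned: int
    sent_to_llm: int
    accepted: int

    @property
    def prefiltered_out(self) -> int:
        return self.scanned - self.sent_to_llm

    @property
    def drop_rate(self) -> float:
        """Share of messages rejected by the regex before any LLM call."""
        return self.prefiltered_out / self.scanned if self.scanned else 0.0

    @property
    def accept_rate(self) -> float:
        """Share of classified messages that cleared the confidence floor."""
        return self.accepted / self.sent_to_llm if self.sent_to_llm else 0.0


# Numbers from the sample Telegram scan above.
stats = ScanStats(scanned=847, sent_to_llm=216, accepted=14)
print(f"dropped by regex: {stats.prefiltered_out} ({stats.drop_rate:.1%})")
print(f"accepted of classified: {stats.accepted} ({stats.accept_rate:.1%})")
```

For the sample run this works out to roughly 74.5% of messages dropped before classification and about 6.5% of classified messages accepted, so the regex removes most of the volume and the confidence floor does the fine filtering.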

What Tracker Doesn’t Do

Tracker does not:

Mark commitments as fulfilled (that’s the CHANGELOG’s job — when a log entry mentions the same counterparty and subject, the CEO can reconcile manually or via the weekly reaper).
Invent commitments that weren’t in the messages.
Track internal commitments between agents (only external, counterparty-facing commitments).

The --list command shows open commitments:

python3.12 -m agents.tracker --list
python3.12 -m agents.tracker --list --counterparty "Felix Vivanco"

Open means: in the ledger and not yet closed, whether the deadline is still in the future or has already passed. An overdue commitment stays open until someone closes it.
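That definition can be stated as a predicate that ignores the deadline entirely. The sketch below assumes a ledger row with a boolean `closed` field; the real schema is not shown, so `LedgerRow` and the helper names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class LedgerRow:
    counterparty: str
    due_at: Optional[date]
    closed: bool = False  # hypothetical field; the real ledger schema is not shown


def is_open(row: LedgerRow) -> bool:
    """Open depends only on closure, never on the deadline."""
    return not row.closed


def is_overdue(row: LedgerRow, today: date) -> bool:
    """An open item whose stated deadline has already passed."""
    return is_open(row) and row.due_at is not None and row.due_at < today


def list_open(rows: list[LedgerRow], counterparty: Optional[str] = None) -> list[LedgerRow]:
    """Rough equivalent of --list, with the optional --counterparty filter."""
    return [
        r for r in rows
        if is_open(r) and (counterparty is None or r.counterparty == counterparty)
    ]
```

Separating `is_open` from `is_overdue` keeps overdue items in the list, which is the point: a missed deadline is exactly the case that should stay visible.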

Accountability as Infrastructure

The deepest reason for Tracker is that accountability shouldn’t depend on memory. An agent system that makes commitments and never audits whether they were kept will erode the trust of every counterparty it deals with. JD’s name goes on these commitments.

Tracker is the infrastructure that makes “we said we’d do X” visible without requiring anyone to remember it. The weekly reaper (part of the open-loops architecture) pulls Tracker’s open items into the open-loops list. The CEO sees them at the start of each work-queue cycle. Commitments don’t silently expire — they appear in the system’s daily view until they’re closed.

Before this: The Architect Agent — the PRD gate and intent-verify layer for the Nerve Center.

The Group B articles cover the full orchestration layer. From here, Group C covers the memory layer; Group D covers the web cockpit; Groups E–I cover the agents and products built on top.