Supabase as the Shared Backend

Tier 3 · Real Build · 8 min read

Supabase is the shared data layer for the entire agent system. Every agent that needs to persist results, every cockpit page that reads live data, and every cron job that reports its output writes to or reads from the same Supabase project. As of June 2026, the schema has 200+ tables.

That number is real and earned over two months of shipping. It also created problems that had to be fixed — mostly around performance, reliability, and the accuracy of what the data actually showed.

Why Supabase

The choice to use Supabase was pragmatic: it is a hosted Postgres instance with a REST API, real-time subscriptions, row-level security, and a generous free tier. The agent system needed a backend that:

Multiple agents could write to concurrently without file-locking problems
The cockpit (a Next.js app on Vercel) could query via REST
Could be set up without a separate server

Supabase checks all three boxes. The tradeoff is that you depend on an external service, and when your project on that service has problems (as ours did in May 2026), your database is unavailable.

The compute-IO incident

On May 26, 2026, the Supabase database went fully unresponsive: DB, REST, and auth all reported UNHEALTHY. REST calls timed out or returned 522 errors. The dashboard hung while loading.

The root cause (CHANGELOG 2026-05-26 10:34): the project was on Supabase’s default compute tier and had exhausted its disk IO budget. Supabase had sent a warning email on May 20 — six days earlier — that went unaddressed. When the IO budget ran out, the database was throttled into a failed-to-connect state.

This was initially misdiagnosed as a platform outage, even though Supabase’s status page showed “All Operational.” The correct diagnosis was resource exhaustion on our specific project, not a platform-wide issue.

The fix: upgrade compute to the Small tier (2 CPU / 2 GB RAM / 174 Mbps baseline disk throughput, ~$15/month) via the Supabase Management API. The database recovered approximately 3 minutes and 20 seconds after the compute resize.

One non-obvious detail: the Supabase Management API does not expose restart, pause, or resume endpoints for paid active projects — those routes return 404 or are blocked. The only lever for compute-based recovery is the PATCH /v1/projects/{ref}/billing/addons endpoint. This is not documented prominently, and it took trial and error to find it.
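As a rough illustration of that lever, the sketch below builds (but does not send) the PATCH request. The endpoint path comes from the text above; the request body fields (`addon_type`, `addon_variant`) and the `ci_small` variant name are assumptions about the Management API's payload shape and should be checked against the current API reference before use. The function name is hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.supabase.com"


def build_compute_resize_request(project_ref: str, access_token: str,
                                 variant: str = "ci_small") -> urllib.request.Request:
    """Build the PATCH request that changes a project's compute add-on.

    The body field names and the "ci_small" variant are ASSUMPTIONS,
    not confirmed by the article; verify them in the API reference.
    """
    body = {"addon_type": "compute_instance", "addon_variant": variant}
    return urllib.request.Request(
        url=f"{API_BASE}/v1/projects/{project_ref}/billing/addons",
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Sending is left to the caller, for example:
#   with urllib.request.urlopen(req, timeout=30) as resp:
#       print(resp.status)
```

Building the request separately from sending it makes the recovery step easy to dry-run, which matters when the only available lever also changes the bill.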

UPSERT stabilization

Early in the system’s development, several data pipelines used a delete-then-reinsert pattern for updating records. This created orphan-read windows: if a cockpit page queried while the delete had completed but the insert had not, it would show an empty table.

The fix (CHANGELOG 2026-05-31) was to migrate pipeline_deals, calendar_events, and tasks to UPSERT on a stable external_id. Records are created or updated in a single operation; there is never a window where the data is missing.
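A minimal sketch of the pattern, using Python's built-in `sqlite3` as a stand-in for Postgres (the `INSERT ... ON CONFLICT ... DO UPDATE` form is shared by both). The column names other than `external_id` are illustrative assumptions, not the real `tasks` schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks ("
    " external_id TEXT PRIMARY KEY,"  # stable key the upsert conflicts on
    " title TEXT NOT NULL,"
    " status TEXT NOT NULL)"
)

UPSERT_SQL = """
INSERT INTO tasks (external_id, title, status)
VALUES (:external_id, :title, :status)
ON CONFLICT (external_id) DO UPDATE SET
    title = excluded.title,
    status = excluded.status
"""


def upsert_tasks(conn: sqlite3.Connection, rows: list[dict]) -> None:
    """Create or update rows in one statement per row, inside one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT_SQL, rows)


# Two pipeline runs over the same entity: the second run updates in place,
# so a reader never observes an empty table between runs.
upsert_tasks(conn, [{"external_id": "t1", "title": "Renew domain", "status": "open"}])
upsert_tasks(conn, [{"external_id": "t1", "title": "Renew domain", "status": "done"}])
```

Unlike delete-then-reinsert, there is no intermediate state to read: each row goes straight from its old version to its new one.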

The external_id design is important: it must be deterministic, derived from the record’s content in a way that is stable across pipeline runs. For calendar events, the event’s calendar system ID works. For tasks, a hash of domain + task title works. For anything where the source data might change, you need a key that identifies the entity, not the specific record version.
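One way to derive such keys, as a sketch: the function names, the lowercase/strip normalization, the separator, and the `cal:` prefix are all assumptions made for illustration, not the pipeline's actual scheme.

```python
import hashlib


def task_external_id(domain: str, title: str) -> str:
    """Deterministic key for a task: a hash of domain + task title.

    Normalizing case and whitespace (an assumption) keeps the key stable
    when a pipeline run reformats the title slightly. The unit separator
    prevents ("ab", "c") and ("a", "bc") from hashing to the same input.
    """
    normalized = f"{domain.strip().lower()}\x1f{title.strip().lower()}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:32]


def calendar_external_id(event_id: str) -> str:
    """Calendar events already have a stable ID from the calendar system."""
    return f"cal:{event_id}"
```

Note the limitation the paragraph above points at: if a task is retitled, a title-based hash produces a new key and therefore a new row. Where titles can change, an upstream entity ID is the safer input.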

Daily backups

A daily Supabase backup runs at 3:30 AM (CHANGELOG 2026-04-16). When it was set up it covered all 64 tables that existed then (the schema now has 200+): approximately 2.5 MB raw, compressed to ~654 KB. Archives are stored at ~/clawd/_database_backups/YYYY-MM-DD.tar.gz and retained for 30 days.
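The retention rule can be sketched as a pure function over archive file names. This is an illustration under stated assumptions, not the actual backup script: the function name is hypothetical, and it assumes "30 days" means archives dated more than 30 days before today are deleted.

```python
from datetime import date, timedelta

RETENTION_DAYS = 30
SUFFIX = ".tar.gz"


def expired_backup_names(names: list[str], today: date,
                         retention_days: int = RETENTION_DAYS) -> list[str]:
    """Return archive names (YYYY-MM-DD.tar.gz) older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in sorted(names):
        if not name.endswith(SUFFIX):
            continue
        try:
            taken = date.fromisoformat(name[: -len(SUFFIX)])
        except ValueError:
            continue  # not a dated backup; leave it alone
        if taken < cutoff:
            expired.append(name)
    return expired


# Feeding it real files might look like:
#   from pathlib import Path
#   backup_dir = Path.home() / "clawd" / "_database_backups"
#   for name in expired_backup_names([p.name for p in backup_dir.iterdir()], date.today()):
#       (backup_dir / name).unlink()
```

Keeping the date logic separate from the file deletion means the pruning rule can be checked without touching the backup directory.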

The backup predates the compute-IO incident, but the incident showed why it matters: the database can go down. It also showed that the Management API cannot restore from backup without a support ticket. The backup is the recovery path if the database is corrupted, not just throttled.

Token-spend tracking

Until real tracking landed (CHANGELOG 2026-05-31), the cockpit’s cost HUD showed $0.00. That was a lie. The actual numbers: approximately $186/day, approximately $2k/week in token spend.

The token_spend table records spending per agent, per session, per day. The data comes from claude -p --output-format stream-json cost fields, aggregated and written by event_ingest.py on each run.
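A hedged sketch of the aggregation step, not the contents of event_ingest.py. It assumes each stream-json session ends with an event of `"type": "result"` that carries a `total_cost_usd` field; check that field name against the CLI version in use. The function names and the (agent, session, day) key shape are illustrative.

```python
import json
from collections import defaultdict


def session_cost_usd(stream_lines) -> float:
    """Return the cost reported by the last 'result' event, or 0.0 if none."""
    cost = 0.0
    for line in stream_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON noise in captured output
        if isinstance(event, dict) and event.get("type") == "result":
            # ASSUMPTION: the cost field is named total_cost_usd
            cost = float(event.get("total_cost_usd") or 0.0)
    return cost


def aggregate_spend(sessions) -> dict:
    """Sum cost per (agent, session_id, day).

    `sessions` is an iterable of (agent, session_id, day, stream_lines).
    """
    totals = defaultdict(float)
    for agent, session_id, day, lines in sessions:
        totals[(agent, session_id, day)] += session_cost_usd(lines)
    return dict(totals)
```

Each key in the result maps naturally onto one row of a spend table, which could then be written with the same UPSERT approach used elsewhere so that re-ingesting a session does not double-count it.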

Why does this matter? Because you cannot optimize what you cannot see. The $0.00 display was actively misleading — it trained the operator to not think about costs. Once the real numbers were visible, model routing decisions became sharper.

Nine orphan tables that had accumulated with no live data writer were also dropped in the same cleanup pass.

The migration workflow

Early in the build, Supabase DDL (schema migrations) was being punted to JD to run manually. This was slow and created bottlenecks. The fix: scripts/supabase-apply-migration.sh — agents can now apply migrations via the Supabase Management API themselves, without requiring JD to take a browser action.

bash ~/agent-system/scripts/supabase-apply-migration.sh \
  migrations/20260606_add_body_battery.sql

The script applies the migration and confirms the schema change. This is load-bearing for the ship-to-prod workflow: an agent that builds a feature and then has to wait for a human to apply the database migration before the feature works is not fully autonomous.
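The script's internals are not shown here, so the following is only a sketch of what the "apply" step could look like in Python. It assumes the Management API's SQL query endpoint (`POST /v1/projects/{ref}/database/query` with a `query` field); both the endpoint choice and the function name are assumptions, not a description of supabase-apply-migration.sh.

```python
import json
import urllib.request

API_BASE = "https://api.supabase.com"


def build_migration_request(project_ref: str, access_token: str,
                            sql: str) -> urllib.request.Request:
    """Build (but do not send) a request that runs migration SQL.

    ASSUMPTION: the endpoint path and the {"query": ...} body shape;
    confirm both against the Management API reference.
    """
    return urllib.request.Request(
        url=f"{API_BASE}/v1/projects/{project_ref}/database/query",
        data=json.dumps({"query": sql}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Typical use would read the migration file and send the request:
#   from pathlib import Path
#   sql = Path("migrations/20260606_add_body_battery.sql").read_text()
#   req = build_migration_request(ref, token, sql)
#   with urllib.request.urlopen(req, timeout=60) as resp:
#       print(resp.status)
```

Confirming the change could be a second query through the same endpoint, for example against `information_schema.columns`, before the agent reports the migration as applied.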

Key tables by category

The 200+ tables span every domain. A rough map:

| Category | Example tables |
| --- | --- |
| Agent ops | `agent_run_events`, `agent_runs`, `token_spend`, `scrape_cache` |
| Health | `health_metrics`, `weight_log`, `workout_log` |
| Communications | `chat_threads`, `chat_messages`, `chat_sessions`, `notification_field` |
| CRM | `crm_contacts` |
| Projects/tasks | `tasks`, `pipeline_deals`, `calendar_events` |
| Merch/commerce | `designs`, `catalog_items`, `stripe_events` |
| School | `canvas_submissions`, `grade_alerts` |
| Memory | episodic store, open loops |

The schema reflects the system’s scope: 8 domains, ~31 agent packages, and two months of continuous shipping. It is a real schema, not a demo.

Next: Group I covers building in public. OpenBudget: Open-Sourcing the Family Budget App is the first of three articles on what JD ships to the public and why.