OpenClaw: The First Harness


Before the current system — before the 8 domain agents, the 155–160 cron jobs, the Telegram control plane — there was OpenClaw. This article is the honest autopsy: what it was, what held up, and what forced the pivot.

What OpenClaw Was

OpenClaw was a multi-agent orchestration platform written in TypeScript + Python, roughly 3,500 lines of code. Its spec was documented on 2026-04-05 as a “recreate these patterns in the agent-system” guide — meaning by the time the spec was written, OpenClaw was already being treated as the past system whose good ideas needed carrying forward.

The architecture was hub-and-spoke: one central gateway received all requests, classified them by type and complexity, then routed them to specialized agents. Three components were fully implemented:

Research Agent. An iterative knowledge-gap loop built around five subagents:

A gap agent identified what the current knowledge base didn’t cover — or declared the topic complete.
A tool-selector picked the right sources for each gap (Brave, GitHub, Reddit, arXiv, HN, Firecrawl).
Those tools ran in parallel.
An observations agent synthesized the results and updated accumulated knowledge.
A devil’s advocate challenged the findings before the writer agent produced the final markdown report.

Depth was configurable. Shallow capped at 2 iterations, 5 parallel queries, and 2 minutes; standard ran 5 iterations, 10 queries, and 15 minutes; deep went to 10 iterations, 15 queries, and 30 minutes. Reports were saved to Obsidian with YAML frontmatter for full-text search. The research loop worked well and became the direct ancestor of today’s agents/researcher.
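To make the loop and its depth limits concrete, here is a minimal Python sketch. It is not OpenClaw's actual code: the names (Depth, DEPTHS, research) and the callable-based interface for the five subagents are assumptions made for illustration. Only the iteration, query, and time limits come from the description above.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
import time


@dataclass(frozen=True)
class Depth:
    max_iterations: int
    max_parallel_queries: int
    time_budget_s: int


# Limits as stated in the text; the structure and names are hypothetical.
DEPTHS = {
    "shallow": Depth(2, 5, 2 * 60),
    "standard": Depth(5, 10, 15 * 60),
    "deep": Depth(10, 15, 30 * 60),
}


def research(topic, depth, find_gaps, select_tools, observe, challenge, write):
    """Run the gap loop until no gaps remain or a depth limit is reached."""
    limits = DEPTHS[depth]
    deadline = time.monotonic() + limits.time_budget_s
    knowledge = []
    for _ in range(limits.max_iterations):
        if time.monotonic() >= deadline:
            break
        gaps = find_gaps(topic, knowledge)
        if not gaps:  # the gap agent declared the topic complete
            break
        # Each query is a zero-argument callable; cap how many run per round.
        queries = select_tools(gaps)[: limits.max_parallel_queries]
        with ThreadPoolExecutor(max_workers=len(queries) or 1) as pool:
            results = list(pool.map(lambda q: q(), queries))
        knowledge = observe(knowledge, results)
    # The devil's advocate reviews the findings before the writer runs.
    return write(topic, challenge(knowledge))
```

Passing the subagents in as plain callables keeps the control flow visible: the loop itself only enforces the limits and the ordering, and each agent can be swapped or stubbed independently.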

Orchestrator Agent. Classify → route → monitor → aggregate. The orchestrator analyzed task complexity, selected a model, and assembled results. It also owned the cost-first LLM routing strategy — local Ollama first, cheap cloud second, Opus only for reasoning-heavy tasks. The claimed cost savings: 75–80% reduction, to roughly $20–35/month. That number is plausible given the routing strategy but not independently verified in the changelog.

Obsidian Memory. Markdown files with YAML frontmatter as the persistence layer. Agents read and wrote to the vault; full-text search + tag filtering made retrieval tractable. The core insight — that persistent state should live in files, not conversation memory — survived the pivot intact and became one of the system’s foundational rules.
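The file-first idea can be sketched in a few lines of Python. This is an illustration, not the OpenClaw implementation: the function names are hypothetical, and the header handling covers only a single tags line rather than full YAML (a real vault would call for a proper YAML parser).

```python
from pathlib import Path
from typing import Optional


def write_note(vault: Path, name: str, tags: list, body: str) -> Path:
    """Write a markdown note with a small frontmatter header."""
    header = "---\ntags: [" + ", ".join(tags) + "]\n---\n"
    path = vault / f"{name}.md"
    path.write_text(header + body, encoding="utf-8")
    return path


def read_note(path: Path) -> tuple:
    """Return (tags, body); handles only the header shape written above."""
    text = path.read_text(encoding="utf-8")
    if not text.startswith("---\n"):
        return [], text
    header, _, body = text[4:].partition("\n---\n")
    tags = []
    for line in header.splitlines():
        if line.startswith("tags:"):
            inner = line.split(":", 1)[1].strip().strip("[]")
            tags = [t.strip() for t in inner.split(",") if t.strip()]
    return tags, body


def search(vault: Path, term: str, tag: Optional[str] = None) -> list:
    """Case-insensitive full-text search over bodies, optionally filtered by tag."""
    hits = []
    for path in sorted(vault.glob("*.md")):
        tags, body = read_note(path)
        if tag is not None and tag not in tags:
            continue
        if term.lower() in body.lower():
            hits.append(path.stem)
    return hits
```

Because every note is a plain file, any agent (or a person with a text editor) can read and write the same state, and nothing is lost when a session ends.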

The LLM Routing Strategy

OpenClaw’s cost-first router is worth examining because the logic carried forward. The priority order:

1. Local (free)       MacBook Pro: llama3.1:8b, qwen2.5:32b
                      Mac Mini: fallback
2. Cloud (paid)       Simple tasks: cheap fast model
                      Code tasks: DeepSeek-class
                      Reasoning: mid-tier cloud
                      Complex: Claude Opus

The selector checked local availability first, estimated task complexity, and only escalated to cloud if local couldn’t handle it. Fallback chains handled rate limits and failures. This exact structure — local → cheap cloud → Opus — lives today in agents/shared/ollama_client.py and the current model-routing documentation.
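A rough Python sketch of that selection logic follows. It is written under several assumptions and is not the code in ollama_client.py or the original TypeScript router: the host labels, the cloud tier labels, the mapping of task kinds to local models, and the final Opus fallback are all placeholders chosen for illustration. Complexity estimation is left out; the caller passes the task kind directly.

```python
# Cloud tier per task kind, mirroring the priority list; labels are placeholders.
CLOUD_TIER = {
    "simple": "cheap-fast-model",
    "code": "deepseek-class",
    "reasoning": "mid-tier-cloud",
    "complex": "claude-opus",
}

# Assumption: which task kinds a local model can serve is not stated in the source.
LOCAL_MODELS = {"simple": "llama3.1:8b", "code": "qwen2.5:32b"}
LOCAL_HOSTS = ("macbook-pro", "mac-mini")  # hypothetical labels; second is the fallback


def build_chain(kind, local_up):
    """Return (host, model) candidates in priority order: local first, then cloud."""
    chain = []
    model = LOCAL_MODELS.get(kind)
    if model:
        for host in LOCAL_HOSTS:
            if local_up(host):
                chain.append((host, model))
    chain.append(("cloud", CLOUD_TIER[kind]))
    if kind != "complex":
        # Assumed last-resort escalation to the top tier.
        chain.append(("cloud", CLOUD_TIER["complex"]))
    return chain


def run(prompt, kind, local_up, call):
    """Try each candidate in order, falling through on any failure."""
    errors = []
    for host, model in build_chain(kind, local_up):
        try:
            return call(host, model, prompt)
        except Exception as exc:  # rate limits, timeouts, host down
            errors.append(f"{host}/{model}: {exc}")
    raise RuntimeError("all routes failed: " + "; ".join(errors))
```

Building the whole chain up front, rather than deciding one hop at a time, makes the escalation order easy to log and test without calling any model.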

What Didn’t Work: The Telegram Bridge

The spec’s own verdict on the Telegram integration: “communication channel integration” was the primary unsolved problem. Permission prompts kept surfacing mid-session. OAuth login challenges interrupted the bot. Sessions were unstable.

The spec floated pivoting to webhooks or Discord. Neither happened. Instead, when the new system was built on Claude Code, the Telegram integration was rebuilt from scratch using a different architecture — a claude process in tmux under launchd, with an inbound router script handling message dispatch. The old OpenClaw bridge was not ported; only the patterns were kept.

What Survived the Pivot

Not everything in OpenClaw was discarded. Three ideas carried through directly:

The research loop

The knowledge-gap iteration pattern became agents/researcher. The same concept: identify gaps, pick tools, run in parallel, challenge findings, write a report. The backends changed (X/Reddit OAuth added 2026-06-08; PullPush killed); the loop architecture didn’t.

Files as memory

Obsidian as a persistence layer gave way to a broader doctrine: state lives in files, not chat. Per-project WORKPLAN/CHANGELOG/LINKS, domain state yamls, the master CHANGELOG — all descendants of OpenClaw’s file-first memory.

Local-first cost routing

The Ollama-first priority order, the cheap-before-Opus escalation chain, the fallback logic — all carried forward. Today’s ollama_client.py routes MBP → Mac Mini → cloud with the same priority philosophy OpenClaw’s TypeScript router used.

The Honest Read

OpenClaw was a real system that did real work. The research loop was genuinely useful; the cost routing was clever. But it was also clearly a first-generation build: ~3,500 lines of custom runtime code that had to be maintained and extended every time a new capability was required. The spec itself acknowledged the bridge problem and proposed a wholesale pivot rather than iterating on the existing architecture.

That pivot — to a harness built on top of Claude Code rather than a bespoke runtime — is what the next article covers.

Next: The Pivot to a Claude Code Custom Harness — why OpenClaw’s custom runtime gave way to wrapping Claude Code instead, and what “custom harness” means in concrete terms.