Smart Model Routing for AI Agents

When I first started building OpenClaw, every task went to Claude Opus. Research? Opus. Writing a commit message? Opus. Checking if a file exists? Opus.

My API bill was astronomical. And most of those tokens were wasted on tasks that a smaller model could handle just as well.

I was watching my agent logs one evening and noticed something: 90% of the tasks my orchestrator dispatched were simple. Read a file. Search the web. Format some text. Run a shell command. These don’t need the most powerful model on Earth.

But 10% of the tasks — strategic decisions, complex reasoning, synthesizing research into action plans — genuinely needed top-tier intelligence.

The solution was obvious in hindsight: route each task to the cheapest model that can handle it well.

Here’s the routing framework we use in OpenClaw:

| Tier | Model | Cost (per 1M tokens) | Use Cases |
|---|---|---|---|
| Worker | Haiku | $0.80 | File operations, web scraping, data formatting, simple code generation |
| Specialist | Sonnet | $3 | Content writing, complex code, analysis, research synthesis |
| Orchestrator | Opus | $15 | Strategic decisions, multi-step reasoning, result verification, task decomposition |
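To make "cheapest model that can handle it" concrete, here is a minimal sketch of the table as data. The prices and tier names come from the table; the numeric `capability` ranking and the `cheapestCapableTier` helper are my own illustrative assumptions, not OpenClaw's actual code. It also treats each price as one flat rate, which is a simplification since providers usually price input and output tokens separately.

```typescript
type TierName = 'worker' | 'specialist' | 'orchestrator';

interface Tier {
  name: TierName;
  model: 'haiku' | 'sonnet' | 'opus';
  usdPerMillionTokens: number; // flat rate taken from the table
  capability: number; // hypothetical ranking: higher handles harder tasks
}

const TIERS: Tier[] = [
  { name: 'worker', model: 'haiku', usdPerMillionTokens: 0.8, capability: 1 },
  { name: 'specialist', model: 'sonnet', usdPerMillionTokens: 3, capability: 2 },
  { name: 'orchestrator', model: 'opus', usdPerMillionTokens: 15, capability: 3 },
];

// Return the cheapest tier whose capability meets the task's requirement.
function cheapestCapableTier(required: number): Tier {
  const capable = TIERS.filter((t) => t.capability >= required);
  if (capable.length === 0) {
    throw new Error(`no tier can handle capability level ${required}`);
  }
  return capable.reduce((best, t) =>
    t.usdPerMillionTokens < best.usdPerMillionTokens ? t : best,
  );
}

console.log(cheapestCapableTier(2).model);
```

Keeping tiers in a list like this means adding or repricing a model is a data change rather than a change to the routing logic.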

The orchestrator (Opus) receives a request and decomposes it into subtasks. Each subtask gets routed to the appropriate tier:

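// Simplified routing logic (the Task flags and tier names are inferred from the function below)
type ModelTier = 'haiku' | 'sonnet' | 'opus';
type Task = Partial<Record<'requiresReasoning' | 'isStrategic' | 'requiresCreativity' | 'isComplex', boolean>>;

function routeTask(task: Task): ModelTier {
  // Check the most expensive tier first so strategic work is never downgraded
  if (task.requiresReasoning || task.isStrategic) return 'opus';
  if (task.requiresCreativity || task.isComplex) return 'sonnet';
  return 'haiku'; // Default: cheapest model that works
}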

Example: “Research competitors and write a strategy memo”

  1. Haiku → Web search for 5 competitor URLs (worker task)
  2. Haiku → Scrape and extract key data from each URL (worker task)
  3. Sonnet → Synthesize findings into a competitive analysis (specialist task)
  4. Opus → Review the analysis, identify strategic implications, write the memo (orchestrator task)
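The four steps above can be run through the routing function as a quick sanity check. This is a sketch under assumptions: the flags I set on each subtask are my own reading of the example, and in a real system the orchestrator would produce them during decomposition.

```typescript
type ModelTier = 'haiku' | 'sonnet' | 'opus';

interface Task {
  description: string;
  requiresReasoning?: boolean;
  isStrategic?: boolean;
  requiresCreativity?: boolean;
  isComplex?: boolean;
}

function routeTask(task: Task): ModelTier {
  if (task.requiresReasoning || task.isStrategic) return 'opus';
  if (task.requiresCreativity || task.isComplex) return 'sonnet';
  return 'haiku';
}

// Hypothetical decomposition of "Research competitors and write a strategy memo".
const subtasks: Task[] = [
  { description: 'Web search for 5 competitor URLs' },
  { description: 'Scrape and extract key data from each URL' },
  { description: 'Synthesize findings into a competitive analysis', isComplex: true },
  { description: 'Review the analysis and write the memo', isStrategic: true, requiresReasoning: true },
];

const plan = subtasks.map((t) => ({ step: t.description, model: routeTask(t) }));

for (const { step, model } of plan) {
  console.log(`${model} -> ${step}`);
}
```

Because unflagged tasks fall through to `haiku`, forgetting to tag a subtask fails cheap rather than expensive. The tradeoff is that a mis-tagged strategic task gets a weaker model, so the flags deserve care.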

Result: 80% of tokens go to Haiku ($0.80/1M) instead of Opus ($15/1M).
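To see what that split does to the bill, here is a rough blended-rate calculation. Only the 80% Haiku share and the three prices come from this post; the 10%/10% split of the remaining tokens between Sonnet and Opus is a hypothetical I picked for illustration, and the prices are treated as flat per-token rates.

```typescript
const USD_PER_MILLION = { haiku: 0.8, sonnet: 3, opus: 15 } as const;
type Model = keyof typeof USD_PER_MILLION;

// Hypothetical token split: only the 80% Haiku share comes from the article.
const tokenShare: Record<Model, number> = { haiku: 0.8, sonnet: 0.1, opus: 0.1 };

// Weighted average price per 1M tokens for a given split across models.
function blendedRate(share: Record<Model, number>): number {
  return (Object.keys(share) as Model[]).reduce(
    (sum, model) => sum + share[model] * USD_PER_MILLION[model],
    0,
  );
}

const routedRate = blendedRate(tokenShare);
const allOpusRate = USD_PER_MILLION.opus;
const savings = 1 - routedRate / allOpusRate;

console.log(`blended: $${routedRate.toFixed(2)} per 1M tokens`);
console.log(`savings vs all-Opus: ${(savings * 100).toFixed(1)}%`);
```

Under that assumed split the blended rate works out to about $2.44 per 1M tokens, roughly 84% below sending everything to Opus. Note that the Opus share still dominates the total, so the savings depend heavily on how small you can keep it.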

After implementing smart model routing in OpenClaw:

  • Lower cost: Haiku at $0.80/1M tokens is roughly 19x cheaper than Opus at $15/1M, so each simple task moved off Opus costs about 95% less for the same token count
  • Faster execution: Haiku responds 5x faster than Opus for simple tasks
  • No quality loss on strategic work: those outputs are unchanged because Opus still handles them
  • Better parallelism: cheap models can run 20 concurrent tasks without budget anxiety
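On the parallelism point, one common way to fan out cheap worker tasks is a small concurrency-limited pool. The sketch below is not OpenClaw's scheduler: `runWithLimit` and `fakeHaikuCall` are hypothetical names, and the model call is stubbed with a timer so the example runs without an API key.

```typescript
// Run async jobs with at most `limit` in flight at once, preserving result order.
async function runWithLimit<T>(
  jobs: Array<() => Promise<T>>,
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array(jobs.length);
  let next = 0;

  // Each worker keeps pulling the next unclaimed job until none remain.
  async function worker(): Promise<void> {
    while (next < jobs.length) {
      const index = next++;
      results[index] = await jobs[index]();
    }
  }

  const workerCount = Math.min(limit, jobs.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Stub standing in for a real worker-tier model call (hypothetical).
let inFlight = 0;
let peak = 0;
const fakeHaikuCall = (id: number) => async (): Promise<string> => {
  inFlight++;
  peak = Math.max(peak, inFlight);
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  inFlight--;
  return `result-${id}`;
};

const jobs = Array.from({ length: 50 }, (_, i) => fakeHaikuCall(i));
const done = runWithLimit(jobs, 20).then((results) => {
  console.log(`${results.length} tasks finished, peak concurrency ${peak}`);
  return results;
});
```

The cap of 20 mirrors the number in the list above. In practice you would tune it to your provider's rate limits rather than to budget alone.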

Most AI applications treat model selection as a one-time configuration choice. “We use GPT-4” or “We use Claude.” That’s like saying “We only hire senior engineers” — expensive, wasteful, and unnecessary for most tasks.

Smart model routing treats model selection as a per-task optimization. Every request gets the right tool for the job.


About the author: JD Davenport builds AI agent systems at OpenClaw. Follow on LinkedIn for updates on building AI agents for business.