Smart Model Routing for AI Agents
When I first started building OpenClaw, every task went to Claude Opus. Research? Opus. Writing a commit message? Opus. Checking if a file exists? Opus.
My API bill was astronomical. And most of those tokens were wasted on tasks that a smaller model could handle just as well.
The Realization
Section titled “The Realization”I was watching my agent logs one evening and noticed something: 90% of the tasks my orchestrator dispatched were simple. Read a file. Search the web. Format some text. Run a shell command. These don’t need the most powerful model on Earth.
But 10% of the tasks — strategic decisions, complex reasoning, synthesizing research into action plans — genuinely needed top-tier intelligence.
The solution was obvious in hindsight: route each task to the cheapest model that can handle it well.
The Three-Tier Model
Section titled “The Three-Tier Model”Here’s the routing framework we use in OpenClaw:
| Tier | Model | Cost | Use Cases |
|---|---|---|---|
| Worker | Haiku | $0.80/1M tokens | File operations, web scraping, data formatting, simple code generation |
| Specialist | Sonnet | $3/1M tokens | Content writing, complex code, analysis, research synthesis |
| Orchestrator | Opus | $15/1M tokens | Strategic decisions, multi-step reasoning, result verification, task decomposition |
How Routing Works in Practice
Section titled “How Routing Works in Practice”The orchestrator (Opus) receives a request and decomposes it into subtasks. Each subtask gets routed to the appropriate tier:
// Simplified routing logicfunction routeTask(task: Task): ModelTier { if (task.requiresReasoning || task.isStrategic) return 'opus'; if (task.requiresCreativity || task.isComplex) return 'sonnet'; return 'haiku'; // Default: cheapest model that works}Example: “Research competitors and write a strategy memo”
- Haiku → Web search for 5 competitor URLs (worker task)
- Haiku → Scrape and extract key data from each URL (worker task)
- Sonnet → Synthesize findings into a competitive analysis (specialist task)
- Opus → Review the analysis, identify strategic implications, write the memo (orchestrator task)
Result: 80% of tokens go to Haiku ($0.80/1M) instead of Opus ($15/1M).
The Results
Section titled “The Results”After implementing smart model routing in OpenClaw:
- Dramatic cost reduction — routing simple tasks to Haiku instead of Opus saves the vast majority of your budget
- Faster execution — Haiku responds 5x faster than Opus for simple tasks
- No quality loss — strategic outputs are identical because Opus still handles them
- Better parallelism — cheap models can run 20 concurrent tasks without budget anxiety
The Key Insight
Section titled “The Key Insight”Most AI applications treat model selection as a one-time configuration choice. “We use GPT-4” or “We use Claude.” That’s like saying “We only hire senior engineers” — expensive, wasteful, and unnecessary for most tasks.
Smart model routing treats model selection as a per-task optimization. Every request gets the right tool for the job.
About the author: JD Davenport builds AI agent systems at OpenClaw. Follow on LinkedIn for updates on building AI agents for business.