Smart Model Routing for AI Agents

When I first started building OpenClaw, every task went to Claude Opus. Research? Opus. Writing a commit message? Opus. Checking if a file exists? Opus.

My API bill was astronomical. And most of those tokens were wasted on tasks that a smaller model could handle just as well.

The Realization

I was watching my agent logs one evening and noticed something: 90% of the tasks my orchestrator dispatched were simple. Read a file. Search the web. Format some text. Run a shell command. These don’t need the most powerful model on Earth.

But 10% of the tasks — strategic decisions, complex reasoning, synthesizing research into action plans — genuinely needed top-tier intelligence.

The solution was obvious in hindsight: route each task to the cheapest model that can handle it well.

The Three-Tier Model

Here’s the routing framework we use in OpenClaw:

Tier	Model	Cost	Use Cases
Worker	Haiku	$0.80/1M tokens	File operations, web scraping, data formatting, simple code generation
Specialist	Sonnet	$3/1M tokens	Content writing, complex code, analysis, research synthesis
Orchestrator	Opus	$15/1M tokens	Strategic decisions, multi-step reasoning, result verification, task decomposition

How Routing Works in Practice

The orchestrator (Opus) receives a request and decomposes it into subtasks. Each subtask gets routed to the appropriate tier:

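// Simplified routing logic
type ModelTier = 'opus' | 'sonnet' | 'haiku';
type Task = { requiresReasoning?: boolean; isStrategic?: boolean; requiresCreativity?: boolean; isComplex?: boolean };

function routeTask(task: Task): ModelTier {
  // Check the most demanding tier first so escalation always wins over the default.
  if (task.requiresReasoning || task.isStrategic) return 'opus';
  if (task.requiresCreativity || task.isComplex) return 'sonnet';
  return 'haiku'; // Default: cheapest model that works
}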
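To make the decompose-then-route step concrete, here is a minimal sketch of a dispatcher built around the same routing rule. It is not OpenClaw's actual code: the `Subtask` shape, the `dispatch` function, and the injected `ModelCaller` are hypothetical names, and the model call is a stand-in so the sketch runs without any API. It also assumes the subtasks are independent; steps that depend on each other would need to be awaited in order.

```typescript
// Hypothetical types: the flags mirror the simplified routing logic above.
type ModelTier = 'opus' | 'sonnet' | 'haiku';

interface Subtask {
  description: string;
  requiresReasoning?: boolean;
  isStrategic?: boolean;
  requiresCreativity?: boolean;
  isComplex?: boolean;
}

// Same rule as the simplified router: escalate only when a flag demands it.
function routeTask(task: Subtask): ModelTier {
  if (task.requiresReasoning || task.isStrategic) return 'opus';
  if (task.requiresCreativity || task.isComplex) return 'sonnet';
  return 'haiku';
}

// Stand-in for a real model client (assumption): injected so the sketch runs offline.
type ModelCaller = (tier: ModelTier, prompt: string) => Promise<string>;

interface DispatchResult {
  tier: ModelTier;
  output: string;
}

// Route every subtask, run them concurrently, and keep results in input order.
async function dispatch(subtasks: Subtask[], callModel: ModelCaller): Promise<DispatchResult[]> {
  return Promise.all(
    subtasks.map(async (task) => {
      const tier = routeTask(task);
      const output = await callModel(tier, task.description);
      return { tier, output };
    }),
  );
}
```

Passing the model client in as a parameter keeps the routing decision testable on its own: you can check which tier each subtask lands on before spending a single token.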

Example: “Research competitors and write a strategy memo”

Haiku → Web search for 5 competitor URLs (worker task)
Haiku → Scrape and extract key data from each URL (worker task)
Sonnet → Synthesize findings into a competitive analysis (specialist task)
Opus → Review the analysis, identify strategic implications, write the memo (orchestrator task)

Result: 80% of tokens go to Haiku ($0.80/1M) instead of Opus ($15/1M).
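A quick back-of-the-envelope calculation shows what that shift is worth. The sketch below uses the per-1M-token prices from the tier table and the 80% Haiku share from this example. The post does not say how the remaining 20% is split, so the 15% Sonnet / 5% Opus split is purely an assumption for illustration, and `blendedPrice` is a hypothetical helper.

```typescript
// Prices per 1M tokens, taken from the tier table in this post.
const PRICE_PER_MILLION = { haiku: 0.8, sonnet: 3, opus: 15 } as const;
type Tier = keyof typeof PRICE_PER_MILLION;

// Blended price per 1M tokens, given the share of tokens sent to each tier.
function blendedPrice(share: Record<Tier, number>): number {
  const total = share.haiku + share.sonnet + share.opus;
  if (Math.abs(total - 1) > 1e-9) throw new Error('token shares must sum to 1');
  return (Object.keys(share) as Tier[]).reduce(
    (sum, tier) => sum + share[tier] * PRICE_PER_MILLION[tier],
    0,
  );
}

// ASSUMPTION: only the 80% Haiku share comes from the post; 15% / 5% is made up.
const routed = blendedPrice({ haiku: 0.8, sonnet: 0.15, opus: 0.05 });
const allOpus = blendedPrice({ haiku: 0, sonnet: 0, opus: 1 });
const savings = 1 - routed / allOpus;

console.log(`routed: $${routed.toFixed(2)} per 1M tokens`);
console.log(`all Opus: $${allOpus.toFixed(2)} per 1M tokens`);
console.log(`savings: ${(savings * 100).toFixed(0)}%`);
```

Under that assumed split, the blended price works out to about $1.84 per 1M tokens against $15 for sending everything to Opus, a reduction of roughly 88%. A different split changes the number, but as long as most tokens land on Haiku the blended price stays far below the Opus price.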

The Results

After implementing smart model routing in OpenClaw:

Dramatic cost reduction: at the prices above, a token sent to Haiku costs roughly 1/19th of the same token sent to Opus, so moving simple tasks off Opus removes most of the spend
Faster execution: Haiku responds roughly 5x faster than Opus on simple tasks
No quality loss on strategic work: those outputs still come from Opus, so they are unchanged
Better parallelism: at Haiku prices, running 20 tasks concurrently no longer causes budget anxiety
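For the parallelism point, one common way to run many cheap tasks at once without flooding the API is a small worker pool with a concurrency cap. This is a generic sketch rather than OpenClaw's implementation: `runWithLimit` is a hypothetical helper, and each job is any function that returns a promise, such as a call to a worker-tier model.

```typescript
// Run async jobs with at most `limit` in flight; results keep input order.
async function runWithLimit<T>(jobs: Array<() => Promise<T>>, limit: number): Promise<T[]> {
  if (limit < 1) throw new Error('limit must be at least 1');
  const results: T[] = new Array(jobs.length);
  let next = 0;

  // Each worker claims the next unclaimed job until none remain.
  // JavaScript runs this on a single thread, so `next++` cannot race.
  async function worker(): Promise<void> {
    while (next < jobs.length) {
      const index = next++;
      results[index] = await jobs[index]();
    }
  }

  const workerCount = Math.min(limit, jobs.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With a cap of 20, a batch of 50 scraping or formatting jobs would run as `runWithLimit(jobs, 20)`: twenty start immediately, and each finished job frees a slot for the next one. If any job rejects, the whole call rejects, so jobs that may fail should catch their own errors.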

The Key Insight

Most AI applications treat model selection as a one-time configuration choice. “We use GPT-4” or “We use Claude.” That’s like saying “We only hire senior engineers” — expensive, wasteful, and unnecessary for most tasks.

Smart model routing treats model selection as a per-task optimization. Every request gets the right tool for the job.

About the author: JD Davenport builds AI agent systems at OpenClaw.