Skip to content
🎓 Find your path Subscribe

Model Router Architecture

Status: Production

The model router classifies incoming tasks and routes them to the most cost-effective LLM. It enforces a $15/day budget target and auto-downgrades when spend exceeds 80%.

TierModelsUse CaseCost (per 1M tokens)
Worker (Haiku)Claude Haiku 3.5, Ollama qwen2.5:7bSimple lookups, formatting, file ops$0.25-1.00 / free
Specialist (Sonnet)Claude Sonnet, Kimi K2.5Coding, analysis, research, most tasks$3.00 / $0.38-0.60
Executive (Opus)Claude OpusCreative writing, complex reasoning, strategy$15.00
  • Approximately 4–5x cheaper than Sonnet at the time of integration (2026-04-06; verify current pricing)
  • 5x slower than Sonnet for many tasks
  • Quality competitive for structured/coding tasks; check provider’s published benchmarks for current numbers
  • Strong for: analytics, QA pipelines, research observations
  • Keep Sonnet for: creative writing, tutoring, nuanced tasks where hallucination risk is higher
1. Task text analyzed for keywords and complexity signals
2. Classified into tier (worker / specialist / executive)
3. Budget check against daily spend
4. If >80% budget consumed → auto-downgrade one tier
5. Route to cheapest available model in tier
  • State file: state/budget.json
  • Daily target: $15
  • Nightly summary: track-costs.sh runs at 11 PM, sends Telegram summary
  • Per-request costs: logged in routing.log with estimated_cost_usd field

A higher-level router directs traffic across 3 systems: Agent System tools, OpenClaw agents, and Claude Code sessions.

  • Weighted keyword scoring with multi-word bonus
  • Health-aware OpenClaw routing (30s cache, demotion when gateway is down)
  • Handles retired agent redirection (6 OC agents mapped to AS equivalents)
  • 50-case test suite: passes cleanly; routing decisions are sub-second in practice