Model Router Architecture
Status: Production
Overview
The model router classifies incoming tasks and routes each one to the most cost-effective LLM. It enforces a $15/day budget target and automatically downgrades tasks by one tier once daily spend exceeds 80% of that target.
Model Tiers
| Tier | Models | Use Case | Cost (per 1M tokens) |
|---|---|---|---|
| Worker (Haiku) | Claude Haiku 3.5, Ollama qwen2.5:7b | Simple lookups, formatting, file ops | $0.25-1.00 / free |
| Specialist (Sonnet) | Claude Sonnet, Kimi K2.5 | Coding, analysis, research, most tasks | $3.00 / $0.38-0.60 |
| Executive (Opus) | Claude Opus | Creative writing, complex reasoning, strategy | $15.00 |
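
The tier table can be held in a small lookup structure, which also makes "cheapest available model in tier" easy to express. The following is a minimal sketch, not the router's actual code: the `ModelSpec` type, the `TIERS` layout, and the `cheapest_available` helper are hypothetical names. It also assumes the slash-separated prices in the table line up with the models in the same order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    name: str
    cost_low: float   # USD per 1M tokens, lower bound from the table
    cost_high: float  # USD per 1M tokens, upper bound from the table


# Tier contents and prices copied from the table; the dict layout is assumed.
TIERS = {
    "worker": [
        ModelSpec("Claude Haiku 3.5", 0.25, 1.00),
        ModelSpec("Ollama qwen2.5:7b", 0.0, 0.0),  # listed as free
    ],
    "specialist": [
        ModelSpec("Claude Sonnet", 3.00, 3.00),
        ModelSpec("Kimi K2.5", 0.38, 0.60),
    ],
    "executive": [
        ModelSpec("Claude Opus", 15.00, 15.00),
    ],
}


def cheapest_available(tier: str, available: set[str]) -> Optional[ModelSpec]:
    """Return the lowest-cost model in `tier` that is currently available."""
    candidates = [m for m in TIERS[tier] if m.name in available]
    # Compare on the upper bound so a wide price range is not underestimated.
    return min(candidates, key=lambda m: m.cost_high, default=None)
```

Comparing on the upper bound is one conservative choice; a real router might instead track separate input and output token prices.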
Kimi K2.5 Optimization
- Approximately 4–5x cheaper than Sonnet at the time of integration (2026-04-06; verify current pricing)
- 5x slower than Sonnet for many tasks
- Quality competitive for structured/coding tasks; check provider’s published benchmarks for current numbers
- Strong for: analytics, QA pipelines, research observations
- Keep Sonnet for: creative writing, tutoring, nuanced tasks where hallucination risk is higher
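
The trade-off in the bullets above (cheaper but slower, strong on structured work, weaker where hallucination risk matters) could be encoded as a simple preference rule within the specialist tier. This is only an illustrative sketch: the category labels, the `latency_sensitive` flag, and the `pick_specialist` function are assumptions, not part of the documented router.

```python
# Task categories taken from the bullets above; the label strings are assumed.
KIMI_STRENGTHS = {"analytics", "qa_pipeline", "research_observation"}
SONNET_ONLY = {"creative_writing", "tutoring", "nuanced"}


def pick_specialist(category: str, latency_sensitive: bool = False) -> str:
    """Choose between the two specialist-tier models for one task."""
    if category in SONNET_ONLY:
        return "Claude Sonnet"
    # Kimi K2.5 is cheaper but slower, so skip it when latency matters.
    if category in KIMI_STRENGTHS and not latency_sensitive:
        return "Kimi K2.5"
    # Unknown categories fall back to Sonnet as the safer default (assumption).
    return "Claude Sonnet"
```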
Routing Logic
1. Task text analyzed for keywords and complexity signals
2. Classified into tier (worker / specialist / executive)
3. Budget check against daily spend
4. If >80% budget consumed → auto-downgrade one tier
5. Route to cheapest available model in tier
Budget Tracking
- State file: `state/budget.json`
- Daily target: $15
- Nightly summary: `track-costs.sh` runs at 11 PM, sends Telegram summary
- Per-request costs: logged in `routing.log` with `estimated_cost_usd` field
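
The budget check and one-tier downgrade from the routing steps, together with a per-request log entry, might look roughly like the sketch below. The schema of `state/budget.json` is not documented here, so the sketch takes today's spend as a plain argument. The function names, the JSON log format, and the `task_id` field are assumptions; only `estimated_cost_usd`, the $15 target, and the 80% threshold come from the text.

```python
import json

TIER_ORDER = ["worker", "specialist", "executive"]  # cheapest to most expensive
DAILY_TARGET_USD = 15.0
DOWNGRADE_THRESHOLD = 0.80  # fraction of the daily target


def apply_budget(tier: str, spent_today_usd: float) -> str:
    """Downgrade one tier when spend exceeds 80% of the daily target."""
    if spent_today_usd <= DAILY_TARGET_USD * DOWNGRADE_THRESHOLD:
        return tier
    index = TIER_ORDER.index(tier)
    # Assumption: the worker tier has nothing below it, so it stays put.
    return TIER_ORDER[max(index - 1, 0)]


def log_line(task_id: str, tier: str, estimated_cost_usd: float) -> str:
    """Format one routing.log entry as JSON (the line format is assumed)."""
    return json.dumps({
        "task_id": task_id,
        "tier": tier,
        "estimated_cost_usd": round(estimated_cost_usd, 6),
    })
```

For example, with $12.50 already spent, `apply_budget("executive", 12.5)` returns `"specialist"`, while the same call at $5.00 leaves the tier unchanged.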
Unified Router
A higher-level router directs traffic across three systems: Agent System tools, OpenClaw agents, and Claude Code sessions.
- Weighted keyword scoring with multi-word bonus
- Health-aware OpenClaw routing (30s cache, demotion when gateway is down)
- Redirects retired agents (6 OpenClaw agents mapped to Agent System equivalents)
- 50-case test suite: passes cleanly; routing decisions are sub-second in practice
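
As a rough illustration of the first two bullets, the sketch below combines weighted keyword scoring (with a bonus for multi-word phrases) and a 30-second health cache that demotes OpenClaw while its gateway is down. Everything named here is hypothetical: the keyword table, the weights, the bonus value, the `HealthCache` class, and the probe callable are placeholders, since the real tables and health check are not shown in this document.

```python
import time
from typing import Callable

# Hypothetical keyword weights per target system; real tables are not given.
KEYWORDS = {
    "agent_system": {"file": 1.0, "run script": 2.0},
    "openclaw": {"browser": 1.0, "web search": 2.0},
    "claude_code": {"refactor": 1.5, "pull request": 2.0},
}
MULTI_WORD_BONUS = 0.5  # extra credit when a multi-word phrase matches


def score(task: str) -> dict[str, float]:
    """Sum keyword weights per system, adding a bonus for phrase matches."""
    text = task.lower()
    totals = {}
    for system, table in KEYWORDS.items():
        total = 0.0
        for phrase, weight in table.items():
            if phrase in text:  # simple substring match, enough for a sketch
                total += weight + (MULTI_WORD_BONUS if " " in phrase else 0.0)
        totals[system] = total
    return totals


class HealthCache:
    """Caches a health probe result for `ttl` seconds (30 s in the text)."""

    def __init__(self, probe: Callable[[], bool], ttl: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._probe, self._ttl, self._clock = probe, ttl, clock
        self._value = False
        self._expires = None  # no probe result cached yet

    def healthy(self) -> bool:
        now = self._clock()
        if self._expires is None or now >= self._expires:
            self._value = self._probe()
            self._expires = now + self._ttl
        return self._value


def route(task: str, openclaw_health: HealthCache) -> str:
    """Pick the highest-scoring system, skipping OpenClaw when unhealthy."""
    totals = score(task)
    if not openclaw_health.healthy():
        # Demote OpenClaw so another system wins while the gateway is down.
        totals["openclaw"] = float("-inf")
    return max(totals, key=totals.get)
```

Injecting the clock and the probe keeps the cache testable without sleeping or touching a real gateway, which suits a table-driven test suite like the one mentioned above.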