
Multi-Agent Orchestration: CEO Pattern

One agent can do a lot. But one agent doing everything β€” writing code, researching competitors, drafting emails, deploying servers β€” is a disaster waiting to happen. Context fills up, mistakes compound, and the agent loses the thread.

The CEO Pattern solves this. Your main agent stops doing work and starts coordinating work. Specialists execute. The CEO verifies.

A CEO doesn’t write the company’s code. They define what needs to be built, assign it to the right person, review the result, and decide whether it’s good enough. Your orchestrator agent should operate the same way.

CEO Agent (Opus)
β”œβ”€β”€ Defines the task with clear scope
β”œβ”€β”€ Picks the right specialist model
β”œβ”€β”€ Spawns the sub-agent with explicit instructions
β”œβ”€β”€ Waits for completion
β”œβ”€β”€ Verifies the output (runs tests, checks links, validates format)
β”œβ”€β”€ Iterates if broken (sends feedback to same agent)
└── Reports success to user ONLY when verified

The user never sees broken work. The CEO is the quality gate. This is the whole point.

The hardest part is knowing when to delegate. Delegate too aggressively and you waste time on spawn overhead. Do too much yourself and your CEO drowns in details.

β”Œβ”€ Is this < 30 seconds?
β”‚ └─ YES β†’ Do it yourself (file read, git status, quick lookup)
β”‚ └─ NO ↓
β”‚
β”œβ”€ Is this > 5 minutes OR > 3 steps?
β”‚ └─ YES β†’ Delegate to specialist
β”‚ └─ NO ↓
β”‚
β”œβ”€ Can parts run in parallel?
β”‚ └─ YES β†’ Spawn multiple specialists simultaneously
β”‚ └─ NO β†’ Single specialist
β”‚
└─ Needs domain expertise?
   └─ Coding → haiku coding agent
   └─ Research → haiku researcher agent
   └─ Writing → sonnet writer agent
   └─ Analysis → sonnet or opus analyst

Match the model to the task. This directly impacts cost and quality:

| Task Type | Model | Timeout | Why |
| --- | --- | --- | --- |
| Code builds / test suites | Haiku | 15 min | Fast iteration, deterministic tasks |
| Web research / scraping | Haiku | 5 min | Pattern matching, data extraction |
| Content drafting | Sonnet | 3 min | Needs voice, nuance, flow |
| Strategic analysis | Opus | 10 min | Complex reasoning required |
| Data transformation | Haiku | 5 min | Mechanical, well-scoped |
| Deployment pipelines | Haiku | 10 min | Scripted, clear success criteria |
| Legal / compliance review | Opus | 5 min | High-stakes, nuanced judgment |

Running your CEO on Opus is correct β€” it’s making strategic decisions. Running everything on Opus is a mistake β€” you’ll burn 10Γ— the tokens for tasks that don’t need that reasoning capacity.
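
One way to encode that matrix is a simple lookup the CEO consults before every spawn. The model IDs and timeouts mirror the table; the task-type keys and `pick` helper are illustrative assumptions:

```typescript
// Hypothetical routing table mirroring the task/model matrix above.
interface Routing {
  model: string;
  timeoutMin: number;
}

const ROUTING: Record<string, Routing> = {
  build:      { model: "anthropic/claude-haiku-4",  timeoutMin: 15 },
  research:   { model: "anthropic/claude-haiku-4",  timeoutMin: 5 },
  drafting:   { model: "anthropic/claude-sonnet-4", timeoutMin: 3 },
  analysis:   { model: "anthropic/claude-opus-4",   timeoutMin: 10 },
  transform:  { model: "anthropic/claude-haiku-4",  timeoutMin: 5 },
  deploy:     { model: "anthropic/claude-haiku-4",  timeoutMin: 10 },
  compliance: { model: "anthropic/claude-opus-4",   timeoutMin: 5 },
};

function pick(taskType: string): Routing {
  const r = ROUTING[taskType];
  if (!r) throw new Error(`Unknown task type: ${taskType}`);
  return r;
}
```

Centralizing the mapping means a model upgrade is a one-line change instead of a hunt through every spawn site.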

In OpenClaw, the CEO spawns specialists using sessions_spawn:

// CEO agent spawning a coding sub-agent
const result = await sessions_spawn({
  agentId: "coding",  // Specialist agent ID from openclaw.json
  task: `
    Build a REST API endpoint for user authentication.
    Requirements:
    - POST /api/auth/login with email + password
    - Returns JWT token on success
    - Rate limiting: 5 attempts per IP per minute
    - Unit tests for happy path + 3 edge cases
    Working directory: ~/projects/myapp/
    Test command: npm test
    Report back with: files modified, test results, any blockers.
  `,
  model: "anthropic/claude-haiku-4",
  timeout: "15m",
  workdir: "~/projects/myapp/"
});

Key fields:

  • agentId: The specialist’s agent ID (must exist in openclaw.json)
  • task: The complete, self-contained task description
  • model: Override model for this specific spawn (optional)
  • timeout: Hard limit β€” agent is killed if it exceeds this
  • workdir: Working directory for file operations
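
The `timeout` field takes duration strings like `"15m"`. For illustration, parsing that format into milliseconds before enforcing the hard limit might look like this (`parseTimeout` is a hypothetical helper, not part of the OpenClaw API):

```typescript
// Parse a "90s" / "15m" / "1h" duration string into milliseconds.
// Hypothetical helper for illustration; not an OpenClaw API.
function parseTimeout(spec: string): number {
  const m = /^(\d+)([smh])$/.exec(spec.trim());
  if (!m) throw new Error(`Bad timeout: ${spec}`);
  const unit = { s: 1_000, m: 60_000, h: 3_600_000 }[m[2] as "s" | "m" | "h"];
  return Number(m[1]) * unit;
}
```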

This is where most orchestration systems fail. They delegate, receive output, and forward it to the user without checking. That’s a relay, not orchestration.

Real orchestration includes a verification step:

CEO spawns coding agent
↓
Agent returns: "Done, code is at src/auth.ts"
↓
CEO runs verification:
cd ~/projects/myapp && npm test
↓
Tests pass? β†’ CEO reports success to user
Tests fail? β†’ CEO sends feedback to SAME agent:
"Tests failing on rate limiter:
Error: Cannot read property 'count' of undefined
Fix the rate limiter implementation."
↓
Agent returns fix
↓
CEO runs tests again...
↓
Repeat until passing or timeout

In practice, this looks like a loop:

async function orchestrateCodingTask(task: string, maxRetries = 3) {
  let attempts = 0;
  let lastError = "";
  while (attempts < maxRetries) {
    // Spawn or steer the coding agent
    const result = await sessions_spawn({
      agentId: "coding",
      task: attempts === 0 ? task : `Fix the failing tests: ${lastError}`,
      timeout: "15m"
    });
    // CEO verifies — runs actual tests
    const testResult = await exec("npm test 2>&1");
    if (testResult.exitCode === 0) {
      return { success: true, output: result };
    }
    lastError = testResult.stdout;
    attempts++;
  }
  return { success: false, error: `Failed after ${maxRetries} attempts` };
}

The real power multiplier: when tasks are independent, run them simultaneously.

CEO receives: "Analyze competitor pricing and build a pricing comparison page"
Sequential (slow):
Research competitor pricing β†’ 12 minutes
Build comparison page β†’ 15 minutes
Total: 27 minutes
Parallel (fast):
β”Œβ”€ Research competitor pricing β†’ 12 minutes ─┐
└─ Build comparison page scaffold β†’ 15 minutes β”€β”˜
CEO integrates results: β†’ 2 minutes
Total: 17 minutes (37% faster)

Implementation:

// Spawn both agents simultaneously
const [researchResult, buildResult] = await Promise.all([
  sessions_spawn({
    agentId: "researcher",
    task: "Find pricing for: Competitor A, B, C. Format as JSON with {name, tiers, price_per_seat}.",
    timeout: "5m"
  }),
  sessions_spawn({
    agentId: "coding",
    task: "Build pricing comparison page scaffold at src/pages/pricing-comparison.tsx. Leave competitor data as placeholder JSON.",
    timeout: "15m"
  })
]);

// CEO integrates
await sessions_spawn({
  agentId: "coding",
  task: `Integrate this competitor data into the pricing page:
${researchResult.data}
The page scaffold is at src/pages/pricing-comparison.tsx`,
  timeout: "5m"
});

Here’s a complete example β€” a CEO agent handling a β€œbuild me a landing page for my product” request:

  1. CEO analyzes the request

    Breaks it into parallel tracks: research (what do top landing pages include?) + build (scaffold from template).

  2. CEO spawns parallel agents

    Researcher: "Find 3 best-in-class SaaS landing pages. Extract: hero copy patterns,
    social proof placement, CTA placement. Return as structured JSON."
    Coder: "Build landing page at src/pages/landing.tsx using our design system.
    Include: hero, features, social proof, pricing, CTA. Use placeholder content."
  3. CEO waits for both

    Researcher returns in ~3 minutes. Coder returns in ~10 minutes.

  4. CEO integrates

    Writer: "Write hero copy for [product]. Use these patterns from top landing pages:
    [researcher output]. Target audience: [audience]. Voice: [brand voice]."
  5. CEO verifies

    npm run build   # Does it compile?
    npm run test    # Do tests pass?
    # Open localhost:3000 — does it look right?
  6. CEO iterates on failures

    Build error? Sends error to coding agent. Copy feels off? Sends feedback to writer.

  7. CEO delivers

    Only after the build passes and the page looks correct does the CEO report back to the user.

❌ Over-delegation

Spawning a sub-agent to read a file wastes more time than it saves. Apply the 30-second rule ruthlessly.

❌ No timeout discipline

A stuck agent burns tokens indefinitely. Set explicit timeouts on every spawn. Default: 15 min coding, 5 min research.

❌ Context dumping

Passing 10,000 tokens of memory to a coding agent that only needs 200 tokens of task context. Give specialists what they need, nothing more.

❌ Skipping verification

β€œThe sub-agent said it worked” is not verification. Run the tests. Check the output. You are the quality gate.

❌ Too many concurrent agents

Each active agent consumes API resources. Cap at 5 concurrent children per session. Queue the rest.
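
A minimal way to enforce that cap is a small semaphore: at most five children run at once, and spawns beyond that wait in a queue. `withSlot` here is a hypothetical wrapper around whatever spawns a child (e.g. `sessions_spawn`), not an OpenClaw API:

```typescript
// Cap concurrent children at MAX_CHILDREN; queue the rest.
// withSlot is a hypothetical wrapper, not part of OpenClaw.
const MAX_CHILDREN = 5;
let active = 0;
const waiting: Array<() => void> = [];

async function withSlot<T>(fn: () => Promise<T>): Promise<T> {
  // Wait until a slot frees up
  while (active >= MAX_CHILDREN) {
    await new Promise<void>((resolve) => waiting.push(resolve));
  }
  active++;
  try {
    return await fn();
  } finally {
    active--;
    waiting.shift()?.(); // wake one queued spawn, if any
  }
}
```

Usage is just `withSlot(() => sessions_spawn({ ... }))` at every spawn site; the queueing is then impossible to forget.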

❌ New session on retry

Starting a fresh sub-agent for every retry loses context. Steer the existing agent β€” it knows what it already tried.

Register your specialists in openclaw.json:

{
  "agents": {
    "main": {
      "workspace": "~/clawd",
      "model": "anthropic/claude-opus-4",
      "description": "CEO orchestrator"
    },
    "coding": {
      "workspace": "~/clawd/specialists/coding",
      "model": "anthropic/claude-haiku-4",
      "description": "Builds and tests software",
      "soul": "You are a coding specialist. You write clean, tested code. Always run tests before reporting success. Report back: files changed, test output, any blockers."
    },
    "researcher": {
      "workspace": "~/clawd/specialists/research",
      "model": "anthropic/claude-haiku-4",
      "description": "Web research and data extraction",
      "soul": "You are a research specialist. You find accurate, recent information. Always cite sources. Return structured JSON when asked. Be skeptical of unverified claims."
    },
    "writer": {
      "workspace": "~/clawd/specialists/writing",
      "model": "anthropic/claude-sonnet-4",
      "description": "Content and communications drafting",
      "soul": "You are a writing specialist. You write with a clear, human voice. No corporate jargon. No filler. Match the tone you're given. Show, don't tell."
    }
  }
}
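
A broken config is easier to catch before the first spawn than after. This sanity check over the shape shown above is an illustrative sketch (`validateAgents` is a hypothetical helper, not part of OpenClaw):

```typescript
// Illustrative sanity check over the openclaw.json agent map.
// validateAgents is a hypothetical helper, not an OpenClaw API.
interface AgentConfig {
  workspace: string;
  model: string;
  description: string;
  soul?: string;
}

function validateAgents(cfg: { agents: Record<string, AgentConfig> }): string[] {
  const errors: string[] = [];
  if (!cfg.agents["main"]) errors.push("missing 'main' orchestrator agent");
  for (const [id, a] of Object.entries(cfg.agents)) {
    if (!a.workspace) errors.push(`${id}: missing workspace`);
    if (!a.model) errors.push(`${id}: missing model`);
  }
  return errors;
}
```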

What you end up with is a CEO agent that can:

  • Analyze any request and break it into parallel workstreams
  • Route tasks to the right specialist with the right model
  • Verify results before they reach the user
  • Iterate on failures with targeted feedback
  • Run 37%+ faster by parallelizing independent tasks

The result: you give the CEO a goal, and you get back a finished product, not a "here's what I tried, good luck" handoff.


About the author: JD Davenport builds AI agent systems at OpenClaw. Follow on LinkedIn for updates on building AI agents for business.