
The Delegation Decision Tree: When to Spawn Sub-Agents

The hardest skill in multi-agent orchestration isn't building agents; it's knowing when to use them. Delegate too much and you waste resources on overhead. Delegate too little and your CEO agent drowns in execution details.

This framework gives you a repeatable decision process.

┌──────────────────────────────────────┐
│          New Task Received           │
└──────────────────┬───────────────────┘
                   │
          Is this < 30 seconds?
           ┌───────┴───────┐
          YES              NO
           │                │
   Do it yourself     Is this > 5 min
           │           OR > 3 steps?
           │           ┌────┴────┐
           │          YES        NO
           │           │          │
           │       DELEGATE   Could parts
           │           │    run in parallel?
           │           │      ┌────┴────┐
           │           │     YES        NO
           │           │      │          │
           │           │   PARALLEL   SINGLE
           │           │    SPAWN    DELEGATE
           │           │      │          │
           └───────────┴──────┴──────────┘
                       │
             Select model & timeout
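The tree can be expressed as a small classification function. This is a sketch: the `Task` fields, threshold values, and label strings are illustrative assumptions, not an API from this article.

```python
from dataclasses import dataclass

@dataclass
class Task:
    # Hypothetical task descriptor; field names are assumptions.
    est_seconds: float     # estimated execution time
    steps: int             # number of distinct steps
    parallelizable: bool   # can parts run independently?

def classify(task: Task) -> str:
    """Walk the decision tree from top to bottom."""
    if task.est_seconds < 30:
        return "DO_IT_YOURSELF"     # spawn overhead exceeds execution cost
    if task.est_seconds > 300 or task.steps > 3:
        return "DELEGATE"           # long or multi-step: worth the spawn
    if task.parallelizable:
        return "PARALLEL_SPAWN"
    return "SINGLE_DELEGATE"
```

Keeping the classifier pure (no I/O, no side effects) makes the routing decision trivially testable.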

✅ Do It Yourself (< 30 seconds)

These tasks carry more spawn overhead than execution cost:

  • Reading a config file
  • Checking git status
  • Looking up a value in memory
  • Simple math or date calculations
  • Answering from existing context

🔀 Delegate to Specialist (> 5 min OR > 3 steps)


These justify the spawn overhead:

  • Building or modifying code
  • Running test suites
  • Web research with multiple queries
  • Content drafting (articles, emails)
  • Deployment pipelines
  • Data processing or transformation

⚡ Parallel Spawn (independent tasks)

When tasks don't depend on each other's output:

  • Research competitors AND build landing page
  • Write 3 articles simultaneously
  • Deploy frontend AND run backend tests
  • Scan email AND check calendar AND monitor Twitter
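In Python, independent tasks like these can be fanned out with `asyncio.gather`. A minimal sketch, with a hypothetical `spawn_agent` coroutine standing in for a real sub-agent call:

```python
import asyncio

async def spawn_agent(task: str) -> str:
    # Stand-in for a real sub-agent spawn; here it just echoes after a delay.
    await asyncio.sleep(0.01)
    return f"done: {task}"

async def parallel_spawn(tasks: list[str]) -> list[str]:
    # Independent tasks run concurrently; results come back in input order.
    return await asyncio.gather(*(spawn_agent(t) for t in tasks))

results = asyncio.run(parallel_spawn(["research competitors", "build landing page"]))
```

Because `gather` preserves input order, synthesizing results afterward stays simple.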

Match the model to the cognitive load (pricing as of April 2026; verify at anthropic.com/pricing):

| Task Type | Model | Cost/M tokens | Timeout | Rationale |
| --- | --- | --- | --- | --- |
| Code generation | Haiku | $0.80 in / $4 out | 15 min | Follows instructions, fast iteration |
| Code review | Sonnet | $3 in / $15 out | 10 min | Needs to understand intent + quality |
| Web scraping | Haiku | $0.80 in / $4 out | 5 min | Pattern matching, data extraction |
| Content writing | Sonnet | $3 in / $15 out | 3 min | Voice, nuance, creativity |
| Strategic analysis | Opus | $15 in / $75 out | 10 min | Complex multi-factor reasoning |
| Data transformation | Haiku | $0.80 in / $4 out | 5 min | Mechanical, well-defined rules |
| Monitoring/alerts | Haiku | $0.80 in / $4 out | 2 min | Simple checks, binary outcomes |
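The matrix reduces to a routing table in code. A sketch; the task-type keys, lowercase model names, and the fallback for unknown types are assumptions:

```python
# Routing table distilled from the matrix above: task type -> (model, timeout
# in minutes). Prices live in the table; only routing data is needed here.
MODEL_ROUTES = {
    "code_generation":     ("haiku",  15),
    "code_review":         ("sonnet", 10),
    "web_scraping":        ("haiku",   5),
    "content_writing":     ("sonnet",  3),
    "strategic_analysis":  ("opus",   10),
    "data_transformation": ("haiku",   5),
    "monitoring":          ("haiku",   2),
}

def route(task_type: str) -> tuple[str, int]:
    # Default to Sonnet with a 10-minute timeout for unknown task types
    # (an assumption, not a rule from the article).
    return MODEL_ROUTES.get(task_type, ("sonnet", 10))
```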

Every spawn needs an explicit timeout. No exceptions.

Timeout = Expected duration × 2

If a coding task should take 7 minutes, set a 15-minute timeout. This provides buffer for retries without allowing infinite loops.

When a timeout fires:

  1. Kill the agent: don't let it keep burning tokens
  2. Assess the situation: was it stuck, or was the task genuinely complex?
  3. Retry with adjustments:
    • Break the task into smaller pieces
    • Provide more specific instructions
    • Try a different model
    • Increase the timeout if the task was legitimately complex
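The kill-assess-retry loop can be sketched as follows, assuming a hypothetical `run(timeout)` callable that raises `TimeoutError` when the agent exceeds its budget:

```python
def run_with_timeout(run, expected_minutes: float, max_retries: int = 2):
    """Sketch of the kill-assess-retry loop; `run` and the doubling policy
    are illustrative assumptions, not a real framework API."""
    timeout = expected_minutes * 2            # Timeout = expected duration x 2
    for _ in range(max_retries + 1):
        try:
            return run(timeout)
        except TimeoutError:
            # The agent was killed by its timeout: retry with a bigger budget.
            timeout *= 2
    raise RuntimeError("task failed after retries; break it into smaller pieces")
```

In practice the retry branch would also refine the instructions or swap models, per the list above.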
Default and maximum timeouts by category:

| Category | Default | Max |
| --- | --- | --- |
| Coding | 15 min | 30 min |
| Research | 5 min | 10 min |
| Writing | 3 min | 5 min |
| Monitoring | 2 min | 3 min |
| Deployment | 10 min | 20 min |

Give sub-agents the minimum context needed for their task:

  • ✅ Specific task description
  • ✅ Relevant file paths
  • ✅ Technical constraints
  • ✅ Expected output format
  • ❌ Full MEMORY.md (security + token waste)
  • ❌ Unrelated project context
  • ❌ Personal information unless needed
  • ❌ Full conversation history
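One way to enforce the checklist is a prompt builder whose signature only admits the four allowed pieces of context. A sketch with hypothetical names; memory, conversation history, and personal data are deliberately not parameters:

```python
def build_task_prompt(task: str, files: list[str], constraints: list[str],
                      output_format: str) -> str:
    """Assemble a minimal sub-agent prompt from only the allowed context."""
    lines = [task]
    if files:
        lines.append("Relevant files: " + ", ".join(files))
    if constraints:
        lines.append("Constraints: " + "; ".join(constraints))
    lines.append("Output format: " + output_format)
    return "\n".join(lines)
```

Anything the function cannot accept, it cannot leak.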
# Good task prompt
"Build a React form component in src/components/ContactForm.tsx.
Fields: name (required), email (required, validated), message (textarea).
Use Tailwind for styling. Run tests after building."
# Bad task prompt
"Here's everything about my life, my projects, my memories...
oh and also build a form."

Running too many agents simultaneously causes:

  • API rate limiting
  • Resource contention
  • Difficult result synthesis
  • Token budget blowouts

Recommended limits:

| Tier | Max Concurrent | Use Case |
| --- | --- | --- |
| Conservative | 2 | Learning, budget-constrained |
| Standard | 5 | Normal operations |
| Aggressive | 10 | Time-critical, budget available |
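A semaphore is a simple way to enforce these caps. A sketch using `asyncio.Semaphore`, with a hypothetical `spawn_agent` stand-in for a real sub-agent call:

```python
import asyncio

MAX_CONCURRENT = 5          # "Standard" tier from the table above

async def spawn_agent(task: str) -> str:
    # Stand-in for a real sub-agent spawn.
    await asyncio.sleep(0.01)
    return f"done: {task}"

async def spawn_limited(tasks: list[str]) -> list[str]:
    # The semaphore caps how many agents run at once, whatever the queue size.
    sem = asyncio.Semaphore(MAX_CONCURRENT)

    async def gated(task: str) -> str:
        async with sem:
            return await spawn_agent(task)

    return await asyncio.gather(*(gated(t) for t in tasks))
```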
The full decision workflow:

1. Task arrives
2. Classify (< 30 s? > 5 min? Parallel?)
3. Select model + timeout
4. Prepare minimal context
5. Spawn agent(s)
6. Wait for completion
7. VERIFY results (run tests, check output)
8. If broken → iterate (same agent, refined instructions)
9. If passing → deliver to user
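The nine steps can be sketched as a single loop. All callables here are hypothetical hooks, not a real framework API:

```python
def orchestrate(task, classify, spawn, verify, do_locally, max_iterations=3):
    """Sketch of the workflow above. Hypothetical hooks:
    classify(task) -> action, spawn(task) -> result, verify(result) -> bool."""
    action = classify(task)                      # steps 1-2
    if action == "DO_IT_YOURSELF":
        return do_locally(task)
    for _ in range(max_iterations):              # steps 5-8
        result = spawn(task)
        if verify(result):                       # step 7: non-negotiable
            return result                        # step 9: deliver to user
        # Step 8: same agent, refined instructions.
        task = f"{task} (previous attempt failed verification; fix and retry)"
    raise RuntimeError("verification kept failing; escalate to the user")
```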

The verification step is non-negotiable. Read the orchestration guide for why skipping verification is the #1 orchestration failure mode.


About the author: JD Davenport builds AI agent systems at OpenClaw. Follow on LinkedIn for updates on building AI agents for business.