A research pipeline with 3 agents: a Planner that decomposes tasks, a Researcher that gathers information, and a Writer that synthesizes into a final report. Plus a Debate pattern that gets two agents to argue both sides of a question before synthesizing a better answer.
When one agent isn't enough
A single agent has one context window, one set of tools, and one "personality." For complex tasks, that becomes a bottleneck, and splitting the work across agents pays off:
- Long-running tasks fill the context window — earlier reasoning gets summarized away
- Specialists outperform generalists — a writing agent with writing-specific instructions beats a general agent at writing
- Parallelism — multiple workers can run tasks concurrently, faster than sequential steps
- Error isolation — if one worker fails, the orchestrator can retry just that step
Multi-agent systems solve these problems by dividing work across agents, each focused on their specialty.
Research pipeline: Planner + Researcher + Writer
import json

import anthropic

client = anthropic.Anthropic()

# ── Base agent (same loop from Days 1-3) ──────────
def simple_call(system: str, prompt: str, model: str = "claude-sonnet-4-5") -> str:
    """Single-turn call to Claude with a system prompt."""
    resp = client.messages.create(
        model=model,
        max_tokens=2048,
        system=system,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text
# ── Specialized agents ─────────────────────────────
def planner_agent(task: str) -> list[str]:
    """Decomposes a research task into 3-5 specific sub-questions."""
    print("[Planner] Decomposing task...")
    system = """You are a research planner. Break down research tasks into
3-5 specific, focused sub-questions that together answer the main question.
Return a JSON array of strings. Only return the JSON, no other text."""
    result = simple_call(system, f"Task: {task}")
    try:
        return json.loads(result)
    except json.JSONDecodeError:
        # Fallback: treat non-empty lines as sub-questions if JSON parsing fails
        return [line.strip() for line in result.split("\n") if line.strip()][:5]
def researcher_agent(question: str) -> str:
    """Answers a specific research question with detail and examples."""
    print(f"[Researcher] Investigating: {question[:60]}...")
    system = """You are a thorough research analyst. Answer questions with:
- Specific facts and figures when available
- Concrete examples to illustrate points
- Honest assessment of what you know vs what's uncertain
Keep responses focused and evidence-based, 150-250 words."""
    return simple_call(system, question)
def writer_agent(task: str, research: dict) -> str:
    """Synthesizes research into a coherent final report."""
    print("[Writer] Synthesizing final report...")
    research_text = "\n\n".join(
        f"Q: {q}\nA: {a}" for q, a in research.items()
    )
    system = """You are a clear, engaging writer who synthesizes research into readable reports.
Structure: Executive summary (2-3 sentences) → Key findings (bullet points) →
Implications → Conclusion. Write for a knowledgeable but non-specialist audience."""
    prompt = f"""Original task: {task}

Research findings:
{research_text}

Write the final report."""
    return simple_call(system, prompt)
# ── Orchestrator: runs the full pipeline ──────────
def research_pipeline(task: str) -> str:
    print(f"\n=== Research Pipeline ===\nTask: {task}\n")
    # Step 1: Planner breaks task into sub-questions
    questions = planner_agent(task)
    print(f"Sub-questions: {questions}\n")
    # Step 2: Researcher answers each sub-question
    research = {}
    for q in questions:
        research[q] = researcher_agent(q)
    # Step 3: Writer synthesizes everything
    report = writer_agent(task, research)
    return report
# ── Debate pattern: two agents argue, then synthesize
def debate(question: str) -> str:
    print(f"\n=== Debate Pattern ===\nQuestion: {question}\n")
    # Agent A argues FOR
    print("[Agent A] Arguing FOR...")
    for_arg = simple_call(
        "You must argue FOR the proposition. Be specific, cite evidence, be persuasive.",
        f"Argue for: {question}"
    )
    # Agent B argues AGAINST (sees Agent A's argument)
    print("[Agent B] Arguing AGAINST...")
    against_arg = simple_call(
        "You must argue AGAINST the proposition. Critique the FOR argument specifically.",
        f"Proposition: {question}\n\nFOR argument:\n{for_arg}\n\nNow argue against."
    )
    # Synthesizer finds the nuanced truth
    print("[Synthesizer] Finding nuanced answer...")
    synthesis = simple_call(
        "You are a fair judge. Given two opposing arguments, find the nuanced truth. "
        "Acknowledge what's right in each side. Give a balanced, honest conclusion.",
        f"""Question: {question}

FOR:
{for_arg}

AGAINST:
{against_arg}

What is the balanced truth?"""
    )
    return synthesis
# ── Run both patterns ─────────────────────────────
if __name__ == "__main__":
    # Research pipeline
    report = research_pipeline(
        "What are the main challenges companies face when adopting AI in 2024?"
    )
    print("\n=== Final Report ===\n", report)

    # Debate pattern
    answer = debate("Should companies require AI literacy for all employees?")
    print("\n=== Debate Synthesis ===\n", answer)
The key insight: Each agent has a focused system prompt that makes it better at its specific job. The planner is better at decomposition. The researcher is better at thoroughness. The writer is better at clarity. A single general agent does all three adequately; specialists do each one well.
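One practical wrinkle in the planner step: models often wrap JSON in markdown code fences, which makes `json.loads` fail even when the array itself is valid. A minimal sketch of a fence-tolerant parser, assuming the fallback behavior described above; `extract_json_list` is a hypothetical helper name, not part of the pipeline:

```python
import json

def extract_json_list(raw: str, limit: int = 5) -> list[str]:
    """Parse a JSON array from model output, tolerating ```json fences.

    Hypothetical helper: falls back to non-empty lines, like planner_agent.
    """
    text = raw.strip()
    if text.startswith("```"):
        # Drop the opening fence (with optional language tag) and closing fence
        lines = [ln for ln in text.split("\n")
                 if not ln.strip().startswith("```")]
        text = "\n".join(lines)
    try:
        parsed = json.loads(text)
        if isinstance(parsed, list):
            return [str(item) for item in parsed][:limit]
    except json.JSONDecodeError:
        pass
    # Fallback: treat each non-empty line as a sub-question
    return [ln.strip() for ln in text.split("\n") if ln.strip()][:limit]
```

Dropping the whole pipeline into a retry loop is another option, but stripping fences first is cheaper than a second API call.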
When to use each pattern
Orchestrator + Workers
Use when: you have a complex task that can be broken into distinct subtasks, and each subtask benefits from specialization. The research pipeline above is an example. So is: a content pipeline with a researcher, editor, and formatter. A code review system with a security auditor, performance reviewer, and style checker.
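All of these pipelines share one shape: an ordered list of specialists, each handing its output to the next. A minimal sketch of that shape, assuming stages are (name, system prompt) pairs; `run_stages` is a hypothetical helper, and `call` is injected so a stub can stand in for `simple_call`:

```python
def run_stages(stages: list[tuple[str, str]], task: str, call) -> str:
    """Run specialist stages in order; each stage sees the prior stage's output.

    `call(system_prompt, prompt)` is injected so any backend (or a stub) works.
    """
    output = task
    for name, system_prompt in stages:
        print(f"[{name}] working...")
        output = call(system_prompt, output)
    return output

# Usage with a stub in place of a real model call:
stages = [
    ("Researcher", "You research topics thoroughly."),
    ("Editor", "You tighten and correct drafts."),
    ("Formatter", "You format reports as clean markdown."),
]
result = run_stages(stages, "AI adoption challenges",
                    call=lambda system, prompt: prompt + " [processed]")
```

Swapping in new specialists is then just editing the `stages` list, no new orchestration code.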
Debate Pattern
Use when: you need balanced analysis on a question where confirmation bias is a risk. The debate pattern forces argument on both sides before synthesis, which consistently produces more nuanced outputs than a single-agent "think about this carefully" prompt.
Cost consideration: Multi-agent systems make multiple API calls. The research pipeline above makes 1 (planner) + N (researchers) + 1 (writer) calls. For 4 sub-questions, that's 6 API calls. The debate pattern makes 3 calls. Know your call count before deploying at scale.
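The N researcher calls are independent, so they can run concurrently. That cuts wall-clock time, not cost: the call count is unchanged. A sketch using `concurrent.futures`; `fake_researcher` is a stand-in for `researcher_agent` so the example runs without API access, and a real version assumes the underlying client is safe to call from multiple threads:

```python
from concurrent.futures import ThreadPoolExecutor

def fake_researcher(question: str) -> str:
    # Stand-in for researcher_agent; a real version would call the API
    return f"Findings for: {question}"

def parallel_research(questions: list[str], max_workers: int = 4) -> dict[str, str]:
    """Answer all sub-questions concurrently; results keep question order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map preserves input order even though calls finish out of order
        answers = list(pool.map(fake_researcher, questions))
    return dict(zip(questions, answers))
```

Threads suffice here because the work is I/O-bound (waiting on HTTP responses), so the GIL is not a bottleneck.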
Complete before Day 5
- Run the research pipeline with your own topic and read the output
- Run the debate pattern and compare the FOR/AGAINST/synthesis quality vs asking Claude once
- Add a 4th agent: a `fact_checker_agent` that reviews the final report for unsupported claims
- Add parallel execution: use Python's `concurrent.futures` to run researcher calls in parallel
Tomorrow: Production. Error handling, cost controls, logging, and deployment. The final piece.