Day 05 Capstone

Build Your Own Agent Loop

Everything from days 1-4 assembled into a working agentic coder in under 150 lines of Python. Shell tool, search, worktree isolation, four stopping conditions. Run it against any repo and watch it actually work.

~2 hours · Capstone · Hands-on · Claude API key required

Today's Objective

By the end of this lesson you will have a working 150-line Python agent that can be pointed at any git repo, given a natural-language task, and watched as it edits files, runs tests, and stops cleanly. You will own every line. Nothing is hidden.

Everything in this course so far has been primitives. Day 1 gave you the shell tool. Day 2 gave you efficient search. Day 3 gave you parallel execution via git worktrees. Day 4 gave you a loop that stops correctly. Today all four collide into a single Python file that is, functionally, a minimal agentic coder. It will not be as polished as Claude Code, Cursor, or Antigravity — it has no UI, no streaming, no multi-file diff review. But it will have every primitive those polished products are built on, and once you have built one, the polished products stop feeling mysterious.

The target is under 150 lines of Python. The only external dependency is the Anthropic SDK. Everything else is standard library.

01

Prerequisites

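Before starting, make sure these are in place: Python 3, the Anthropic SDK (pip install anthropic), an Anthropic API key, git, and ripgrep (rg) on your PATH.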

Use a worktree. Don't run your new agent against a repo you care about. Create a git worktree (Day 3!) and point the agent at that. If the agent misbehaves, the worktree costs nothing to throw away.
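
If you want to confirm the prerequisites before writing any agent code, a small preflight script can do it. This is a sketch, not part of agent.py: the preflight function and the check names are made up for illustration, and Python 3.8 is an assumed minimum rather than a documented requirement.

```python
import importlib.util
import os
import shutil
import sys


def preflight(env=os.environ) -> dict:
    """Return {check name: bool} for the things agent.py relies on."""
    return {
        # Assumption: 3.8 is used here as a conservative floor.
        "python>=3.8": sys.version_info >= (3, 8),
        # find_spec returns None when a top-level package is not installed.
        "anthropic SDK importable": importlib.util.find_spec("anthropic") is not None,
        "git on PATH": shutil.which("git") is not None,
        "rg (ripgrep) on PATH": shutil.which("rg") is not None,
        "ANTHROPIC_API_KEY set": bool(env.get("ANTHROPIC_API_KEY")),
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"[{'ok' if ok else 'MISSING'}] {name}")
```

Run it once in the shell you plan to use for the agent. Anything marked MISSING will surface later as a less obvious error inside the loop.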
02

The Architecture

Five components, stacked in this order:

  1. Tool definitions — JSON schemas telling Claude what it can do: run bash, read file, edit file, search, signal done.
  2. Tool executors — the Python functions that actually do the work when the model emits a tool call.
  3. Agent loop — the for i in range(max_iters) that calls the model, executes tool calls, and checks stopping conditions.
  4. Stop logic — the four stopping conditions from Day 4 in priority order.
  5. Main — CLI entry point that reads the task from argv and runs the loop.
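
Components 1 to 3 meet in one small data exchange: the model emits a tool_use block, an executor runs, and the result goes back as a tool_result block carrying the same id. The sketch below shows that round trip with plain dicts and no API call. The make_tool_result helper, the fake read_file executor, and the toolu_example id are illustrative assumptions; the SDK hands you objects with the same fields rather than dicts.

```python
import json

# What a tool call from the model carries, written here as a plain dict.
tool_use = {
    "type": "tool_use",
    "id": "toolu_example",          # hypothetical id; the API generates real ones
    "name": "read_file",
    "input": {"path": "README.md"},
}


def make_tool_result(tool_use: dict, executors: dict) -> dict:
    """Run the matching executor and wrap its output as a tool_result block."""
    fn = executors.get(tool_use["name"])
    content = fn(**tool_use["input"]) if fn else f"unknown tool: {tool_use['name']}"
    # tool_use_id ties the result back to the call that asked for it.
    return {"type": "tool_result", "tool_use_id": tool_use["id"], "content": content}


# A stand-in executor so the example runs without touching the disk.
executors = {"read_file": lambda path: f"(contents of {path})"}

result = make_tool_result(tool_use, executors)
print(json.dumps(result, indent=2))
```

The loop's only job is to repeat this exchange until a stopping condition fires.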
03

The 150-Line Agent

Here is the full file. Create it as agent.py. Every line matters. None are there for decoration.

agent.py
Python
#!/usr/bin/env python3
"""A minimal agentic coder in under 150 lines."""
import os, sys, subprocess, json
from anthropic import Anthropic

client = Anthropic()
MODEL  = "claude-sonnet-4-6"
ROOT   = os.path.realpath(".")

# ---- Tool definitions ----
TOOLS = [
    {
        "name": "bash",
        "description": "Run a bash command in the project root. Use for tests, git, installs.",
        "input_schema": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
    {
        "name": "read_file",
        "description": "Read a file. Returns content.",
        "input_schema": {"type": "object", "properties": {"path": {"type": "string"}}, "required": ["path"]},
    },
    {
        "name": "edit_file",
        "description": "Replace old_str with new_str in path. Fails if old_str not unique.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "old_str": {"type": "string"},
                "new_str": {"type": "string"},
            },
            "required": ["path", "old_str", "new_str"],
        },
    },
    {
        "name": "search",
        "description": "Search file contents with ripgrep. Returns matches.",
        "input_schema": {"type": "object", "properties": {"pattern": {"type": "string"}}, "required": ["pattern"]},
    },
    {
        "name": "done",
        "description": "Signal that the task is complete. Include a summary of what was accomplished.",
        "input_schema": {"type": "object", "properties": {"summary": {"type": "string"}}, "required": ["summary"]},
    },
]

# ---- Tool executors ----
DENY = ["rm -rf /", "sudo", "curl | sh"]

def exec_bash(command: str) -> str:
    if any(d in command for d in DENY):
        return f"DENIED: {command}"
    print(f"$ {command}")
    try:
        p = subprocess.run(command, shell=True, cwd=ROOT, capture_output=True, text=True, timeout=120)
        out = (p.stdout + p.stderr)[-20000:]
        return f"exit={p.returncode}\n{out}"
    except subprocess.TimeoutExpired:
        return "TIMEOUT"

def exec_read(path: str) -> str:
    full = os.path.realpath(os.path.join(ROOT, path))
    # Compare against ROOT + separator so a sibling like /repo-evil cannot pass as /repo
    if full != ROOT and not full.startswith(ROOT + os.sep): return "ERROR: path escape"
    try:
        with open(full) as f: return f.read()[:30000]
    except Exception as e: return f"ERROR: {e}"

def exec_edit(path: str, old_str: str, new_str: str) -> str:
    full = os.path.realpath(os.path.join(ROOT, path))
    if full != ROOT and not full.startswith(ROOT + os.sep): return "ERROR: path escape"
    try:
        with open(full) as f: content = f.read()
    except Exception as e: return f"ERROR: {e}"  # report a missing file instead of crashing the loop
    n = content.count(old_str)
    if n != 1:
        return f"ERROR: old_str appears {n} times, must be exactly 1"
    with open(full, "w") as f: f.write(content.replace(old_str, new_str))
    return f"Edited {path}"

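def exec_search(pattern: str) -> str:
    try:
        # -e stops a pattern that begins with "-" from being read as a flag
        p = subprocess.run(["rg", "-n", "-e", pattern, ROOT], capture_output=True, text=True)
    except FileNotFoundError:
        return "ERROR: ripgrep (rg) not found on PATH"
    return "\n".join(p.stdout.split("\n")[:50]) or "(no matches)"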

EXECUTORS = {"bash": exec_bash, "read_file": exec_read, "edit_file": exec_edit, "search": exec_search}

# ---- Agent loop ----
def run_agent(task: str, max_iters: int = 20, max_tokens: int = 100_000) -> dict:
    messages = [{"role": "user", "content": task}]
    tokens = 0

    for i in range(max_iters):
        resp = client.messages.create(
            model=MODEL,
            max_tokens=4096,
            tools=TOOLS,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": resp.content})
        tokens += resp.usage.input_tokens + resp.usage.output_tokens

        # Stop 1: explicit done
        done_blocks = [b for b in resp.content if getattr(b, "type", None) == "tool_use" and b.name == "done"]
        if done_blocks:
            return {"status": "complete", "iters": i + 1, "tokens": tokens, "summary": done_blocks[0].input.get("summary", "")}

        # Stop 3: budget (stop 2 success check omitted — pass test cmd via task instead)
        if tokens >= max_tokens:
            return {"status": "budget", "iters": i + 1, "tokens": tokens}

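        # Execute tool calls
        tool_results = []
        for b in resp.content:
            if getattr(b, "type", None) == "tool_use":
                fn = EXECUTORS.get(b.name)
                try:
                    result = fn(**b.input) if fn else f"unknown tool: {b.name}"
                except Exception as e:  # e.g. the model sent unexpected arguments
                    result = f"ERROR: {e}"
                tool_results.append({"type": "tool_result", "tool_use_id": b.id, "content": result})
        if tool_results:
            messages.append({"role": "user", "content": tool_results})
        else:
            # No tool call this turn: keep the roles alternating and ask for a decision
            messages.append({"role": "user", "content": "Continue using the tools, or call done if the task is complete."})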

    # Stop 4: max iters
    return {"status": "max_iters", "iters": max_iters, "tokens": tokens}

# ---- Main ----
if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: agent.py '<task>'")
        sys.exit(1)
    task = " ".join(sys.argv[1:])
    result = run_agent(task)
    print(json.dumps(result, indent=2))

That is the whole thing: under 150 lines, blank lines and comments included. Every primitive from the first four days is in there.
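
Two small guards carry most of the safety in the file tools: paths must stay inside the project root, and an edit must match exactly one location. The standalone sketch below isolates both so you can poke at them without an API key. The names inside_root and replace_once are illustrative and do not appear in agent.py, and the /tmp/repo paths are hypothetical.

```python
import os


def inside_root(root: str, path: str) -> bool:
    """True if path, resolved relative to root, stays within root."""
    root = os.path.realpath(root)
    full = os.path.realpath(os.path.join(root, path))
    # Compare against root + separator: a bare prefix test would accept
    # a sibling directory such as /tmp/repo-evil for root /tmp/repo.
    return full == root or full.startswith(root + os.sep)


def replace_once(text: str, old: str, new: str) -> str:
    """Replace old with new, refusing unless old occurs exactly once."""
    count = text.count(old)
    if count != 1:
        raise ValueError(f"old string appears {count} times, must be exactly 1")
    return text.replace(old, new)


print(inside_root("/tmp/repo", "src/utils.py"))
print(inside_root("/tmp/repo", "../repo-evil/x.py"))
print(inside_root("/tmp/repo", "/etc/passwd"))
print(replace_once("x = 1\ny = 2\n", "y = 2", "y = 3"))
```

The exactly-once rule is what forces the model to quote enough surrounding context to identify a single location, which is why the error message reports the count back to it.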

04

Run It

Export your Anthropic API key in your shell, then run the agent from inside a fresh worktree:

run.sh
Bash
export ANTHROPIC_API_KEY="sk-ant-..."

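# Create a worktree to work in, then cd there.
# If the exp/agent-test branch does not exist yet, create it with:
#   git worktree add -b exp/agent-test ../myproject-agent-test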
cd ~/code/myproject
git worktree add ../myproject-agent-test exp/agent-test
cd ../myproject-agent-test

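# Run the agent from inside the worktree; it operates on the current directory.
# If agent.py lives elsewhere, pass its full path to python.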
python agent.py "Add a docstring to every function in src/utils.py"

# Inspect what it did
git diff

You should see the model call tools in sequence: it will probably search for function definitions in src/utils.py, read_file the file, call edit_file once per function to add a docstring, and then call done with a summary. Run git diff afterward to see exactly what it changed.

Start with a safe task. "Add a docstring" is a good first run because the worst case is slightly awkward docstrings — no data loss, no broken tests. Once that works, try something harder: "Fix the failing tests in tests/test_parser.py."
05

What's Missing (On Purpose)

This is a minimal agent. Compared to Claude Code or Cursor, it is missing, among other things: a UI, streaming output, multi-file diff review, and approval prompts.

Each one is a weekend project if you want to add it. The fact that you can name everything missing is proof that you understand what an agentic IDE actually is — a loop, a handful of tools, and good taste about when to stop.

06

Where to Go Next

01

Add MCP support

Wire your agent up to any MCP server. Now it can talk to your database, your issue tracker, your file system abstraction. 50 additional lines, unlocks the entire MCP ecosystem.
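
Part of that wiring is pure data conversion. An MCP server describes each tool with a name, an optional description, and a JSON Schema under inputSchema, while the TOOLS list in agent.py uses input_schema. The sketch below covers only that conversion; connecting to a server and forwarding calls is left out. The mcp_tool_to_anthropic helper and the query_db tool are hypothetical.

```python
def mcp_tool_to_anthropic(tool: dict) -> dict:
    """Convert one MCP tool descriptor into the shape used in TOOLS."""
    return {
        "name": tool["name"],
        "description": tool.get("description", ""),
        # MCP names the schema field inputSchema; fall back to an empty
        # object schema if a server omits it (an assumption for robustness).
        "input_schema": tool.get("inputSchema", {"type": "object", "properties": {}}),
    }


# A made-up descriptor, shaped like one entry of an MCP tools listing.
mcp_tool = {
    "name": "query_db",
    "description": "Run a read-only SQL query.",
    "inputSchema": {
        "type": "object",
        "properties": {"sql": {"type": "string"}},
        "required": ["sql"],
    },
}

converted = mcp_tool_to_anthropic(mcp_tool)
print(converted["name"], sorted(converted))
```

With the converted entries appended to TOOLS, the remaining work is an executor that forwards the call to the server and returns its result as a string.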

02

Add tests-pass stopping

Take a test_cmd argument, run it after each iteration, return success if exit code is 0. Turns the agent into a test-driven coder. 10 lines.

03

Spawn sub-agents in worktrees

Add a spawn_agent tool that creates a git worktree, calls run_agent recursively inside it, and returns the diff. Now you have parallel sub-agents. Very powerful, mildly dangerous — cap recursion depth.
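
One way this could look, sketched under several assumptions: the sub-agent is passed in as a callable run(task, cwd, depth), which means run_agent would need a working-directory parameter it does not have today; worktrees are created under the system temp directory; and MAX_DEPTH of 2 is an arbitrary cap. The sketch runs the child sequentially; running several at once is a separate step.

```python
import os
import subprocess
import tempfile
import uuid

MAX_DEPTH = 2  # assumed cap on nested sub-agents


def git(args, cwd):
    """Run a git subcommand in cwd and return the completed process."""
    return subprocess.run(["git", *args], cwd=cwd, capture_output=True, text=True)


def spawn_agent(task: str, repo: str, run, depth: int = 0) -> str:
    """Run a sub-agent in a throwaway worktree and return its diff."""
    if depth >= MAX_DEPTH:
        return "ERROR: max sub-agent depth reached"
    name = f"agent-{uuid.uuid4().hex[:8]}"
    path = os.path.join(tempfile.gettempdir(), name)
    added = git(["worktree", "add", "-b", name, path], cwd=repo)
    if added.returncode != 0:
        return f"ERROR: {added.stderr.strip()}"
    try:
        run(task, path, depth + 1)       # hypothetical sub-agent entry point
        git(["add", "-A"], cwd=path)     # stage so new files show up in the diff
        return git(["diff", "--cached"], cwd=path).stdout
    finally:
        # Clean up the worktree and its branch whether or not the run succeeded.
        git(["worktree", "remove", "--force", path], cwd=repo)
        git(["branch", "-D", name], cwd=repo)
```

Because the worktree is deleted at the end, the returned diff is the only artifact. The parent agent decides whether to apply it.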

04

Read Cline's source and compare

Clone github.com/cline/cline, find the agent loop file, and diff the shape against what you just built. Almost every line you have has a counterpart in Cline, plus the polish (UI, streaming, approvals) around them.

Course Complete

You Built An Agentic Coder

Five days ago you did not know how Claude Code, Cursor, or Antigravity worked under the hood. Today you have a working agent you wrote yourself. Every primitive in the big tools is now familiar. Nothing about them is magic anymore.
