Starting an agent is easy. Stopping one is the actual production engineering problem. Today you learn the four stopping conditions every serious agentic IDE combines — and why you need all four.
By the end of this lesson you will build an agent loop with all four production stopping conditions, understand why runaway agents are the most common failure mode in shipped agentic products, and know the priority order for combining the four conditions.
A surprisingly common rookie mistake: wiring up a Claude API loop without a stopping condition and letting it run for nine hours on an impossible task, burning fifty dollars in API credits before anyone notices. It is easy to do. You forget to set max_iters, the task turns out to be unreachable, and the model patiently generates variations, runs tests, and generates more variations with complete confidence. The Anthropic dashboard lights up. The lesson sticks permanently.
Every serious production agent has to be built so this story cannot happen, and the only way is to design the loop for termination from the first line. Today's lesson covers how.
Giving an agent a loop is trivial. Call the model, execute its tool calls, feed the results back, repeat. That is maybe twenty lines of Python. What is hard is knowing when you have actually arrived at the destination.
Consider the failure modes of a naive loop with no stopping condition beyond "keep going": it grinds forever on an unreachable task, it burns API credits with nothing to show for them, it declares victory on work it never verified, or it runs until a human notices the dashboard and kills it by hand.
Four stopping conditions solve this, and every serious agentic IDE uses some combination of them. None is sufficient alone. All four are necessary in production.
1. Explicit completion. The agent itself calls a done tool to signal that it has finished. This is the ideal stop: the model decides, not the harness. Claude Code and Cline both prefer this pattern.
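As a concrete sketch, here is what a done tool could look like in the Anthropic tool-definition format. The description wording and the files_changed field are illustrative choices, not from any of the tools named above:

```python
# A `done` tool schema that refuses a bare completion signal: the model
# must describe what it accomplished before it is allowed to stop.
DONE_TOOL = {
    "name": "done",
    "description": (
        "Signal that the task is complete. Before calling this, verify "
        "your work. You must summarize exactly what you accomplished."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "summary": {
                "type": "string",
                "description": "What was accomplished, in concrete terms.",
            },
            "files_changed": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Every file created or modified.",
            },
        },
        "required": ["summary", "files_changed"],
    },
}
```

Making both fields required means a lazy "I'm done" tool call fails schema validation, which nudges the model into actually reviewing its work before stopping.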
2. Success check. If the task is framed as "make the test suite pass," the stop is obvious: run the tests, exit on green. Most real coding tasks can be framed this way, and when they can, this is the most reliable stop signal available.
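A minimal sketch of such a check: shell out to the test runner and treat exit code 0 as green. The tests_pass name and the pytest default are illustrative assumptions, not the lesson's code:

```python
import subprocess

def tests_pass(test_cmd: list[str] = ["pytest", "-q"], timeout: int = 120) -> bool:
    """Run the test suite; exit code 0 means every test passed."""
    try:
        result = subprocess.run(test_cmd, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False  # a hung suite is not a pass
    return result.returncode == 0
```

You would pass this to the loop as `success_check=tests_pass`, so the harness re-runs the suite after every model turn.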
3. Budget cap. A hard ceiling: the agent gets N tokens, M wall-clock seconds, or K tool calls, and when the budget is exhausted the loop halts and reports whatever progress it made. This prevents runaway costs. Every production system has it.
4. Max iterations. The simplest safeguard: after K loop iterations, halt. Not the most elegant condition, but it is a reliable backstop against infinite loops and lets you tune behavior with one knob. Typical values: 10 for simple tasks, 30 for complex ones, 100 for research runs.
Here is a complete agent loop with all four stopping conditions, about 40 lines of Python. You can copy it into a file right now and build on it.
```python
from anthropic import Anthropic

client = Anthropic()

# Shown as names for brevity; the real API expects full
# {"name", "description", "input_schema"} dicts for each tool.
TOOLS = ["bash", "read_file", "edit_file", "done"]

def run_agent(
    task: str,
    max_iters: int = 20,
    max_tokens: int = 100_000,
    success_check=None,
) -> dict:
    messages = [{"role": "user", "content": task}]
    tokens_used = 0

    for i in range(max_iters):
        resp = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=TOOLS,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": resp.content})
        tokens_used += resp.usage.input_tokens + resp.usage.output_tokens

        # STOP 1: explicit completion
        if any(b.type == "tool_use" and b.name == "done" for b in resp.content):
            return {"status": "complete", "iters": i + 1, "tokens": tokens_used}

        # STOP 2: success check (e.g., tests pass)
        if success_check and success_check():
            return {"status": "success", "iters": i + 1, "tokens": tokens_used}

        # STOP 3: budget cap
        if tokens_used >= max_tokens:
            return {"status": "budget_exhausted", "iters": i + 1, "tokens": tokens_used}

        # Execute any tool calls the model issued
        tool_results = []
        for b in resp.content:
            if b.type == "tool_use":
                result = execute_tool(b.name, b.input)  # your dispatcher from Days 1-3
                tool_results.append(
                    {"type": "tool_result", "tool_use_id": b.id, "content": result}
                )
        if tool_results:
            messages.append({"role": "user", "content": tool_results})

    # STOP 4: max iterations
    return {"status": "max_iters", "iters": max_iters, "tokens": tokens_used}
```
This is the same shape Cline uses, the same shape Claude Code uses, the same shape aider uses. Different languages, different tools, different prompts — same loop structure.
The values above are reasonable production defaults, but every task needs its own tuning. A rough guide for sizing limits on a new task: start tight (the iteration counts above are good anchors: 10 simple, 30 complex, 100 research), run the task, and loosen whichever limit fires first.
Which of the four can you safely skip? None. Skipping one is the single most common production bug in agent engineering. Forget the budget cap in particular and you get the nine-hour, fifty-dollar story. It is not a nice-to-have; it is the only thing standing between you and an infinite loop.
The other failure mode is the opposite: the loop stops too early. The model calls done prematurely because it is overconfident, or the success check returns true because the test suite was already passing before the agent touched anything. Both happen in practice.
The mitigations are specific:
For done: make the done tool's schema require the model to summarize what it actually accomplished; forcing a summary forces the model to actually check. For the success check: run git diff and feed the output back to the model, which often catches its own mistakes when shown the actual diff.

Now try it yourself. Copy the loop above into a file called my_agent.py, wire up an Anthropic API key, then deliberately break it in each of the four ways so you see each stopping condition fire.
- Give it an impossible task with a small max_iters and watch it return "max_iters".
- Set max_tokens very low and watch it return "budget_exhausted".
- Pass a success_check that returns True after a few iterations. Watch it return "success" mid-loop.
- Give it a trivial task. Watch it call done and return "complete".

Seeing all four fire on the same code in one session is the single clearest way to internalize how production agent loops terminate. It takes 20 minutes and locks the concept in permanently.
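If you want to watch the iteration backstop fire without spending any API credits, you can swap in a stub model that never calls done. This self-contained sketch mirrors the shape of the loop above; stub_model and run_stub_agent are illustrative names, not part of the lesson's code:

```python
# A stub "model" that never calls `done`, so only the max_iters
# backstop can ever fire.
def stub_model(messages):
    return {"content": [{"type": "text", "text": "still working..."}]}

def run_stub_agent(task: str, max_iters: int = 5) -> dict:
    messages = [{"role": "user", "content": task}]
    for i in range(max_iters):
        resp = stub_model(messages)
        messages.append({"role": "assistant", "content": resp["content"]})
        # STOP 1 would fire here if the stub ever issued a `done` tool call
        if any(b.get("type") == "tool_use" and b.get("name") == "done"
               for b in resp["content"]):
            return {"status": "complete", "iters": i + 1}
        messages.append({"role": "user", "content": "continue"})
    # STOP 4: max iterations
    return {"status": "max_iters", "iters": max_iters}

print(run_stub_agent("impossible task"))  # → {'status': 'max_iters', 'iters': 5}
```

The same trick works for the other three conditions: make the stub emit a done tool call, inflate a fake token counter, or plug in a canned success check.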
Before Day 5, make sure:
- You can explain why max_tokens should never be left unbounded.
- You can explain why an agent-initiated done is preferable to any harness-side stop.

Day 5 is the capstone. You take everything from Days 1-4 — shell tool, search, worktrees, stopping conditions — and assemble them into a working agentic coder of your own, in under 150 lines of Python. You will be able to point it at any repo and watch it actually work.