Every agentic IDE ditched grep and find for ripgrep and fd. Today you learn why that matters for context budgets, how to structure search output so an agent never chokes, and when to reach for syntax-aware matching that regex can't do.
By the end of this lesson you'll know exactly which search tools agentic IDEs use and why, how to bound output size so agents don't burn 40,000 tokens on noise, and when ast-grep beats regex. You'll install ripgrep, fd, and ast-grep locally and run all three against a real codebase.
If Day 1 was about how agents reach the terminal, Day 2 is about what they do once they're there. The single most common tool call in any agentic IDE session is a search: find this function, locate these imports, show me every file that mentions that variable. Do this wrong and you blow the context budget on your first tool call. Do it right and the agent glides through a million-line codebase like it's fifty.
The difference between right and wrong is mostly which binary you call. Every major agentic IDE standardized on the same small set of modern tools. Once you see why, you'll never reach for grep or find in an agent tool again.
Here is the measurement everyone argues about until they run it themselves. On a 500k-line Python monorepo, the same searches were benchmarked with the old and the modern tooling. These are representative wall-clock numbers from a warm cache on a typical modern laptop:
| Task | Old Tool | Modern Tool | Gain |
|---|---|---|---|
| Content search | grep -r "pattern" | rg "pattern" | ~30× |
| File discovery | find . -name "*.py" | fd -e py | ~10× |
| Syntax match | grep "def foo" | ast-grep 'def $NAME' | precision |
| Git-aware | grep -r (no gitignore) | git grep | scoped |
Raw speed is only the surface of the story. The more important difference — for agents specifically — is what gets returned and how.
Installation is one line per platform. On macOS: `brew install ripgrep fd ast-grep`. On Linux: `cargo install ripgrep fd-find ast-grep`, or your distro's package manager. On Windows: `scoop install ripgrep fd ast-grep`, or grab the official GitHub releases.
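A quick sanity check that the install landed, sketched in Python; it only looks for the three binaries on your PATH:

```python
import shutil

# Report which of the three search tools are actually on PATH.
tools = {t: shutil.which(t) for t in ("rg", "fd", "ast-grep")}
for tool, path in tools.items():
    print(f"{tool}: {path or 'NOT FOUND - install it before continuing'}")
```

If any line says NOT FOUND, fix that before running the examples below.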
Ripgrep (rg) is the content-search tool that ate the world. Written in Rust, uses SIMD-accelerated scanning, respects .gitignore by default, and has built-in file type filters. Claude Code wraps it. Cursor uses it. Antigravity uses it. When you see an agentic IDE search your code, rg is what's actually running.
Here are the flags you will reach for every day.
- `rg "pattern"`: basic content search. Respects `.gitignore` automatically and is recursive by default; point it at any directory.
- `rg -l "pattern"`: list only the file names that contain the pattern. Ideal when the agent just needs to know which files, not which lines.
- `rg --type py "pattern"`: built-in file type filters. `py`, `js`, `tsx`, `go`, `rust`, `java`, and dozens more, without regex file globbing.
- `rg -C 3 "pattern"`: include 3 lines of context before and after each match. Context is the window that tells humans and models alike what a match means.
- `rg -n "foo|bar"`: regex alternation, with line numbers. The `-n` flag is critical because agents need to know where to edit.
- `rg -m 20 "pattern"`: cap the number of matches per file. Capping output size is absolutely essential for agent tools; never return unbounded output.
Here is a concrete example of how Claude Code's Grep tool calls rg under the hood. The shape is identical across agentic IDEs:
```python
import subprocess


def grep_tool(
    pattern: str,
    path: str = ".",
    file_type: str | None = None,
    context: int = 0,
    head_limit: int = 50,
    output_mode: str = "content",  # content | files | count
) -> str:
    """Wraps rg with flags an agent can actually use."""
    cmd = ["rg"]
    if output_mode == "files":
        cmd.append("-l")
    elif output_mode == "count":
        cmd.append("-c")
    else:
        cmd.append("-n")  # line numbers for content mode
    if file_type:
        cmd.extend(["--type", file_type])
    if context:
        cmd.extend(["-C", str(context)])
    cmd.extend([pattern, path])
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    lines = out.split("\n")[:head_limit]
    return "\n".join(lines)
```
That wrapper is 25 lines and it is more or less what sits inside every agentic IDE. The complexity is not in the search itself. It is in the choice of flags: head_limit to cap size, output_mode to minimize data when the agent only needs file names, file_type to avoid wasted matches in irrelevant languages.
fd is to find what rg is to grep. Rewritten in Rust, defaults that make sense, respects .gitignore, and a syntax that humans can actually remember.
```bash
# All Python files under src/
fd -e py . src/

# All files modified in the last day
fd --changed-within 1d

# All files matching a name pattern
fd "config" --type f

# All directories (not files)
fd --type d

# Exclude a path
fd -e ts --exclude node_modules

# Execute a command on each result
fd -e py --exec wc -l
```
In agent context, fd is usually the first tool an agent calls when it is orienting itself in a new codebase. "What Python files exist?" is the most common question an agent asks before it touches any code, and fd is the fastest way to answer it with structured, git-aware, bounded output.
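For intuition, here is a rough pure-Python approximation of what `fd -e py` does: walk the tree, prune ignored directories, and return a bounded list. This is only a sketch; fd itself is far faster and honors `.gitignore` properly, while this version hard-codes a few common ignores:

```python
import os

# Hypothetical stand-in for .gitignore handling.
IGNORED = {".git", "node_modules", "__pycache__", ".venv"}

def find_files(root: str, ext: str, limit: int = 50) -> list[str]:
    results: list[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in IGNORED]
        for name in sorted(filenames):
            if name.endswith("." + ext):
                results.append(os.path.join(dirpath, name))
                if len(results) >= limit:  # bounded output, always
                    return results
    return results
```

Note the `limit` parameter: even a file listing gets capped before it reaches the model.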
Here is the tool almost nobody knows about and everyone should. ast-grep searches code by its parsed syntax tree, not its raw text. That means you can write patterns that match "any async function" or "any class method that returns a specific type" without writing unreadable regex.
Consider the problem. You want to find every async function in a Python codebase. With regex:
```bash
# Regex: brittle, whitespace-sensitive, matches strings by accident
rg "async def \w+\(" --type py
```
That looks fine until you hit the edge cases: it breaks if the whitespace between `async` and `def` is anything but a single space, it misses signatures formatted in unusual ways, and it happily matches a docstring that merely contains the characters `async def foo(`. Regex doesn't know what code is; it just sees characters.
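You can see the false positive in a few lines of Python. The string below is a hypothetical snippet that contains no code at all, only a docstring:

```python
import re

code = '"""Utility notes. Example signature: async def foo(url). Not real code."""'
# The regex happily "finds" an async function inside a plain string.
match = re.search(r"async def \w+\(", code)
print(match.group(0) if match else "no match")  # async def foo(
```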
ast-grep knows what code is because it actually parses it.
```bash
# Match every async function by structure
ast-grep --pattern 'async def $NAME($$$)' --lang python

# Match every class method that returns a dict
ast-grep --pattern 'def $NAME($$$) -> dict: $$$' --lang python

# Rewrite: replace print statements with logger calls
ast-grep --pattern 'print($ARG)' --rewrite 'logger.info($ARG)' --lang python

# TypeScript: find every useState that starts with a string
ast-grep --pattern 'useState("$STR")' --lang typescript
```
The $NAME, $$$, and $ARG are pattern variables that match any valid syntax tree node at that position. This is what an agent needs when it is doing a refactor. "Find every print call and rewrite it as a logger call" is a one-liner with ast-grep and a nightmare with regex.
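The same idea can be shown with Python's standard-library `ast` module, which is what syntax-aware matching looks like when you write the visitor by hand (ast-grep generalizes this across languages with patterns instead of code):

```python
import ast

# Hypothetical source: one top-level async function, one sync
# function, and one async method inside a class.
source = """
async def fetch(url): ...

def sync_helper(): ...

class Client:
    async def get(self, path): ...
"""

tree = ast.parse(source)
# Walk the parsed tree and keep only async function definitions.
# Decorators, indentation, and docstring contents cannot fool this.
names = [node.name for node in ast.walk(tree)
         if isinstance(node, ast.AsyncFunctionDef)]
print(names)  # ['fetch', 'get']
```

The parser sees structure, not characters, so `sync_helper` is excluded and the method inside the class is found without any regex gymnastics.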
Use rg for finding text: names, strings, comments. Use fd for finding files. Use ast-grep for refactoring code by shape. Together, the three cover 95% of the searches any agentic IDE needs to do.
Here is the subtle thing everyone gets wrong on their first agent. Modern language models have huge context windows — a million tokens in some cases — but every token of search output costs money, latency, and mental attention from the model. Returning 5,000 lines of grep output is not free even when it fits.
Every agentic IDE addresses this with the same pattern: bounded output, with an option to zoom in.
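A minimal sketch of that pattern (the helper name is hypothetical): cap the output, and when you truncate, say how much was cut so the agent knows to narrow its query and zoom in:

```python
def bound_output(lines: list[str], head_limit: int = 50) -> str:
    # Never return unbounded search output to a model. If we truncate,
    # report what was omitted so the agent can refine its pattern.
    if len(lines) <= head_limit:
        return "\n".join(lines)
    omitted = len(lines) - head_limit
    return "\n".join(lines[:head_limit]) + (
        f"\n... [{omitted} more matches omitted; narrow the pattern to see them]"
    )

print(bound_output([f"src/app.py:{i}: match" for i in range(200)], head_limit=3))
```

The truncation marker is the important part: it turns a silent data loss into a signal the model can act on.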
Default to returning just the matching lines, and add `-C 3` context only when the agent explicitly asks for it.

Stop reading. Build this. Takes about 25 minutes.
1. Install the tools: `brew install ripgrep fd ast-grep` (or the equivalent for your OS).
2. Run `rg --type py "def " -l` on a real project. Notice how fast it is and how it respects `.gitignore`.
3. Run `fd -e py --changed-within 7d`. See what you changed this week.
4. Run `ast-grep --pattern 'def $NAME($$$) -> dict'` against the same project. Every function that returns a dict, with zero regex.
5. Copy the `grep_tool` wrapper from Section 2 into a file. Call it with different flags and see how the output changes.
6. Try `head_limit=10` on a broad pattern. Notice how the tool handles truncation.

Once you have run these on your own code, the speed difference becomes visceral. You will not go back.
Before moving to Day 3, make sure you can answer these:
- Why does every search tool need a `head_limit`, and what goes wrong without it?

You can now reach the terminal and search it efficiently. Day 3 shows you how to do the most-requested thing in agentic IDEs: run multiple agents in parallel on the same repository without them colliding. The answer is git worktrees, and you'll learn it in fifteen minutes.