Promptfoo

CLI for prompt testing and eval

Evaluation · Free (OSS)

What It Is

Promptfoo is a CLI-first tool for evaluating prompts across multiple models, running red-team tests, and comparing outputs side-by-side. Its declarative YAML config makes it easy to run the same prompts across OpenAI, Anthropic, Gemini, and local models, then see a comparison table.

How It Works

Define evals in a promptfooconfig.yaml file — list your providers (models), prompts, test cases, and assertions. Run `promptfoo eval` and it executes each prompt against each provider, runs assertions, and outputs a comparison table. Includes built-in red-team capabilities: prompt injection, jailbreaking, bias testing, and adversarial examples. Also has a web UI for exploring results.
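A minimal config might look like the sketch below. The prompt text, provider IDs, and assertion values are illustrative — check the Promptfoo docs for the exact provider strings and assertion types supported by your version:

```yaml
# promptfooconfig.yaml — illustrative sketch
prompts:
  - "Summarize the following text in one sentence: {{text}}"

providers:
  - openai:gpt-4o-mini
  - anthropic:messages:claude-3-5-sonnet-20241022

tests:
  - vars:
      text: "Promptfoo is a CLI-first tool for evaluating prompts."
    assert:
      # simple string check on the output
      - type: contains
        value: "Promptfoo"
      # model-graded check against a natural-language rubric
      - type: llm-rubric
        value: "Is a single, accurate sentence"
```

Running `promptfoo eval` executes every prompt × provider × test combination and prints the comparison table; `promptfoo view` opens the web UI on the results. Red-team scans are driven by the `promptfoo redteam` subcommands (see the docs for the current invocation).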

Pricing Breakdown

Promptfoo CLI: free and open source. Promptfoo Cloud (managed UI + team features): free tier, paid tiers for enterprise.

Who Uses It

Developers doing prompt engineering, security teams red-teaming LLM apps, and anyone comparing models. Popular for A/B testing prompt variations.

Strengths & Weaknesses

✓ Strengths

  • CLI-first workflow
  • Red-team features
  • Provider-agnostic
  • Side-by-side comparison

× Weaknesses

  • Metric library is less deep than DeepEval's
  • Setup is CLI/YAML only; the web UI is for viewing results, not authoring
  • YAML configs can get long as test suites grow

Best Use Cases

  • Prompt A/B testing
  • Red teaming
  • Model comparison
  • CI for prompts
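For the CI use case, a hedged sketch of a GitHub Actions workflow: the file path, action versions, and secret name are assumptions to adapt; the idea is simply to run the eval on every pull request so a failed assertion fails the check (verify the exit-code behavior of `promptfoo eval` for your version):

```yaml
# .github/workflows/prompt-eval.yml — illustrative sketch
name: prompt-eval
on: [pull_request]
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # runs all prompts/providers/tests from the repo's config
      - run: npx promptfoo eval -c promptfooconfig.yaml
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```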

Alternatives

Ragas
Open-source RAG evaluation
DeepEval
Unit-testing framework for LLMs