What It Is

Cohere Embed v3 is Cohere's enterprise-focused embedding model family. Its strongest features are multilingual support (100+ languages in a single model), a compression-friendly design (high tolerance to quantization), and the optional Rerank model, which improves retrieval precision for a small per-search fee.

How It Works

Embed v3 comes in English and multilingual variants, each in a full-size and a smaller "light" version. The models support int8 and binary quantization with minimal quality loss: relative to float32 vectors, int8 reduces storage by 75% and binary by about 97%. For a two-stage retrieval pipeline, you pair Embed (fast initial retrieval of the top 100 candidates) with Rerank (precise reordering of those candidates, of which you keep the top 10-20).
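The storage arithmetic and the two-stage flow described above can be sketched in a few lines. This is a minimal illustration, not Cohere's implementation: it assumes 1024-dimensional vectors, uses a plain dot product for the first stage, and takes the second-stage scorer as a caller-supplied function (`rerank_score` is a hypothetical stand-in for a call to a reranking model). All function names here are made up for the example.

```python
from typing import Callable, List, Sequence

# Bits used to store one embedding dimension at each precision.
BITS_PER_DIM = {"float32": 32, "int8": 8, "binary": 1}


def bytes_per_vector(dims: int, precision: str) -> int:
    """Raw storage for one embedding, ignoring any index overhead."""
    return dims * BITS_PER_DIM[precision] // 8


def storage_saving(precision: str) -> float:
    """Fraction of storage saved relative to float32."""
    return 1 - BITS_PER_DIM[precision] / BITS_PER_DIM["float32"]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def two_stage_search(
    query_vec: Sequence[float],
    doc_vecs: List[Sequence[float]],
    rerank_score: Callable[[int], float],  # hypothetical: doc index -> relevance
    first_stage_k: int = 100,
    final_k: int = 10,
) -> List[int]:
    """Return the indices of the final_k best documents."""
    # Stage 1: cheap vector similarity over the whole corpus.
    by_similarity = sorted(
        range(len(doc_vecs)),
        key=lambda i: dot(query_vec, doc_vecs[i]),
        reverse=True,
    )
    candidates = by_similarity[:first_stage_k]
    # Stage 2: the more expensive scorer only sees the shortlisted candidates.
    reranked = sorted(candidates, key=rerank_score, reverse=True)
    return reranked[:final_k]
```

Under the 1024-dimension assumption, a float32 vector takes 4096 bytes, an int8 vector 1024 bytes, and a binary vector 128 bytes. Note that a document the first stage fails to shortlist can never be recovered by the reranker, which is why the first-stage cutoff is kept generous.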

Pricing Breakdown

Embed v3: $0.10 per million tokens. Rerank v3: $1 per 1,000 searches. Enterprise dedicated deployments are priced separately, with custom SLAs.
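To turn the list prices above into a budget figure, a rough estimator might look like the following. The default rates are simply the numbers quoted in this section, and the function name and workload figures are hypothetical; check current pricing before relying on the result.

```python
def estimate_cost_usd(
    embed_tokens: int,
    rerank_searches: int,
    embed_price_per_million: float = 0.10,   # rate quoted in this section
    rerank_price_per_thousand: float = 1.00,  # rate quoted in this section
) -> dict:
    """Estimate pay-as-you-go spend for embedding and reranking."""
    embed_cost = embed_tokens / 1_000_000 * embed_price_per_million
    rerank_cost = rerank_searches / 1_000 * rerank_price_per_thousand
    return {
        "embed": embed_cost,
        "rerank": rerank_cost,
        "total": embed_cost + rerank_cost,
    }


# Hypothetical workload: embed a 50M-token corpus once, then serve 200,000 searches.
costs = estimate_cost_usd(embed_tokens=50_000_000, rerank_searches=200_000)
print({name: round(value, 2) for name, value in costs.items()})
```

At these rates, embedding the 50M-token corpus costs about $5, while 200,000 reranked searches cost about $200, so in a workload like this the per-search rerank fee, not the embedding, dominates the bill.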

Who Uses It

Enterprise search teams, multilingual applications, and RAG pipelines that care about retrieval precision. It is less common at startups and most established in enterprise RAG.