Together AI

Open-model hosting and fine-tuning

LLM API $0.18-$0.88 per M tokens

What It Is

Together AI hosts one of the largest catalogs of open-source LLMs available: 200+ models across Llama, Mistral, Mixtral, Qwen, DeepSeek, Gemma, and more. A single API key gives you access to everything, with managed fine-tuning, dedicated endpoints, and fast inference.

How It Works

Together's API is OpenAI-compatible. You can switch models by changing a single string in your request. Fine-tuning is managed end-to-end — upload your dataset, pick a base model, and Together handles the training, checkpointing, and serving. Dedicated endpoints give you reserved capacity for predictable latency and throughput. They also offer image generation (FLUX, SDXL) and embedding models alongside LLMs.
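Because the API is OpenAI-compatible, switching models really is a one-string change: the request body is identical except for the `model` field. A minimal sketch of the request shape, assuming Together's published `https://api.together.xyz/v1` base URL (the model names shown are illustrative, not a guaranteed list):

```python
# Sketch: on an OpenAI-compatible API, switching models is a one-string
# change. The base URL is Together's documented OpenAI-compatible
# endpoint; model identifiers here are illustrative assumptions.

TOGETHER_BASE_URL = "https://api.together.xyz/v1"  # assumed base URL

def chat_request(model: str, prompt: str) -> dict:
    """Build the JSON body for a POST to {base_url}/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Same request shape for two different models; only the string differs.
req_70b = chat_request("meta-llama/Llama-3.1-70B-Instruct-Turbo", "Hello")
req_8x22b = chat_request("mistralai/Mixtral-8x22B-Instruct-v0.1", "Hello")
assert req_70b["messages"] == req_8x22b["messages"]
```

With the official `openai` Python client you would pass `base_url=TOGETHER_BASE_URL` and your Together API key when constructing the client, then call `client.chat.completions.create(**chat_request(...))` unchanged.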

Pricing Breakdown

  • Llama 3.1 70B: $0.88 per M tokens (blended)
  • Llama 3.1 8B: $0.18 per M tokens (blended)
  • Mixtral 8x22B: $1.20 per M tokens (blended)
  • Fine-tuning: $3-$20 per M training tokens, depending on base model
  • Dedicated endpoints: $3-$10/hour, depending on GPU type
  • Billing: pay as you go
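At pay-as-you-go rates, cost is simply token volume times the per-million-token price. A quick estimator using the blended rates quoted above (rates are hardcoded for illustration and change over time, so check the live pricing page):

```python
# Rough cost estimator for pay-as-you-go LLM pricing.
# Rates are the blended dollars-per-million-token figures quoted in this
# profile; treat them as illustrative, not current.
RATES_PER_M_TOKENS = {
    "llama-3.1-70b": 0.88,
    "llama-3.1-8b": 0.18,
    "mixtral-8x22b": 1.20,
}

def estimate_cost(model: str, total_tokens: int) -> float:
    """Blended dollar cost for a given token volume."""
    return RATES_PER_M_TOKENS[model] * total_tokens / 1_000_000

# 50M tokens/month on Llama 3.1 8B vs 70B:
small = estimate_cost("llama-3.1-8b", 50_000_000)   # $9.00
large = estimate_cost("llama-3.1-70b", 50_000_000)  # $44.00
```

The roughly 5x price gap between the 8B and 70B models is why many teams prototype on the larger model, then fine-tune the smaller one for production.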

Who Uses It

Pika Labs, Arcee AI, Labelbox, Nomic AI, and hundreds of startups building on open models. A common fallback when Groq doesn't host the model you need.

Strengths & Weaknesses

✓ Strengths

  • Largest open-model catalog (200+ models)
  • Managed end-to-end fine-tuning with strong UX
  • Dedicated endpoints for reserved capacity

× Weaknesses

  • Slower than Groq for real-time inference
  • Less reliable uptime than closed APIs
  • Less specialized than Fireworks

Best Use Cases

  • Open-model deployment
  • Fine-tuning
  • Model evaluation
  • Production inference

Alternatives

Groq
World's fastest LLM inference
Fireworks AI
Fast open-model inference with fine-tuning
Replicate
Run open-source models via API