Pinecone is the most established managed vector database service. Founded in 2019, it's the production default for teams that want vector search without ops overhead. Pinecone's serverless offering (launched 2024) lets you pay per query and storage rather than provisioning fixed capacity, which dramatically reduces costs for bursty workloads.
You push embeddings (from any model — OpenAI, Cohere, Voyage, or self-hosted) into Pinecone via their SDK or REST API. Each vector has an ID, the embedding itself, and optional metadata for filtering. Queries return the top-k most similar vectors by cosine similarity (or dot product / euclidean distance), and metadata filters let you combine vector search with structured constraints in a single query. Pinecone handles sharding, replication, and scaling under the hood.
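The query model described above — records with an ID, an embedding, and metadata, ranked by top-k cosine similarity after a structured filter — can be sketched in plain Python. This is an illustrative toy, not the Pinecone SDK; the record shape mirrors what you'd upsert, but `query` and `index` here are local stand-ins.

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product over the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "index": each record mirrors Pinecone's shape — an ID,
# the embedding values, and optional metadata for filtering.
index = [
    {"id": "doc1", "values": [0.9, 0.1, 0.0], "metadata": {"lang": "en"}},
    {"id": "doc2", "values": [0.0, 1.0, 0.0], "metadata": {"lang": "de"}},
    {"id": "doc3", "values": [0.8, 0.2, 0.1], "metadata": {"lang": "en"}},
]

def query(vector, top_k=2, filter=None):
    # Apply the structured metadata filter first, then rank by similarity.
    candidates = [r for r in index
                  if filter is None
                  or all(r["metadata"].get(k) == v for k, v in filter.items())]
    scored = sorted(candidates,
                    key=lambda r: cosine(vector, r["values"]),
                    reverse=True)
    return [r["id"] for r in scored[:top_k]]

print(query([1.0, 0.0, 0.0], top_k=2, filter={"lang": "en"}))  # ['doc1', 'doc3']
```

The filter-then-rank order matters: restricting candidates before scoring is what lets a vector database combine structured constraints with similarity search without post-filtering away your top results.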
Pricing is usage-based on the serverless tier: $0.33 per 1M write units, $8.25 per 1M read units, and $0.33/GB/month for storage, so small workloads can stay under $10/month. Pod-based pricing starts at $70/month for an s1.x1 pod (roughly 5M vectors), with an enterprise tier above that.
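To make the serverless math concrete, here is a small cost calculator using the rates quoted above; the workload numbers in the example are hypothetical.

```python
# Serverless rates as listed above (USD).
WRITE_RATE = 0.33 / 1_000_000    # per write unit
READ_RATE = 8.25 / 1_000_000     # per read unit
STORAGE_RATE = 0.33              # per GB per month

def monthly_cost(write_units, read_units, storage_gb):
    # Usage-based bill: you pay only for what you write, read, and store.
    return (write_units * WRITE_RATE
            + read_units * READ_RATE
            + storage_gb * STORAGE_RATE)

# Hypothetical bursty workload: 2M writes, 1M reads, 5 GB stored.
print(round(monthly_cost(2_000_000, 1_000_000, 5), 2))  # 10.56
```

Reads dominate the bill at these rates, which is why serverless favors workloads that are write-heavy or query only in bursts rather than sustaining high query volume.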
Adopters include Notion, Gong, CS Disco, Clarabridge, and Automattic, along with thousands of AI startups. It remains the default managed vector DB for teams that want zero ops.