Vector databases are a hot topic in AI. But if you already have PostgreSQL, you don't need a separate vector database. The pgvector extension adds native vector storage and similarity search directly to Postgres.
An embedding is a list of numbers (a vector) that represents the meaning of a piece of text. Texts with similar meanings have vectors that are close together in space. This is how semantic search works — instead of matching keywords, you match meaning.
For example, "How do I cancel my subscription?" and "I want to stop paying for this service" are very different strings but have very similar embeddings — so a semantic search finds them both when you search for subscription cancellation.
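"Close together" can be made concrete with cosine similarity. A minimal sketch using made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions, and the numbers here are invented for illustration):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for real embeddings
cancel_sub = [0.9, 0.1, 0.2]      # "How do I cancel my subscription?"
stop_paying = [0.85, 0.15, 0.25]  # "I want to stop paying for this service"
reset_password = [0.1, 0.9, 0.3]  # "How do I reset my password?"

print(cosine_similarity(cancel_sub, stop_paying))     # close to 1.0
print(cosine_similarity(cancel_sub, reset_password))  # much lower
```

Keyword overlap between the first two strings is near zero, but their vectors point in almost the same direction, which is exactly what the search below exploits.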
# macOS with Homebrew
brew install pgvector
# Ubuntu
sudo apt-get install postgresql-16-pgvector
# Or build from source
git clone https://github.com/pgvector/pgvector.git
cd pgvector && make && sudo make install
-- Enable in your database
CREATE EXTENSION IF NOT EXISTS vector;
-- Verify
SELECT * FROM pg_extension WHERE extname = 'vector';
-- text-embedding-3-small produces 1536-dimensional vectors.
-- Other models produce different sizes, so match VECTOR(N) to your model.
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    source VARCHAR(100),
    embedding VECTOR(1536),
    created_at TIMESTAMP DEFAULT NOW()
);
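A VECTOR column accepts a text literal of the form '[1,2,3]'. If you ever need to build that literal by hand, a tiny helper works; `to_vector_literal` is a name I made up for this sketch, not part of pgvector:

```python
def to_vector_literal(values: list[float]) -> str:
    """Render a list of floats as pgvector's text input format, e.g. '[1.0,2.0]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

# The resulting string can be used anywhere a vector value is expected, e.g.
#   INSERT INTO documents (content, embedding) VALUES ('hi', '[0.25,-1.0,3.5]'::vector)
print(to_vector_literal([0.25, -1.0, 3.5]))
```

The pgvector Python package also ships adapters (for example `pgvector.psycopg2.register_vector`) if you would rather not build strings yourself.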
import os
import psycopg2
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
conn = psycopg2.connect("postgresql://localhost/aiprojects")
cur = conn.cursor()
documents = [
    "How do I reset my password?",
    "What payment methods do you accept?",
    "How do I cancel my subscription?",
    "I want to upgrade my plan",
    "The app is not loading on my phone",
]

for doc in documents:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=doc,
    )
    embedding = response.data[0].embedding  # list of 1536 floats
    cur.execute(
        "INSERT INTO documents (content, embedding) VALUES (%s, %s)",
        (doc, embedding),
    )
conn.commit()
cur.close()
conn.close()
print(f"Stored {len(documents)} documents")
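The loop above makes one API round trip per document. The embeddings endpoint also accepts a list of inputs, so you can cut requests by batching. A sketch under the assumption of a batch size of 100 (arbitrary); `batched` and `embed_batch` are illustrative helpers, not part of the OpenAI SDK:

```python
def batched(items: list, size: int):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def embed_batch(client, texts: list[str]) -> list[list[float]]:
    # One request returns one embedding per input, in the same order
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    return [item.embedding for item in response.data]

# Usage with the connection and documents from above:
# for chunk in batched(documents, 100):
#     for doc, emb in zip(chunk, embed_batch(client, chunk)):
#         cur.execute(
#             "INSERT INTO documents (content, embedding) VALUES (%s, %s)",
#             (doc, emb),
#         )
```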
def semantic_search(query: str, limit: int = 5):
    # Generate embedding for the query
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=query,
    )
    query_embedding = response.data[0].embedding

    conn = psycopg2.connect("postgresql://localhost/aiprojects")
    cur = conn.cursor()

    # <=> is pgvector's cosine distance operator;
    # cosine similarity = 1 - cosine distance
    cur.execute(
        """
        SELECT content, 1 - (embedding <=> %s::vector) AS similarity
        FROM documents
        ORDER BY embedding <=> %s::vector
        LIMIT %s
        """,
        (query_embedding, query_embedding, limit),
    )
    results = cur.fetchall()
    cur.close()
    conn.close()
    return results
# Test
results = semantic_search("I forgot my login credentials")
for content, similarity in results:
    print(f"{similarity:.3f} | {content}")
pgvector ships three distance operators: <=> cosine distance (most common for text), <-> Euclidean distance, and <#> negative inner product (use for normalized vectors). For text similarity, cosine distance is almost always the right choice.

-- HNSW index: fast approximate nearest neighbor search
-- Build this after bulk-loading your data; index creation is much faster that way
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
-- IVFFlat — alternative, faster to build, slightly less accurate
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100); -- lists ≈ sqrt(row count)
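The relationship between the three operators is easy to check in plain Python. This toy sketch just mirrors the math (pgvector's real implementations are in C):

```python
import math

def cosine_distance(a, b):    # what <=> computes
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1 - dot / norms

def l2_distance(a, b):        # what <-> computes
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neg_inner_product(a, b):  # what <#> computes
    return -sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

a = normalize([1.0, 2.0, 3.0])
b = normalize([2.0, 1.0, 0.5])

# For unit-length vectors, cosine distance = 1 + negative inner product,
# so the two operators produce the same ranking
print(cosine_distance(a, b), 1 + neg_inner_product(a, b))
```

pgvector also exposes query-time accuracy/speed knobs: `SET hnsw.ef_search = 100;` for HNSW indexes, `SET ivfflat.probes = 10;` for IVFFlat.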
Exercise: create a faqs table with question, answer, and embedding columns, then write a search_faqs(query) function that returns the top 3 most similar questions.

Recap: pgvector adds the VECTOR(N) column type and distance operators to PostgreSQL. Cosine distance (<=>) is the standard operator for text embeddings.

Next step: build a full RAG (Retrieval Augmented Generation) pipeline: store embeddings of a document, retrieve the most relevant chunks for a user query, then pass those chunks to Claude to generate an answer.
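As a starting point for that RAG pipeline, here is a minimal sketch of the retrieval-to-prompt step. It reuses the semantic_search function defined earlier and assumes the anthropic package is installed; the model name and prompt wording are illustrative choices, not requirements:

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble retrieved chunks and the user's question into one prompt."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

def answer(question: str) -> str:
    import anthropic
    # Retrieve the 3 most relevant chunks, then hand them to the model
    chunks = [content for content, _ in semantic_search(question, limit=3)]
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    message = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=500,
        messages=[{"role": "user", "content": build_prompt(question, chunks)}],
    )
    return message.content[0].text
```

Chunking strategy (how you split documents before embedding them) matters as much as any of this code, and is worth experimenting with.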