Day 4 of 5
⏱ ~60 minutes
PostgreSQL for Developers — Day 4

pgvector: Store and Search AI Embeddings in PostgreSQL

Vector databases are a hot topic in AI. But if you already have PostgreSQL, you don't need a separate vector database. The pgvector extension adds native vector storage and similarity search directly to Postgres.

What Are Embeddings?

An embedding is a list of numbers (a vector) that represents the meaning of a piece of text. Texts with similar meanings have vectors that are close together in space. This is how semantic search works — instead of matching keywords, you match meaning.

For example, "How do I cancel my subscription?" and "I want to stop paying for this service" are very different strings but have very similar embeddings — so a semantic search finds them both when you search for subscription cancellation.
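"Close together" is usually measured with cosine similarity. Here's a minimal sketch using made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions, but the math is identical):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings
cancel_subscription = [0.9, 0.1, 0.2]
stop_paying         = [0.8, 0.2, 0.3]   # similar meaning, nearby vector
reset_password      = [0.1, 0.9, 0.1]   # different meaning, far away

print(cosine_similarity(cancel_subscription, stop_paying))     # high (≈0.98)
print(cosine_similarity(cancel_subscription, reset_password))  # low (≈0.24)
```

The two "cancel my subscription" phrasings score near 1.0 while the unrelated question scores much lower, which is exactly the ranking a semantic search exploits.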

Install pgvector

bash
# macOS with Homebrew
brew install pgvector

# Ubuntu
sudo apt-get install postgresql-16-pgvector

# Or build from source
git clone https://github.com/pgvector/pgvector.git
cd pgvector && make && sudo make install

sql
-- Enable in your database
CREATE EXTENSION IF NOT EXISTS vector;

-- Verify
SELECT * FROM pg_extension WHERE extname = 'vector';

Create a Table With a Vector Column

sql
-- text-embedding-3-small produces 1536-dimensional vectors.
-- Other models differ (Cohere's embed-english-v3.0 produces 1024),
-- so the declared dimension must match the model you use.
CREATE TABLE documents (
    id          SERIAL PRIMARY KEY,
    content     TEXT NOT NULL,
    source      VARCHAR(100),
    embedding   VECTOR(1536),
    created_at  TIMESTAMP DEFAULT NOW()
);

Generate and Store Embeddings

python
import os
import psycopg2
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

conn = psycopg2.connect("postgresql://localhost/aiprojects")
cur = conn.cursor()

documents = [
    "How do I reset my password?",
    "What payment methods do you accept?",
    "How do I cancel my subscription?",
    "I want to upgrade my plan",
    "The app is not loading on my phone",
]

for doc in documents:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=doc
    )
    embedding = response.data[0].embedding  # list of 1536 floats

    cur.execute(
        "INSERT INTO documents (content, embedding) VALUES (%s, %s)",
        (doc, embedding)
    )

conn.commit()
cur.close()
conn.close()
print(f"Stored {len(documents)} documents")

Semantic Search With Cosine Similarity

python
def semantic_search(query: str, limit: int = 5):
    # Generate embedding for the query
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )
    query_embedding = response.data[0].embedding

    conn = psycopg2.connect("postgresql://localhost/aiprojects")
    cur = conn.cursor()

    # Cosine distance: 1 - cosine_similarity
    # <=> is pgvector's cosine distance operator
    cur.execute(
        """
        SELECT content, 1 - (embedding <=> %s::vector) AS similarity
        FROM documents
        ORDER BY embedding <=> %s::vector
        LIMIT %s
        """,
        (query_embedding, query_embedding, limit)
    )
    results = cur.fetchall()
    cur.close()
    conn.close()
    return results

# Test
results = semantic_search("I forgot my login credentials")
for content, similarity in results:
    print(f"{similarity:.3f} | {content}")

ℹ️
The three distance operators: <=> cosine distance (most common for text), <-> Euclidean (L2) distance, <#> negative inner product (for normalized vectors it ranks results identically to cosine but is cheaper to compute). For text embeddings, cosine distance is almost always the right choice.
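To make the operators concrete, here's what each one computes, sketched in plain Python on toy vectors (pgvector does the same math in C inside Postgres):

```python
import math

a = [1.0, 2.0, 3.0]
b = [2.0, 1.0, 3.0]

dot = sum(x * y for x, y in zip(a, b))
norm_a = math.sqrt(sum(x * x for x in a))
norm_b = math.sqrt(sum(x * x for x in b))

cosine_distance = 1 - dot / (norm_a * norm_b)                     # what <=> returns
l2_distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))  # what <-> returns
neg_inner_product = -dot                                          # what <#> returns

print(f"<=> {cosine_distance:.4f}  <-> {l2_distance:.4f}  <#> {neg_inner_product:.1f}")
```

Note that all three are *distances*: smaller means more similar, which is why the search query orders by `embedding <=> query` ascending and reports `1 - distance` as the similarity score.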

Add an Index for Fast Search

sql
-- HNSW: fast approximate nearest-neighbor search.
-- It supports inserts after creation, but building the index
-- after bulk-loading your data is much faster.
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- IVFFlat: alternative index, faster to build, slightly less accurate.
-- Create it only after inserting data (it clusters existing rows into lists).
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);  -- lists ≈ rows / 1000 (use sqrt(rows) beyond ~1M rows)
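The `lists` sizing rule of thumb from the pgvector docs (rows / 1000 up to about a million rows, sqrt(rows) beyond that) is easy to wrap in a helper. A small sketch:

```python
import math

def recommended_lists(row_count: int) -> int:
    """IVFFlat `lists` heuristic: rows/1000 up to ~1M rows, sqrt(rows) after."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return int(math.sqrt(row_count))

print(recommended_lists(100_000))    # 100
print(recommended_lists(4_000_000))  # 2000
```

Recreate the index with a new `lists` value as your table grows; the clustering is computed from the rows present at build time.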

📝 Exercise
Build a Semantic FAQ Search
  1. Install pgvector and enable it in your database.
  2. Create a faqs table with question, answer, and embedding columns.
  3. Write a Python script that generates embeddings for 10 FAQ questions and stores them.
  4. Write a search_faqs(query) function that returns the top 3 most similar questions.
  5. Test it with queries that don't exactly match any stored question but are semantically related.
  6. Add an HNSW index and verify the search still returns correct results.

Lesson Summary

  • pgvector adds a VECTOR(N) column type and distance operators to PostgreSQL.
  • Generate embeddings with an AI API (OpenAI, Cohere, etc.), store them as vectors, query with <=>.
  • HNSW indexes make similarity search fast at scale — add them after inserting data.
  • Cosine distance (<=>) is the standard operator for text embeddings.

Challenge

Build a full RAG (Retrieval Augmented Generation) pipeline: store embeddings of a document, retrieve the most relevant chunks for a user query, then pass those chunks to Claude to generate an answer.
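To get you started, here is a minimal sketch of the retrieval step only, using toy 2-dimensional embeddings held in memory instead of a real embeddings API and the documents table. In the full pipeline you would embed the chunks with your embeddings API, store them in pgvector, and retrieve with `ORDER BY embedding <=> query` as shown earlier; the chunk texts and vectors below are invented for illustration.

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

# (chunk_text, toy_embedding) pairs; a real system stores these in pgvector
chunks = [
    ("Refunds are processed within 5 business days.", [0.9, 0.1]),
    ("You can cancel your plan from the billing page.", [0.1, 0.9]),
    ("Our API rate limit is 100 requests per minute.", [0.5, 0.5]),
]

def retrieve(query_embedding: list[float], k: int = 2) -> list[str]:
    """Return the k chunks closest to the query, nearest first."""
    ranked = sorted(chunks, key=lambda c: cosine_distance(query_embedding, c[1]))
    return [text for text, _ in ranked[:k]]

context = retrieve([0.15, 0.85])  # toy embedding for a cancellation question
prompt = "Answer using only this context:\n" + "\n".join(context)
# ...then send `prompt` plus the user question to Claude for the final answer
print(context[0])  # the cancellation chunk ranks first
```

The generation step is just a chat completion call with `prompt` prepended; the retrieval ordering above is the part pgvector replaces at scale.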

Finished this lesson?