LangChain · Day 3 of 5 · ~90 minutes

RAG with LangChain — Document Loaders, Vector Stores, Retrieval

Build a complete RAG pipeline: load documents, split them into chunks, embed into a vector store, and retrieve relevant context to answer questions from your own data.

What You'll Build Today

A document Q&A system — load a PDF or text file, index it into a vector store, and ask natural language questions that get answered from the document's content. This is one of the most widely deployed LangChain patterns in production.

1. Install

Install RAG dependencies

```bash
pip install langchain langchain-openai langchain-community
pip install chromadb pypdf tiktoken
```
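The `langchain-openai` package reads your API key from the `OPENAI_API_KEY` environment variable. A quick sanity check before running the pipeline (the helper name here is ours, not part of LangChain):

```python
import os

def check_openai_key(env=os.environ):
    """Return True if an OpenAI API key is configured in the environment."""
    return bool(env.get("OPENAI_API_KEY"))

if __name__ == "__main__":
    print("Key configured:", check_openai_key())
```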
2. How RAG Works

The RAG Pipeline — 4 Steps

1. Load documents from files, URLs, or databases.
2. Split them into small chunks (LLMs have context limits).
3. Embed chunks into vectors and store them in a vector database.
4. Retrieve the most relevant chunks at query time and inject them into the prompt.

Why chunks? A 200-page PDF is too big for one context window. Splitting it into 500-token chunks means you can retrieve just the 3-5 most relevant sections for any given question.
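The arithmetic behind "why chunks?" is worth making concrete. A rough sketch, assuming ~3,000 characters per PDF page and the common ~4-characters-per-token rule of thumb (both are estimates, not measurements):

```python
# Back-of-the-envelope token math for a 200-page PDF.
PAGE_CHARS = 3000        # assumed characters per typical page
PAGES = 200
CHARS_PER_TOKEN = 4      # common rule-of-thumb ratio

doc_tokens = PAGE_CHARS * PAGES // CHARS_PER_TOKEN
print(f"Whole document: ~{doc_tokens:,} tokens")  # ~150,000 tokens

# Retrieval sends only a handful of relevant chunks instead:
chunk_tokens = 500
k = 4                    # chunks retrieved per question
print(f"Retrieved context: ~{chunk_tokens * k:,} tokens")  # ~2,000 tokens
```

Retrieving 4 chunks of ~500 tokens sends roughly 2,000 tokens of context per question instead of the whole document — a ~75× reduction under these assumptions.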

3. Build It

Complete RAG Pipeline

```python
# rag_pipeline.py
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# 1. Load a document
loader = TextLoader("my_document.txt")
docs = loader.load()

# 2. Split into chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50
)
chunks = splitter.split_documents(docs)
print(f"Created {len(chunks)} chunks")

# 3. Embed and store in Chroma vector DB
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# 4. Build the RAG chain
model = ChatOpenAI(model="gpt-4o-mini")

prompt = ChatPromptTemplate.from_template("""
Answer the question using only the provided context.
If the answer isn't in the context, say "I don't have that information."

Context:
{context}

Question: {question}
""")

def format_docs(docs):
    return "\n\n".join(d.page_content for d in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt | model | StrOutputParser()
)

# Ask questions
answer = rag_chain.invoke("What are the main topics in this document?")
print(answer)
```

chunk_overlap=50 keeps sentences from being cut off at chunk boundaries. The last 50 characters of each chunk are repeated at the start of the next (RecursiveCharacterTextSplitter measures size in characters by default), preserving context across the split.
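The overlap idea itself is simple enough to sketch in a few lines of plain Python. This toy fixed-size chunker is not LangChain's splitter (which additionally tries to break on paragraph and sentence boundaries), but it shows the sliding-window effect of overlap:

```python
def chunk_with_overlap(text, chunk_size=500, overlap=50):
    """Toy chunker: each chunk starts `chunk_size - overlap` characters
    after the previous one, so consecutive chunks share `overlap` characters."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_with_overlap("abcdefghijklmnopqrstuvwxyz", chunk_size=10, overlap=3)
for c in chunks:
    print(c)
# abcdefghij
# hijklmnopq
# opqrstuvwx
# vwxyz
```

Each chunk's first 3 characters repeat the previous chunk's last 3, so a sentence straddling a boundary survives intact in at least one chunk.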

Day 3 Complete — What You Learned

  • How RAG works: load → split → embed → retrieve
  • Used TextLoader and RecursiveCharacterTextSplitter
  • Embedded chunks into Chroma vector store
  • Built a complete retrieval + generation chain with LCEL
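Under the hood, the "retrieve" step ranks chunks by embedding similarity — typically cosine similarity between the query vector and each chunk vector. A stdlib-only sketch with made-up 3-dimensional embeddings (real OpenAI embeddings have 1,536+ dimensions, and Chroma handles the ranking for you):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings — invented for illustration only.
query = [0.9, 0.1, 0.0]
chunk_vectors = {
    "chunk about retrieval": [0.8, 0.2, 0.1],
    "chunk about cooking":   [0.0, 0.1, 0.9],
}

ranked = sorted(chunk_vectors, key=lambda name: cosine_similarity(query, chunk_vectors[name]), reverse=True)
print(ranked[0])  # chunk about retrieval
```

`as_retriever(search_kwargs={"k": 4})` does exactly this kind of ranking and returns the top 4 chunks.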

Tomorrow: agents and tools

Day 4 shows how to build AI agents that reason, decide which tools to use, and take actions — not just generate text.
