LangChain · Day 2 of 5 · ~75 minutes

Memory and Conversation — Stateful AI Apps

Build a chatbot that remembers context. Learn how conversation history works in LangChain and how to manage memory as conversations grow.

What You'll Build Today

A multi-turn chatbot that maintains conversation history — ask it something, follow up with "what did you just say?" and it knows. A terminal-based chat loop with full context management.

1. How Memory Works

Why LLMs are stateless by default

Every API call to an LLM is independent. The model doesn't remember your last message. To build a chatbot, you have to manually include conversation history in every request. LangChain's memory classes automate this.

In modern LangChain, the cleanest approach is to manage history yourself as a plain list of messages and pass it to the model on every call:

chatbot.py
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage

model = ChatOpenAI(model="gpt-4o-mini")

# Manually manage history
history = [
    SystemMessage(content="You are a helpful AI assistant. Be concise.")
]

def chat(user_input: str) -> str:
    history.append(HumanMessage(content=user_input))
    response = model.invoke(history)
    history.append(AIMessage(content=response.content))
    return response.content

# Chat loop
print("Chat started. Type 'quit' to exit.\n")
while True:
    user = input("You: ")
    if user.lower() == 'quit': break
    print(f"AI: {chat(user)}\n")

Run it and test: ask "What is LangChain?" then follow up with "What framework did you just mention?" — it remembers because the entire history is included in each API call.
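To see the mechanics without an API key, the same loop can be sketched with a stubbed model. Here `fake_invoke` is a stand-in for `model.invoke`, not a real LangChain call; the point is that the full history list is passed on every call and grows by two messages per turn:

```python
# Sketch of the history mechanism with a stubbed model (no API key needed).
# fake_invoke is a stand-in for model.invoke; it just counts user turns.

history = [{"role": "system", "content": "You are a helpful AI assistant."}]

def fake_invoke(messages):
    # A real model would read the whole message list; we just count turns.
    turns = sum(1 for m in messages if m["role"] == "user")
    return {"role": "assistant", "content": f"(reply to turn {turns})"}

def chat(user_input):
    history.append({"role": "user", "content": user_input})
    response = fake_invoke(history)   # the entire history is sent every call
    history.append(response)
    return response["content"]

chat("What is LangChain?")
chat("What framework did you just mention?")

# After two turns: system + 2 user + 2 assistant = 5 messages
print(len(history))  # 5
```

Swap `fake_invoke` for a real `model.invoke` and this is exactly the chatbot.py pattern above.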

2. RunnableWithMessageHistory

LangChain's Built-in Memory Wrapper

For production apps, LangChain provides RunnableWithMessageHistory — it automatically manages history storage per session:

chatbot_lcel.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory

model = ChatOpenAI(model="gpt-4o-mini")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),  # injects history here
    ("human", "{input}")
])

chain = prompt | model

# Store sessions in memory (use Redis/DB in production)
store = {}

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

chain_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history"
)

# Each session_id maintains its own history
config = {"configurable": {"session_id": "user_123"}}

r1 = chain_with_history.invoke({"input": "My name is Bo."}, config=config)
r2 = chain_with_history.invoke({"input": "What's my name?"}, config=config)
print(r2.content)  # "Your name is Bo."

Session IDs let you run multiple conversations independently. In a web app, use the user's ID or session token. In a script, any unique string works.
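The session-store pattern itself is plain Python. As a minimal sketch of what `get_session_history` does, here using lists in place of `ChatMessageHistory` objects (the message tuples are illustrative, not LangChain APIs):

```python
# Minimal sketch of per-session history isolation (illustrative, not LangChain API).

store = {}

def get_session_history(session_id):
    # Create a fresh history the first time a session id is seen.
    if session_id not in store:
        store[session_id] = []
    return store[session_id]

# Two independent sessions: each id gets its own list.
get_session_history("user_123").append(("human", "My name is Bo."))
get_session_history("user_456").append(("human", "My name is Ada."))

print(len(get_session_history("user_123")))  # 1 -- sessions don't leak into each other
```

Because `store` is a module-level dict, it vanishes on restart; this is why the comment in chatbot_lcel.py suggests Redis or a database in production.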

3. Trimming History

Managing Context Window Limits

Long conversations eat into your context window and cost money, so you need to trim history. The simplest strategy is to keep only the most recent messages; LangChain's trim_messages does this for you, by message count or token budget.

from langchain_core.messages import trim_messages
from langchain_core.runnables import RunnablePassthrough

# Keep the most recent messages that fit in a 2000-token budget
trimmer = trim_messages(
    max_tokens=2000,
    strategy="last",          # drop the oldest messages first
    token_counter=model,      # count tokens with the model's own tokenizer
    include_system=True,      # never drop the system message
    allow_partial=False       # don't split a message mid-way
)

# Insert the trimmer into the chain: trim "history" before it reaches the prompt
chain_with_trim = (
    RunnablePassthrough.assign(history=lambda x: trimmer.invoke(x["history"]))
    | prompt | model
)
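trim_messages above works on a token budget; the simpler "last N messages" strategy can be sketched in a few lines of plain Python. `keep_last` is a hypothetical helper for illustration, not a LangChain function:

```python
# Hypothetical "keep last N" trimmer: always retain the system message,
# then only the n most recent non-system messages.

def keep_last(messages, n):
    system = [m for m in messages if m[0] == "system"]
    rest = [m for m in messages if m[0] != "system"]
    return system + rest[-n:]

# Build a 16-message conversation (8 turns) plus a system message.
history = [("system", "Be concise.")]
for i in range(8):
    history.append(("human", f"question {i}"))
    history.append(("ai", f"answer {i}"))

trimmed = keep_last(history, 4)
print(len(trimmed))   # 5: system message + last 4 messages
print(trimmed[1])     # ('human', 'question 6')
```

Token-budget trimming (as in trim_messages) is usually the better default, since message counts say nothing about how much of the context window is actually consumed.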

Day 2 Complete — What You Learned

  • Why LLMs are stateless and how to add memory manually
  • Built a multi-turn chat loop with manual history
  • Used RunnableWithMessageHistory for session-based memory
  • Managed context limits with trim_messages

Tomorrow: RAG — query your documents

Day 3 is the most in-demand LangChain skill — building retrieval-augmented generation pipelines that answer questions from your own documents.
