Today we build an agent with two memory systems: short-term (the conversation history within a session) and long-term (a JSON file that persists facts across sessions), plus a context window manager that summarizes old messages when the history grows too long.
Short-term vs long-term memory
Short-term memory is the conversation history — the messages list we've been building since Day 1. It's in RAM. When the session ends, it's gone.
Long-term memory is a persistent store of facts the agent has learned. When the agent discovers something important — a user preference, a key fact, a result from a previous task — it writes it to a file or database. Next session, it loads that store and starts with that knowledge.
Together, they give the agent both context within a session and continuity across sessions.
Memory-enabled agent
import json
import anthropic
from datetime import datetime
from pathlib import Path

client = anthropic.Anthropic()

MEMORY_FILE = "agent_memory.json"
MAX_MESSAGES = 20  # summarize when history exceeds this
# ── Long-term memory ───────────────────────────────
def load_memory() -> dict:
    if Path(MEMORY_FILE).exists():
        return json.loads(Path(MEMORY_FILE).read_text())
    return {"facts": [], "preferences": {}, "task_history": []}

def save_memory(memory: dict):
    Path(MEMORY_FILE).write_text(json.dumps(memory, indent=2))

def remember_fact(fact: str, category: str = "general") -> str:
    memory = load_memory()
    entry = {
        "fact": fact,
        "category": category,
        "timestamp": datetime.now().isoformat(),
    }
    memory["facts"].append(entry)
    save_memory(memory)
    return f"Remembered: '{fact}' (category: {category})"
def recall_facts(query: str = "") -> str:
    memory = load_memory()
    facts = memory["facts"]
    if query:
        facts = [f for f in facts
                 if query.lower() in f["fact"].lower()
                 or query.lower() in f["category"].lower()]
    if not facts:
        return "No facts found."
    return json.dumps(facts, indent=2)
# ── Context window management ──────────────────────
def summarize_old_messages(messages: list) -> list:
    """When history is long, compress older messages into one summary message."""
    if len(messages) <= MAX_MESSAGES:
        return messages
    # Keep roughly the 10 most recent messages, but only cut at a plain user
    # message: splitting an assistant tool_use from the tool_result that
    # answers it would make the API reject the request.
    cut = len(messages) - 10
    while cut < len(messages) and not (
        messages[cut]["role"] == "user"
        and isinstance(messages[cut]["content"], str)
    ):
        cut += 1
    old, recent = messages[:cut], messages[cut:]
    # Ask a cheap model to summarize the old messages
    summary_text = "\n".join(
        f"{m['role']}: {str(m['content'])[:200]}" for m in old
    )
    summary_resp = client.messages.create(
        model="claude-haiku-4-5",  # cheap model for summarization
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": f"Summarize this conversation history in 3-5 sentences:\n{summary_text}"
        }]
    )
    summary = summary_resp.content[0].text
    # Replace the old messages with a single compact summary
    summary_message = {
        "role": "user",
        "content": f"[Earlier conversation summary]: {summary}"
    }
    return [summary_message] + recent
# ── Tools (extends Day 2) ─────────────────────────
TOOLS = [
    {"name": "remember_fact",
     "description": "Store a fact in long-term memory for future sessions. Use for important info the user shares.",
     "input_schema": {"type": "object",
                      "properties": {"fact": {"type": "string"}, "category": {"type": "string"}},
                      "required": ["fact"]}},
    {"name": "recall_facts",
     "description": "Retrieve facts from long-term memory. Use at session start or when you need to recall past information.",
     "input_schema": {"type": "object",
                      "properties": {"query": {"type": "string"}},
                      "required": []}},
]

def execute_tool(name, inp):
    if name == "remember_fact":
        return remember_fact(inp["fact"], inp.get("category", "general"))
    elif name == "recall_facts":
        return recall_facts(inp.get("query", ""))
    raise ValueError(f"Unknown tool: {name}")
# ── Agent with memory ─────────────────────────────
class MemoryAgent:
    def __init__(self):
        self.messages = []
        self.memory = load_memory()
        # Inject existing long-term memory as context
        if self.memory["facts"]:
            facts_str = "\n".join(f"- {f['fact']}" for f in self.memory["facts"])
            print(f"Loaded {len(self.memory['facts'])} facts from long-term memory.")
            self._system_context = f"You know these facts from previous sessions:\n{facts_str}"
        else:
            self._system_context = "No previous session memory."

    def chat(self, user_input: str, max_steps=10):
        self.messages.append({"role": "user", "content": user_input})
        self.messages = summarize_old_messages(self.messages)
        for _ in range(max_steps):
            resp = client.messages.create(
                model="claude-sonnet-4-5",
                max_tokens=2048,
                system=self._system_context,
                tools=TOOLS,
                messages=self.messages,
            )
            if resp.stop_reason == "end_turn":
                answer = resp.content[0].text
                self.messages.append({"role": "assistant", "content": answer})
                return answer
            results = []
            for b in resp.content:
                if b.type == "tool_use":
                    result = execute_tool(b.name, b.input)
                    results.append({"type": "tool_result", "tool_use_id": b.id, "content": str(result)})
            self.messages += [
                {"role": "assistant", "content": resp.content},
                {"role": "user", "content": results},
            ]
        return "Max steps reached."
# ── Demo: multi-session memory ─────────────────────
if __name__ == "__main__":
    agent = MemoryAgent()

    # Session 1: tell the agent something
    r1 = agent.chat("My name is Alex and I prefer Python over JavaScript.")
    print("Response 1:", r1)

    # Session 2 (a new agent instance simulates a new session)
    agent2 = MemoryAgent()
    r2 = agent2.chat("What's my name and language preference?")
    print("Response 2:", r2)
    # agent2 should answer correctly from long-term memory!
Run it: the first chat stores the name and preference to disk, and agent2 (a fresh instance simulating a new session) loads them from agent_memory.json and answers correctly. Run the script a second time and the facts survive the process restart too. That's long-term memory working.
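After the first chat, agent_memory.json on disk should look roughly like the structure below. The exact fact wording, categories, and timestamps are illustrative here, since the model decides what to store:

```python
import json
from datetime import datetime

# Illustrative contents of agent_memory.json after session 1.
# This shows the shape remember_fact produces, not exact values.
memory = {
    "facts": [
        {
            "fact": "User's name is Alex",
            "category": "personal",
            "timestamp": datetime(2025, 1, 1, 12, 0).isoformat(),
        },
        {
            "fact": "Prefers Python over JavaScript",
            "category": "preferences",
            "timestamp": datetime(2025, 1, 1, 12, 0).isoformat(),
        },
    ],
    "preferences": {},
    "task_history": [],
}
print(json.dumps(memory, indent=2))
```

On the next startup, `MemoryAgent.__init__` folds every entry under "facts" into the system prompt, which is why a brand-new instance already knows the user's name.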
Why context window management matters
Every message you add to the history costs tokens. Claude's context window is 200K tokens, but at roughly $3 per million input tokens for Sonnet, a conversation that has grown to 100K tokens costs about $0.30 in input tokens on every exchange. For production agents running hundreds of tasks per day, this adds up fast.
The summarization approach above keeps costs manageable: once you have more than 20 messages, summarize the oldest ones into a single compact message. You preserve the gist of the conversation without paying for every word.
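The savings are easy to estimate. Assuming roughly $3 per million input tokens (check current Sonnet pricing), compare sending a full 100K-token history on every call with sending ~10 recent messages plus a short summary:

```python
PRICE_PER_INPUT_TOKEN = 3 / 1_000_000  # ~$3/M tokens; verify against current pricing

def exchange_cost(history_tokens: int) -> float:
    """Input-token cost of one API call that carries the full history."""
    return history_tokens * PRICE_PER_INPUT_TOKEN

full = exchange_cost(100_000)   # unsummarized 100K-token history
trimmed = exchange_cost(2_300)  # ~10 recent messages plus a ~300-token summary
print(f"full: ${full:.2f}/exchange, trimmed: ${trimmed:.4f}/exchange")
# full: $0.30/exchange, trimmed: $0.0069/exchange
```

The one-off Haiku call that produces the summary costs a fraction of a cent, so it pays for itself on the very next exchange.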
More sophisticated approaches: For production, consider semantic memory — using embeddings to retrieve only the relevant facts for the current task, rather than loading everything. We use a simpler JSON approach here because it works for 80% of use cases and doesn't require a vector database.
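Here is a sketch of what retrieval-style recall looks like. A real implementation would embed each fact with an embedding model and rank by cosine similarity; in this sketch, plain word overlap stands in for the similarity score so the example needs no vector database or embedding API:

```python
def similarity(query: str, fact: str) -> float:
    # Stand-in for cosine similarity over embeddings: the fraction of
    # query words that also appear in the fact.
    q, f = set(query.lower().split()), set(fact.lower().split())
    return len(q & f) / (len(q) or 1)

def retrieve(query: str, facts: list[str], k: int = 2) -> list[str]:
    """Return the k stored facts most relevant to the current task."""
    return sorted(facts, key=lambda f: similarity(query, f), reverse=True)[:k]

facts = [
    "Alex prefers Python over JavaScript",
    "The deployment server runs Ubuntu 22.04",
    "Alex's project uses PostgreSQL",
]
print(retrieve("which language does Alex prefer", facts, k=1))
```

The payoff is the same either way: instead of injecting every stored fact into the system prompt, you inject only the top-k relevant ones, which keeps the prompt small as memory grows.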
Complete before Day 4
- Run the demo and verify cross-session memory works
- Add a forget_fact(fact_id) tool that removes a specific memory entry
- Add a set_preference(key, value) tool that stores user preferences separately from facts
- Run 3 consecutive sessions; each one should know what the previous ones stored
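If you get stuck on the forget_fact exercise, here is one possible shape. It is a sketch, not the only design: since remember_fact stores no explicit id field, the list index serves as the id here.

```python
import json
from pathlib import Path

MEMORY_FILE = "agent_memory.json"

def load_memory() -> dict:
    if Path(MEMORY_FILE).exists():
        return json.loads(Path(MEMORY_FILE).read_text())
    return {"facts": [], "preferences": {}, "task_history": []}

def save_memory(memory: dict):
    Path(MEMORY_FILE).write_text(json.dumps(memory, indent=2))

def forget_fact(fact_id: int) -> str:
    """Remove the memory entry at index fact_id (the list index acts as
    the id -- a design choice, since entries carry no explicit id field)."""
    memory = load_memory()
    if 0 <= fact_id < len(memory["facts"]):
        removed = memory["facts"].pop(fact_id)
        save_memory(memory)
        return f"Forgot: '{removed['fact']}'"
    return f"No fact with id {fact_id}."
```

To wire it in, add a matching entry to TOOLS (with fact_id as a required integer) and a branch in execute_tool; the recall_facts output then doubles as the way for the model to see which index to forget.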
Tomorrow: Multi-agent systems. One agent directing others. This is where things get interesting.