Day 4 of 5
⏱ ~60 minutes
React + AI — Day 4

RAG With File Uploads: Chat With Your Documents

RAG (Retrieval-Augmented Generation) lets users upload documents and ask questions about them. Today you'll add PDF and text file uploads, extract the content on the server, and use it as context for Claude's responses.

The RAG Pattern

Full vector-search RAG is complex. For a React app handling single documents, a simpler approach works well: extract the document text and include it directly in the system prompt. Claude's 200K-token context window is large enough to hold many reports, contracts, and papers in full, so no retrieval step is needed.
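To get a feel for whether a document will fit, a rough size check can help. The sketch below assumes about 4 characters per token, which is only a rule of thumb for English prose and not an exact tokenizer count. The names estimateTokens and fitsInContext and the reserved-token budget are illustrative assumptions, not part of any library.

```javascript
// Rough heuristic only: about 4 characters per token for English prose.
// Real token counts depend on the tokenizer and on the text itself.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// contextTokens: the model's context window (200K in this lesson).
// reservedTokens: hypothetical budget kept free for chat history and the reply.
function fitsInContext(text, contextTokens = 200000, reservedTokens = 20000) {
  return estimateTokens(text) <= contextTokens - reservedTokens;
}

console.log(estimateTokens('a'.repeat(100000))); // 25000
console.log(fitsInContext('a'.repeat(100000)));  // true
```

Under this estimate, a 100,000-character document uses roughly 25,000 tokens, which leaves plenty of room for the conversation itself.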

Flow: user uploads file → extract text on the server → store in memory → include as context on every subsequent message.

File Upload UI in React

jsx — FileUpload.jsx
import { useState } from 'react';

function FileUpload({ onUpload }) {
  const [isDragging, setIsDragging] = useState(false);
  const [uploadedFile, setUploadedFile] = useState(null);

  async function handleFiles(files) {
    const file = files[0];
    if (!file) return;

    const formData = new FormData();
    formData.append('file', file);

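    try {
      const res = await fetch('/api/upload', {
        method: 'POST',
        body: formData  // no Content-Type header: the browser adds it with the multipart boundary
      });
      // fetch only rejects on network errors, so check the HTTP status too
      if (!res.ok) throw new Error(`Upload failed with status ${res.status}`);
      const data = await res.json();
      setUploadedFile(file.name);
      onUpload(data.documentId);
    } catch (err) {
      console.error(err);
    }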
  }

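  // The class names are placeholders; style the drop zone however you like.
  return (
    <div
      className={isDragging ? 'dropzone dragging' : 'dropzone'}
      onDragOver={(e) => { e.preventDefault(); setIsDragging(true); }}
      onDragLeave={() => setIsDragging(false)}
      onDrop={(e) => { e.preventDefault(); setIsDragging(false); handleFiles(e.dataTransfer.files); }}
    >
      {uploadedFile ? (
        <p>📄 {uploadedFile} — ready to chat</p>
      ) : (
        <p>
          Drop a PDF or text file here, or{' '}
          <input type="file" accept=".pdf,.txt,.md" onChange={(e) => handleFiles(e.target.files)} />
        </p>
      )}
    </div>
  );
}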
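Before sending anything to the server, it can be worth rejecting obviously unsuitable files in the browser. The following is a minimal sketch, assuming a 10 MB limit and a small list of extensions; both values and the validateFile name are hypothetical. It checks the extension, not file.type, because browsers may report an empty type for some files.

```javascript
// Hypothetical limits; adjust them to match what your server accepts.
const MAX_BYTES = 10 * 1024 * 1024; // 10 MB
const ALLOWED_EXTENSIONS = ['.pdf', '.txt', '.md'];

// Accepts a File (or any object with `name` and `size`) and returns
// { ok: true } or { ok: false, error: '...' }.
function validateFile(file) {
  const name = file.name.toLowerCase();
  const dot = name.lastIndexOf('.');
  const ext = dot === -1 ? '' : name.slice(dot);
  if (!ALLOWED_EXTENSIONS.includes(ext)) {
    return { ok: false, error: 'Please upload a PDF, .txt, or .md file.' };
  }
  if (file.size > MAX_BYTES) {
    return { ok: false, error: 'File is larger than 10 MB.' };
  }
  return { ok: true };
}
```

One way to use it is to call validateFile(file) at the top of handleFiles and show the error message without uploading when ok is false. The server should still validate uploads, since client-side checks are easy to bypass.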

Server-Side Text Extraction

bash
npm install multer pdf-parse

javascript — server.js additions
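const crypto = require('crypto');  // for crypto.randomUUID()
const multer = require('multer');
const pdfParse = require('pdf-parse');

// Keep uploads in memory as Buffers instead of writing them to disk
const upload = multer({ storage: multer.memoryStorage() });
const documents = new Map();  // in-memory storage (use DB in production)

app.post('/api/upload', upload.single('file'), async (req, res) => {
  const file = req.file;
  if (!file) {
    // multer leaves req.file undefined when no "file" field was sent
    return res.status(400).json({ error: 'No file uploaded' });
  }

  let text = '';
  try {
    if (file.mimetype === 'application/pdf') {
      const data = await pdfParse(file.buffer);
      text = data.text;
    } else {
      // Treat everything else as plain text
      text = file.buffer.toString('utf-8');
    }
  } catch (err) {
    console.error(err);
    return res.status(422).json({ error: 'Could not extract text from this file' });
  }

  // Truncate to ~100K characters if very large
  const truncated = text.slice(0, 100000);

  const id = crypto.randomUUID();
  documents.set(id, { name: file.originalname, content: truncated });

  res.json({ documentId: id, charCount: truncated.length });
});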
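The upload handler decides how to extract text from file.mimetype, which comes from the Content-Type the client sent for that form part and may be missing or wrong. A small extra check on the file's first bytes can make the PDF branch more robust. This is a sketch under the assumption that the PDF header sits at the very start of the file, which is the normal case; looksLikePdf is a hypothetical helper name.

```javascript
// A PDF file normally begins with the ASCII bytes "%PDF-".
// Takes a Node.js Buffer and returns true when that signature is present.
function looksLikePdf(buffer) {
  return buffer.length >= 5 &&
    buffer.subarray(0, 5).toString('latin1') === '%PDF-';
}
```

In the handler, the condition could then read file.mimetype === 'application/pdf' || looksLikePdf(file.buffer).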

Using the Document as Context

javascript — updated chat endpoint
app.post('/api/chat', async (req, res) => {
  const { messages, documentId } = req.body;

  let system = 'You are a helpful assistant.';

  if (documentId && documents.has(documentId)) {
    const doc = documents.get(documentId);
    system = `You are a helpful assistant. The user has uploaded a document called "${doc.name}".

Use the following document content to answer their questions:

<document>
${doc.content}
</document>

Answer questions based on this document. If the answer isn't in the document, say so clearly.`;
  }

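  try {
    const response = await client.messages.create({
      model: 'claude-3-haiku-20240307',
      max_tokens: 1024,
      system: system,
      messages: messages,
    });

    // The reply is a list of content blocks; the first one holds the text
    res.json({ content: response.content[0].text });
  } catch (err) {
    console.error(err);
    res.status(500).json({ error: 'Chat request failed' });
  }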
});

💡 For documents larger than ~100K characters, use chunking and vector search (pgvector from Day 4 of the PostgreSQL course). For most documents (reports, contracts, papers), the direct-context approach works well and is much simpler to implement.
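As a taste of what the chunking step involves, here is a minimal sketch of fixed-size character chunking with overlap. The chunkText name and the default sizes are assumptions for illustration; embedding the chunks and searching them with pgvector are not shown.

```javascript
// Split text into chunks of at most `size` characters. Each chunk starts
// `size - overlap` characters after the previous one, so neighbouring
// chunks share `overlap` characters of context.
function chunkText(text, size = 1000, overlap = 200) {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new RangeError('Require size > 0 and 0 <= overlap < size');
  }
  const chunks = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this chunk reached the end
  }
  return chunks;
}

console.log(chunkText('abcdefghij', 4, 2)); // [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

The overlap reduces the chance that a sentence relevant to a question is cut in half at a chunk boundary. Splitting on paragraph or sentence boundaries is a common refinement.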
📝 Exercise
Build the Document Chat Feature
  1. Add the FileUpload component to your React app.
  2. Add the /api/upload endpoint with multer and pdf-parse.
  3. Store the document in memory and return a documentId.
  4. Update the chat endpoint to accept a documentId and include the document in the system prompt.
  5. Pass the documentId with every chat request after upload.
  6. Test by uploading a PDF and asking questions about its content.
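
For step 5, one possible shape for the client-side call is sketched below. It assumes the server parses JSON request bodies (for example with express.json()), since the chat endpoint reads req.body. The sendChat name and the fetchImpl parameter are hypothetical; fetchImpl exists only so the helper can be exercised without a network.

```javascript
// Sends the conversation plus the current documentId (or null) to /api/chat
// and resolves with the assistant's reply text.
async function sendChat(messages, documentId, fetchImpl = fetch) {
  const res = await fetchImpl('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ messages, documentId }),
  });
  if (!res.ok) throw new Error(`Chat request failed: ${res.status}`);
  const data = await res.json();
  return data.content; // the endpoint responds with { content: '...' }
}
```

Keeping the documentId returned by onUpload in React state and passing it on every call means the server can rebuild the system prompt for each message.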

Lesson Summary

  • For single-document RAG, inject the document text directly into the system prompt — simple and effective for most use cases.
  • Use multer on the server to handle multipart file uploads.
  • pdf-parse extracts text from PDFs. Plain text and markdown files can be read directly as UTF-8.
  • Pass a documentId with each chat request so the server knows which document to include as context.

Challenge

Add a document preview panel that shows the first 500 characters of the uploaded document. Add a 'Clear Document' button that removes the context so the user can chat without it.
