Branch: MCP & RAG — Raglan AI

01 / The core problem

Why context windows aren't enough

Every AI model has a context window: a limit on how much text it can hold in "working memory" at once. GPT-4 Turbo can hold roughly 128,000 tokens. Claude can hold 200,000. That sounds huge until you realise a single company's document library can run to millions of tokens. A café with three years of supplier emails can easily hold more text than fits in that window.

So the simple question "what did we pay for coffee beans last October?" is unanswerable if you just paste your whole inbox into the AI. The context overflows, and the AI can't see it all.

RAG and MCP are two different solutions to this problem — and together they're what separates a toy chatbot from a genuinely useful agent.

The bottleneck

A model with a 200k token context window can hold roughly 150,000 words — about two full novels. Your business data, emails, documents, and records almost certainly exceed this. RAG solves the "what does it know" problem. MCP solves the "what can it do" problem.
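As a rough back-of-envelope check, the arithmetic above can be sketched in Python. The 0.75 words-per-token ratio is simply the one implied by the figures in the text (200k tokens, about 150,000 words); it is a rule of thumb for English, not an exact conversion. The inbox numbers are invented for illustration, not measurements.

```python
WORDS_PER_TOKEN = 0.75  # assumption: implied by "200k tokens is roughly 150,000 words"

def estimate_tokens(word_count: int) -> int:
    """Estimate how many tokens a given number of English words uses."""
    return round(word_count / WORDS_PER_TOKEN)

def fits_in_context(word_count: int, context_window_tokens: int) -> bool:
    """True if the estimated token count fits inside the context window."""
    return estimate_tokens(word_count) <= context_window_tokens

# Hypothetical café inbox: 3 years x 52 weeks x 40 emails a week x 150 words each.
inbox_words = 3 * 52 * 40 * 150

print("Estimated tokens:", estimate_tokens(inbox_words))
print("Fits in a 200k window:", fits_in_context(inbox_words, 200_000))
```

Under these made-up numbers the inbox is several times larger than a 200k window, which is the gap RAG is meant to close.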

02 / RAG

Retrieval-Augmented Generation

RAG is a technique that gives an AI model access to a large knowledge base — your documents, your data — without needing to fit it all in the context window at once. Instead of loading everything, it searches first, then reads only the relevant parts.

The name describes the process: Retrieval (find the right documents) → Augmented (inject them into the prompt) → Generation (the AI answers using what it found). It's like giving an AI a library card instead of memorising every book.

❓ Your question (user asks) → 🔍 Search (find relevant docs) → 📄 Inject (add to prompt) → 🧠 AI answers (using real data)

The "search" step typically uses vector similarity — your documents are converted into mathematical representations (embeddings) and stored in a vector database. When you ask a question, it's also converted into an embedding, and the database finds the documents whose embeddings are closest to your question. It's semantic search, not keyword search — it finds meaning, not just matching words.
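To make "closest embedding" concrete, here is a minimal sketch of cosine similarity, one common way to compare embeddings. The three-number vectors are hand-made stand-ins: real embedding models produce hundreds or thousands of dimensions, and the values below are assumptions chosen only to show the mechanics.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as "not similar"
    return dot / (norm_a * norm_b)

# Hypothetical stored chunks with invented 3-dimensional embeddings.
stored = {
    "Oat milk is delivered every Tuesday": [0.9, 0.1, 0.0],
    "We are closed on Good Friday": [0.0, 0.2, 0.9],
}

# Imagined embedding of the question "who supplies our oat milk?"
question_embedding = [0.8, 0.2, 0.1]

# Pick the stored chunk whose embedding points in the most similar direction.
best = max(stored, key=lambda text: cosine_similarity(question_embedding, stored[text]))
print(best)
```

A vector database does the same comparison, but with indexes that avoid checking every stored chunk one by one.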

📚

Knowledge base

Any text you want the AI to know: PDFs, emails, notes, menus, product specs, FAQs. Chunks are indexed into a vector database.

🗂️

Vector store

Stores document embeddings for fast semantic search. Common options: Pinecone, Chroma, Supabase pgvector, Qdrant. Many have free tiers.

🔎

Retrieval

The question is embedded and matched against stored chunks. Top N most relevant chunks are pulled out and fed to the model as context.

✍️

Grounded answer

The AI answers from real documents rather than guessing from training data. It can cite its sources, and hallucination typically drops sharply, though it does not disappear.
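The cards above mention "chunks" without showing how a document becomes chunks. One simple approach, sketched here under the assumption that sizes are measured in characters, is a sliding window with a small overlap so that a sentence cut at a boundary still appears whole in one of the chunks. The function name and defaults are illustrative, not taken from any particular library.

```python
def split_into_chunks(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of up to `size` characters.

    Each chunk starts `size - overlap` characters after the previous one,
    so neighbouring chunks share `overlap` characters.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if overlap < 0 or overlap >= size:
        raise ValueError("overlap must be between 0 and size - 1")

    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reached the end of the text
        start += size - overlap
    return chunks

# Tiny example so the overlap is easy to see.
print(split_into_chunks("abcdefghij", size=4, overlap=1))
```

Production pipelines often split on paragraphs or sentences instead of raw characters, but the idea of small, overlapping pieces is the same.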

Raglan use case: café knowledge bot

Upload your menu, supplier list, allergen info, and opening hours as text files. A RAG-powered chatbot can now answer "what's gluten-free on the menu?", "who supplies our oat milk?", or "are you open Good Friday?" from your actual documents rather than from guesswork.

# Conceptual RAG flow (Python-style pseudocode)
# load, split_into_chunks, embed, vector_db and claude are placeholders, not a real library.

# 1. Index your documents (run once)
documents = load("menu.pdf", "suppliers.txt", "faq.md")
chunks = split_into_chunks(documents, size=500)
embeddings = embed(chunks)            # converts text → numbers
vector_db.store(chunks, embeddings)   # saved for later

# 2. Answer a question (run every time)
question = "Is the banana bread vegan?"
q_embedding = embed(question)
relevant_chunks = vector_db.search(q_embedding, top_k=3)

context = "\n\n".join(relevant_chunks)   # join chunks into readable text
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
answer = claude.ask(prompt)           # grounded in real data
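The conceptual flow above relies on placeholder helpers. For readers who want something they can actually run, here is a toy version using only the Python standard library. It makes two loud assumptions: word counts stand in for a real embedding model (so it matches words, not meaning), and the three chunks are invented sample data. It stops at building the prompt; sending that prompt to a model is left out.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: lower-cased word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# 1. Index (run once). These chunks are invented sample data.
chunks = [
    "Banana bread: contains egg and butter, so it is not vegan.",
    "Oat milk is supplied weekly by a local distributor.",
    "Opening hours: 7am to 3pm, seven days a week.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(question: str, top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q_embedding = embed(question)
    ranked = sorted(index, key=lambda item: similarity(q_embedding, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

def build_prompt(question: str, top_k: int = 1) -> str:
    """Inject the retrieved chunks into a prompt for the model."""
    context = "\n".join(retrieve(question, top_k))
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"

# 2. Answer a question (run every time)
prompt = build_prompt("Is the banana bread vegan?")
print(prompt)
```

Swapping `embed` for a real embedding model and `index` for a vector database turns this toy into the pipeline the pseudocode describes.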

03 / MCP

Model Context Protocol

While RAG gives your AI knowledge, MCP gives it hands. The Model Context Protocol — published by Anthropic in November 2024 — is an open standard that lets AI models connect to external tools and data sources in a consistent, interoperable way.

Before MCP, every developer had to build their own custom integration to connect an AI to, say, a Google Calendar or a database. Each integration was a one-off. MCP standardises this — if a service offers an MCP server, any AI that supports MCP can connect to it immediately. It's like a USB standard for AI tools.
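To give a feel for what "standardised" means in practice: MCP messages are JSON-RPC 2.0, and a client asks a server to run a tool with a `tools/call` request. The sketch below only builds such a message; it does not connect to anything. The tool name `create_event` and its arguments are hypothetical, since a real server advertises its own tools (via `tools/list`).

```python
import json

def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request asking an MCP server to run one tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)

# Hypothetical tool and arguments, for illustration only.
request = make_tool_call(
    1,
    "create_event",
    {"title": "Surf lesson", "start": "2026-01-14T10:00"},
)
print(request)
```

Because every server accepts the same message shape, a client written once can talk to any of them; only the tool names and arguments differ.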

🧠

Your AI Agent

Claude, Gemini, GPT-4: any model running in an MCP-compatible app

── MCP connections ──

📅 Google Calendar: check and create events
📊 Google Sheets: read and write rows
🌐 Web search: live search results
📦 Your database: query any SQL/NoSQL
📧 Email / Slack: send messages
🔧 Custom tools: any API you build

When you use Claude Code (covered in another branch), it works through tools: reading files, running terminal commands, searching the web. Those core tools are built in, and MCP is how you plug in more of them, such as a database or calendar server. When Claude Code says "let me check your package.json" and actually reads the file, that's the same tool-use pattern MCP standardises.

MCP is also why AI assistants in 2026 can book things, update spreadsheets, and trigger workflows — not just answer questions. The "tool use" era of AI runs on protocols like MCP.

04 / Choosing the right tool

RAG vs MCP — when to use which

RAG and MCP solve different problems. In practice, agents often use both. Here's how to think about the distinction:

RAG — Retrieval

For reading static knowledge

Document libraries, PDFs, wikis
Knowledge that doesn't change minute-to-minute
Answering questions from your own data
Reducing hallucination with cited sources
Retrieval can work offline once indexed (if the model runs locally too)
Example: "What does our warranty say about water damage?"

MCP — Tools

For taking real-world actions

Live data (weather, bookings, stock)
Creating, updating, or deleting things
Triggering workflows in other systems
Reading data that changes constantly
Requires a live connection to the tool
Example: "Book a lesson for Wednesday 10am"

A real Raglan example: surf school agent

RAG handles: "What's your refund policy?" (read from a static policy doc)
MCP handles: "Book me in for Saturday 9am" (write to the booking Google Sheet)
Both together: an agent that knows your business and can actually do things in it.
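The split above can be sketched as a tiny router. This is a deliberately crude illustration: in a real agent the model itself decides whether to look something up or call a tool, and the keyword list here is an assumption made up for the example, not a recommended technique.

```python
from dataclasses import dataclass

# Assumption: a small illustrative list of words that signal "do something".
ACTION_WORDS = {"book", "cancel", "reschedule", "update", "send"}

@dataclass
class Plan:
    route: str   # "mcp" for actions, "rag" for knowledge lookups
    reason: str

def plan_request(message: str) -> Plan:
    """Decide whether a request needs a tool call (MCP) or a document lookup (RAG)."""
    words = {word.strip(".,?!\"'").lower() for word in message.split()}
    hits = sorted(words & ACTION_WORDS)
    if hits:
        return Plan("mcp", f"action word(s) found: {', '.join(hits)}")
    return Plan("rag", "no action word found, so look it up in the documents")

for message in ["What's your refund policy?", "Book me in for Saturday 9am"]:
    plan = plan_request(message)
    print(f"{message!r} -> {plan.route} ({plan.reason})")
```

The refund question is routed to RAG and the booking request to MCP, mirroring the surf school example.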

05 / Getting started

Where to begin without a developer

You don't need to build RAG or MCP from scratch. The ecosystem has matured fast — here are practical entry points for each level.

1. Try RAG in NotebookLM (free, no code)

Google's NotebookLM lets you upload PDFs and documents, then chat with them using RAG under the hood. Upload your menu, FAQs, or any document and ask questions. It's the fastest way to experience what RAG feels like as a user.

2. Use MCP tools in Claude Code

Claude Code works through tools: it ships with built-in ones for your file system and terminal, and it can connect to MCP servers to reach other systems. Once you add an MCP server to a Claude Code project, you're working with MCP directly.

3. Build a simple RAG pipeline with Supabase

Supabase (free tier) has built-in vector search with pgvector. Upload documents, run the embedding step with an embedding API (for example OpenAI's, or Voyage AI's, which Anthropic's documentation points to since Anthropic has no embedding API of its own), and you have a searchable knowledge base. Plenty of copy-paste tutorials exist for this exact stack.

4. Add MCP servers to your agent

The MCP registry at modelcontextprotocol.io lists hundreds of ready-made MCP servers: Google Drive, GitHub, Slack, databases, and more. Install one, grant it access, and your Claude instance can read and write to those systems.
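For a sense of what "installing" an MCP server involves, many MCP clients read a small JSON config with an `mcpServers` section that says which command starts each server. The sketch below builds one such entry for the reference filesystem server. The folder path is a placeholder, the helper function is made up for the example, and the exact file name and location depend on the client you use, so treat this as an illustration and check the server's own README before copying it.

```python
import json

def mcp_server_entry(command: str, args: list[str]) -> dict:
    """Describe how a client should launch one MCP server (hypothetical helper)."""
    return {"command": command, "args": args}

config = {
    "mcpServers": {
        # "filesystem" is just a label you choose for this server.
        "filesystem": mcp_server_entry(
            "npx",
            # Placeholder path: replace with the folder you want to share.
            ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/shared/folder"],
        )
    }
}

print(json.dumps(config, indent=2))
```

Only share folders you are happy for the agent to read and change; the config is where that boundary is set.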

06 / Your turn

Design your knowledge architecture

The most valuable exercise isn't writing code — it's deciding what your agent needs to know and what it needs to be able to do. Answer these questions as if you were briefing a developer to build it.

Knowledge & Tool Architecture Canvas

Think about a business or project you care about. What would make an AI agent genuinely useful for it?

RAG knowledge base: what documents would you give it?
MCP tools: what real-world actions should it be able to take?
Boundaries: what should it NOT be allowed to do?

Use your answers as a brief if you ever commission someone to build it, or feed them to Claude Code as a starting point.

🧠

You understand RAG and MCP

Two foundational concepts that separate shallow AI use from genuinely powerful agents. Most people who talk about AI have never heard of either.
