🤖 What is an agent, actually
The word "agent" has been used so loosely in AI marketing that it's become nearly meaningless. Let's cut through it.
A chatbot does one thing: you ask, it answers. The conversation ends there. An agent does something more: it receives a goal, breaks it into steps, takes actions (calling tools, reading files, browsing websites), checks the results, and loops until the goal is met. Then it reports back.
The key differences:
- Tools: an agent can use external tools — search, file read/write, APIs, code execution. A chatbot just generates text.
- Memory: an agent can store results from one step and use them in the next. A chatbot forgets between calls.
- Loops: an agent can run the same step repeatedly until a condition is met. A chatbot responds once.
- Autonomy: you give an agent a goal and walk away. You give a chatbot a question and wait for an answer.
You have already used agents. Claude Code, Cursor, and Cline (the tools from Modules 5 and 6) are all agentic coding tools. They read your files, write code, run commands, check results, and loop. The difference between those and what you'll build here is complexity, not kind.
🔬 The anatomy of a simple agent
Every agent, no matter how sophisticated, has the same four basic components: a goal, a thinking step, an action step, and an observation step. Understanding this means you can evaluate any agent framework intelligently rather than being impressed or confused by the marketing.
The loop: Think → Act → Observe result → Think again, until done.
That's it. Everything else (multi-agent systems, memory architectures, tool registries) is elaboration on this loop. The complexity comes from which tools are available in the Act step and how sophisticated the reasoning is in the Think step. The structure is always the same.
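To make the loop concrete, here is a minimal sketch in Python. It is an illustration, not how any particular framework does it: the `think` function is a scripted stand-in for a model call, and the two tools (`word_count`, `shout`) are made-up examples.

```python
from typing import Callable

# Hypothetical tools: plain functions the agent is allowed to call.
def word_count(text: str) -> str:
    return str(len(text.split()))

def shout(text: str) -> str:
    return text.upper()

TOOLS: dict[str, Callable[[str], str]] = {"word_count": word_count, "shout": shout}

def think(goal: str, memory: list[str]) -> tuple[str, str]:
    """Stand-in for the model: choose the next action from what is known so far.

    A real agent would send the goal and memory to an LLM here. This
    scripted version walks through two steps and then stops.
    """
    if not memory:
        return ("word_count", goal)
    if len(memory) == 1:
        return ("shout", goal)
    return ("done", "")

def run_agent(goal: str, max_steps: int = 5) -> list[str]:
    memory: list[str] = []                      # results carried between steps
    for _ in range(max_steps):                  # hard cap so the loop cannot run forever
        action, argument = think(goal, memory)  # Think
        if action == "done":
            break
        result = TOOLS[action](argument)        # Act
        memory.append(f"{action}: {result}")    # Observe
    return memory

print(run_agent("tidy the photo folder"))
# ['word_count: 4', 'shout: TIDY THE PHOTO FOLDER']
```

The `max_steps` cap is a design choice worth copying: an agent that can loop needs a limit it cannot talk its way past.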
🛠️ Your first agent — a research assistant
The simplest useful agent is a research assistant: give it a topic, it searches, reads, synthesises, and returns a structured briefing. No external tools required for the basic version — you can build one right now in any AI chat, using only prompt structure.
Paste the research agent system prompt into Claude or ChatGPT, then follow up with a topic: anything relevant to your work or life. Watch how it structures its response differently from a normal chat answer. That structure (scope, research, synthesise, flag gaps) is the agent loop made visible in a single prompt.
Why this matters: You've just built an agent without writing a single line of code. The "agent" is the structured prompt that forces the model to loop through steps. This is the foundation everything else builds on — including the code-based agents we'll look at next.
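If you are curious how a structured prompt like this could be assembled in code, here is a small sketch. The stage names follow the four steps named above (scope, research, synthesise, flag gaps), but the wording of each instruction is an assumption made for illustration, not the exact prompt from this module.

```python
# Hypothetical stage instructions; only the four stage ideas come from the text.
STAGES = [
    ("SCOPE", "Restate the topic in one sentence and list three questions worth answering."),
    ("RESEARCH", "Answer each question from what you know, noting how confident you are."),
    ("SYNTHESIS", "Combine the answers into a short briefing with headings."),
    ("GAPS", "List what you could not verify or do not know."),
]

def build_research_prompt(stages=STAGES) -> str:
    """Turn the stage list into one numbered system prompt."""
    lines = ["You are a research assistant. For every topic I send, work through these stages in order:"]
    for number, (name, instruction) in enumerate(stages, start=1):
        lines.append(f"{number}. {name}: {instruction}")
    lines.append("Label each stage in your reply. Wait for my topic before starting.")
    return "\n".join(lines)

print(build_research_prompt())
```

Keeping the stages in a list means you can reorder them or add a fifth without rewriting the whole prompt.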
🌐 Connecting agents to the real world
A pure prompt-based agent is powerful but limited — it can only use the knowledge baked into the model. The real capability unlocks when you connect an agent to external tools: the file system, the web, APIs, spreadsheets, email.
Here's what's actually possible right now, without needing to be a developer:
- File system: Claude Code and Cline can read and write files on your computer. Give them a goal ("reorganise this folder of images into subfolders by year") and they'll do it.
- Web search: Perplexity, ChatGPT with search enabled, and Gemini with Google Search can browse the web as part of their response loop.
- Spreadsheets & docs: Gemini is natively connected to Google Sheets and Docs. ChatGPT Plus can read/write Excel. Claude can analyse uploaded spreadsheets.
- APIs: Using Google AI Studio or the Claude API (covered in the Economics branch module), you can build agents that call external APIs — weather data, CRM systems, booking platforms.
- Automation platforms: Tools like Make.com and Zapier let you wire AI into existing workflows without code — an email arrives, AI summarises it and files it, done. Covered deeper in the Automation branch module.
The practical path: Start with the file system (Claude Code, Cline) and web search (Perplexity, Gemini). These are immediately useful and need no setup beyond what you already have. Build from there.
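The "email arrives, AI summarises it and files it" workflow mentioned above can be sketched in a few lines. This is a toy version under stated assumptions: `summarise` and `choose_folder` are placeholders where a real workflow would call a model, and the `Email` and `Inbox` classes are invented for the example rather than taken from Make.com, Zapier, or any mail library.

```python
from dataclasses import dataclass, field

@dataclass
class Email:
    sender: str
    subject: str
    body: str

@dataclass
class Inbox:
    folders: dict[str, list[str]] = field(default_factory=dict)

    def file(self, folder: str, note: str) -> None:
        self.folders.setdefault(folder, []).append(note)

def summarise(email: Email) -> str:
    """Placeholder for a model call: keep only the first sentence of the body."""
    first_sentence = email.body.split(".")[0].strip()
    return f"{email.subject}: {first_sentence}."

def choose_folder(email: Email) -> str:
    """Placeholder routing rule; a real workflow might ask the model to classify."""
    return "invoices" if "invoice" in email.subject.lower() else "general"

def handle_incoming(email: Email, inbox: Inbox) -> str:
    summary = summarise(email)                   # AI summarises it
    inbox.file(choose_folder(email), summary)    # and files it
    return summary

inbox = Inbox()
handle_incoming(Email("supplier@example.com", "Invoice 14", "Payment is due Friday. Thanks."), inbox)
handle_incoming(Email("friend@example.com", "Catch up", "See you at ten. Bring the keys."), inbox)
print(inbox.folders)
```

The trigger, the summary, and the filing are three separate pieces, which is roughly how the no-code platforms present them too.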
🌿 The open-source agent landscape — an honest map
Since early 2023, GitHub has seen an explosion of open-source agent frameworks. Some have attracted enormous attention: tens of thousands of stars, viral demos, breathless write-ups. It's worth knowing what's out there and having a clear-eyed view of what it is and isn't.
- AutoGPT / AgentGPT — some of the earliest viral autonomous agent frameworks. They inspired the whole category. Both have matured but also revealed the fundamental limits of fully autonomous agents: they hallucinate, loop, and need constant supervision.
- OpenHands (formerly OpenDevin) — an open-source AI software developer. Given a task, it can write code, run tests, and iterate. Very impressive in demos. Still requires careful setup and supervision in practice.
- CrewAI / LangGraph — frameworks for building multi-agent systems: multiple AI agents that collaborate, each with a role. Powerful, but complex. Worth understanding conceptually before trying to use.
- Cline — a VS Code extension that adds agentic behaviour to your editor. More stable and practical than most, because it keeps a human in the loop for each action.
The honest take: Most of these frameworks are calling the same underlying models you already have access to — Claude, GPT-4o, Gemini. The framework is a harness, not magic. If you understand the Think → Act loop and can prompt clearly, you already understand what these frameworks are doing internally. They add complexity as much as they add capability. Use them when you have a clear problem they solve better than a direct API call — not because the GitHub star count is impressive.
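One way to see what "a harness" means is to write a tiny one. The sketch below shows the human-in-the-loop idea described for Cline above: every proposed action passes through an approval check before it runs. It is not Cline's actual code. The action strings, `no_deletes`, and `pretend_execute` are hypothetical stand-ins.

```python
from typing import Callable

def run_with_approval(
    proposed_actions: list[str],
    approve: Callable[[str], bool],
    execute: Callable[[str], str],
) -> list[str]:
    """Run each proposed action only if the approver says yes."""
    log: list[str] = []
    for action in proposed_actions:
        if approve(action):
            log.append(f"ran: {execute(action)}")
        else:
            log.append(f"skipped: {action}")
    return log

# Hypothetical stand-ins: in a real tool the actions would come from the model
# and approve() would be a question shown to the person at the keyboard.
actions = ["read notes.txt", "delete notes.txt", "write summary.txt"]

def no_deletes(action: str) -> bool:
    return not action.startswith("delete")

def pretend_execute(action: str) -> str:
    return action.upper()

for line in run_with_approval(actions, no_deletes, pretend_execute):
    print(line)
# ran: READ NOTES.TXT
# skipped: delete notes.txt
# ran: WRITE SUMMARY.TXT
```

The harness itself is about ten lines. The hard part is still the quality of the model's proposals and the judgment of whoever approves them.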
🚀 Where agents are going
The trajectory of AI agents over the next few years is one of the clearest things in an otherwise unpredictable field. Three things are happening simultaneously:
- Longer context windows. Models can now hold entire codebases, books, and legal documents in a single conversation. The "memory" limitation is shrinking fast.
- Better tool use. Models are getting more reliable at calling external tools without errors, hallucinations, or unnecessary steps. Claude and GPT-4o in 2025 are dramatically more reliable tool-users than GPT-4 in 2023.
- Multi-agent coordination. The emerging frontier is AI systems where multiple agents collaborate — one plans, one researches, one writes, one reviews. Each is a specialist. Together they can handle tasks no single model could manage reliably alone.
You don't need to build multi-agent systems right now. But understanding that this is where the technology is heading helps you read the landscape clearly — and not be surprised when something that felt futuristic six months ago is suddenly a standard feature in your IDE.
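Purely to make the plan, research, write, review split concrete, here is a toy pipeline. Each "agent" is an ordinary function standing in for a separately prompted model. The function names and the way they wrap their input are assumptions chosen so the hand-offs are easy to see.

```python
from typing import Callable

Agent = Callable[[str], str]

# Hypothetical specialists: each would be its own model call in a real system.
def planner(task: str) -> str:
    return f"plan({task})"

def researcher(plan: str) -> str:
    return f"notes({plan})"

def writer(notes: str) -> str:
    return f"draft({notes})"

def reviewer(draft: str) -> str:
    return f"reviewed({draft})"

def run_pipeline(task: str, agents: list[Agent]) -> str:
    work = task
    for agent in agents:
        work = agent(work)   # each specialist's output becomes the next one's input
    return work

print(run_pipeline("grant summary", [planner, researcher, writer, reviewer]))
# reviewed(draft(notes(plan(grant summary))))
```

Real multi-agent systems add branching, retries, and shared memory, but the hand-off from one specialist to the next is the core of it.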
The question that matters most: Not "what can AI agents do?" but "what would an agent do for my life or work that I currently spend time on?" That's what the final exercise is for.
- Grant finder: An agent that searches Creative New Zealand, Lion Foundation, and local council funding databases weekly — and emails you a summary of new opportunities relevant to your arts practice.
- Surf report digest: An agent that pulls surf forecasts, tide times, and UV ratings every morning and sends a single "worth paddling out today?" message to your phone.
- Café stocktake assistant: An agent that reads your weekly sales data and tells you which products to reorder, what's slowing down, and what seasonal items to push — before you open on Monday.
- Tourism content generator: An agent that monitors Waikato/Raglan events, then drafts Instagram posts and email newsletter segments for your accommodation or tour business — ready to review and send.
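To show how small the core of one of these ideas can be, here is a sketch of the café stocktake logic. Everything in it is an assumption for illustration: the product names and numbers are made up, and the two rules (reorder when stock is below a week's sales, flag as slowing when sales halve) are simple placeholders you would tune for a real business.

```python
def stocktake(
    sales_this_week: dict[str, int],
    sales_last_week: dict[str, int],
    stock_on_hand: dict[str, int],
) -> dict[str, list[str]]:
    """Flag products to reorder and products whose sales are slowing."""
    reorder, slowing = [], []
    for product, sold in sales_this_week.items():
        # Reorder if current stock would not cover another week like this one.
        if stock_on_hand.get(product, 0) < sold:
            reorder.append(product)
        # Slowing if sales fell to less than half of last week's.
        if sold < 0.5 * sales_last_week.get(product, 0):
            slowing.append(product)
    return {"reorder": sorted(reorder), "slowing": sorted(slowing)}

# Made-up example data.
this_week = {"coffee beans": 12, "oat milk": 30, "iced tea": 4}
last_week = {"coffee beans": 11, "oat milk": 28, "iced tea": 10}
on_hand = {"coffee beans": 20, "oat milk": 10, "iced tea": 15}

print(stocktake(this_week, last_week, on_hand))
# {'reorder': ['oat milk'], 'slowing': ['iced tea']}
```

An agent version would wrap this in the loop: fetch the sales export, run the check, ask a model to word the Monday summary, and send it.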
The research agent system prompt is above in Section 3. Use it now on a topic that's actually relevant to your work or life.
- Open Claude or ChatGPT.
- Paste the research agent system prompt from Section 3. Hit send.
- Follow up with a topic that's genuinely relevant to you — a business question, a local issue, a topic you're teaching, anything.
- Read the structured response. Notice the GAPS section — this is the agent telling you what it doesn't know. That honesty is a feature, not a bug.
- Ask one follow-up question based on the gaps it identified. Watch how the loop continues.
The most useful agent is the one that does a real, recurring task you currently do manually. Use the questions below to plan yours. Your answers are saved locally to this device — they're for you, not for us.
🪞 Before you go — five-minute reflection
Eight modules is a lot of ground. The best way to make it stick is to name it in your own words. Your answers save to this browser only — they go nowhere and are for you.
Take five minutes. No right answers. Just what actually landed for you.
You started with "I've heard people talking about it at the school gate." You're finishing with a working build, a debug loop, a research agent, and a roadmap for the agent that solves a real problem in your life.
That's not nothing. That's actually quite a lot.
The modules will still be here. The tools you've opened still have free tiers. The build you made still exists. The next step is whichever one you actually do — not the most ambitious one, the most doable one.
All eight modules done. 🎉
Kia ora. You know more about AI than most people who work with it every day. Go build something real.
All 16 branch modules are yours to explore. Pick whatever matches where you want to go next.