PHASE 3
19 / 23
🔨

Building a Simple Agent from Scratch

No frameworks needed — build a complete agent in pure Python to understand from the ground up how agents work.

1

What We're Building

A research agent that can autonomously search the web, read pages, and write a report file:

Text — Agent Architecture Overview
┌────────────────────────────────────────────┐
│  INPUT: "Research quantum computing 2025"  │
└────────────────────┬───────────────────────┘
                     │
        ┌────────────▼────────────┐
        │  LLM DECIDES NEXT STEP  │
        │  (ReAct: Thought/Action)│
        └────────────┬────────────┘
                     │
    ┌────────────────┼────────────────┐
    │                │                │
    ▼                ▼                ▼
  SEARCH          READ_PAGE        WRITE_FILE
  web search      read URL           save output
    │                │                │
    └────────────────┼────────────────┘
                     │
         ┌───────────▼───────────┐
         │  FEEDBACK TO LLM      │
         │  (Add to messages)    │
         └───────────┬───────────┘
                     │
            Has goal been reached?
            ├─ Yes → Return answer
            └─ No → Loop back to LLM decision
2

The Agent Loop — Core Structure

The foundation is simple: a while loop that reads model output, parses tool calls, executes them, and feeds results back to the model. Everything else is implementation details.

Python — Minimal Agent Loop
import anthropic

client = anthropic.Anthropic()
TOOLS = {}  # Populated in the next section

def run_agent(user_goal: str) -> str:
    """Run the agent loop until the goal is reached."""
    messages = [
        {"role": "user", "content": user_goal}
    ]

    iteration = 0
    while iteration < 10:
        iteration += 1

        # 1. Ask the model what to do next
        response = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=2048,
            tools=list(TOOLS.values()),
            messages=messages
        )

        # 2. Parse the response
        if response.stop_reason == "tool_use":
            # Append the assistant turn once, then gather every result
            messages.append({
                "role": "assistant",
                "content": response.content
            })

            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    # 3. Execute the tool
                    result = execute_tool(block.name, block.input)

                    # 4. Collect the result for the reply turn
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    })

            # All tool results for a turn go back in one user message
            messages.append({"role": "user", "content": tool_results})
        else:
            # Model finished, return final answer
            return response.content[0].text

    return "Max iterations reached"
💡

Key Pattern: Tool Use Protocol

The agent doesn't call tools directly. It reasons in prose; Claude's tool_use feature identifies which tool to call; our code parses and executes it. The result goes back as a tool_result block. This keeps the LLM's reasoning readable and auditable.
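To make the protocol concrete, here is a hypothetical exchange built from plain dicts. The id and values are illustrative, not real API output, and build_tool_result is a helper named just for this example.

Python — Tool Use Protocol Sketch

```python
# The model's turn contains a tool_use block naming the tool and its input
# (the id "toolu_01" is made up for illustration):
assistant_turn = {
    "role": "assistant",
    "content": [
        {"type": "tool_use", "id": "toolu_01", "name": "web_search",
         "input": {"query": "quantum computing 2025"}},
    ],
}

def build_tool_result(tool_use_block: dict, result: str) -> dict:
    """Wrap a tool's string output in the reply shape the model expects."""
    return {
        "role": "user",
        "content": [
            {"type": "tool_result",
             "tool_use_id": tool_use_block["id"],  # ties result to request
             "content": result},
        ],
    }

reply = build_tool_result(assistant_turn["content"][0], "Top results: ...")
print(reply["content"][0]["tool_use_id"])  # toolu_01
```

The tool_use_id is what lets the model match each result back to the call it made, even when several tools run in one turn.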

3

Defining Tools — JSON Schemas and Implementations

Each tool has two parts: a JSON schema (tells the model what the tool does) and an implementation (does the work).

Python — Tool Definitions
TOOLS = {
    "web_search": {
        "name": "web_search",
        "description": "Search the internet for current information",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query (e.g., 'AI agents 2025')"
                }
            },
            "required": ["query"]
        }
    },
    "read_page": {
        "name": "read_page",
        "description": "Read the text content of a web page",
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {
                    "type": "string",
                    "description": "URL to read (must start with http)"
                }
            },
            "required": ["url"]
        }
    },
    "write_file": {
        "name": "write_file",
        "description": "Save text content to a file",
        "input_schema": {
            "type": "object",
            "properties": {
                "filename": {"type": "string"},
                "content": {"type": "string"}
            },
            "required": ["filename", "content"]
        }
    }
}

Implementations

Python — Tool Implementations
import requests
from bs4 import BeautifulSoup

def web_search(query: str) -> str:
    """Search DuckDuckGo and return top results."""
    try:
        headers = {"User-Agent": "Research Agent"}
        # DuckDuckGo Instant Answer API (free, no key required);
        # passing params lets requests URL-encode the query
        url = "https://api.duckduckgo.com/"
        params = {"q": query, "format": "json"}
        resp = requests.get(url, params=params, headers=headers, timeout=5)
        results = resp.json()

        # "Results" is often empty; fall back to "RelatedTopics"
        items = results.get("Results") or results.get("RelatedTopics", [])
        formatted = "Top results:\n"
        for i, r in enumerate(items[:3]):
            formatted += f"{i+1}. {r.get('Text', 'N/A')}\n"
            formatted += f"   URL: {r.get('FirstURL', 'N/A')}\n"
        return formatted
    except Exception as e:
        return f"Search failed: {str(e)}"

def read_page(url: str) -> str:
    """Read and extract text from a web page."""
    try:
        resp = requests.get(url, timeout=10)
        soup = BeautifulSoup(resp.content, "html.parser")

        # Remove script and style tags
        for tag in soup(["script", "style"]):
            tag.decompose()

        text = soup.get_text(separator="\n")
        return text[:2000]  # Limit to 2000 chars
    except Exception as e:
        return f"Failed to read page: {str(e)}"

def write_file(filename: str, content: str) -> str:
    """Write content to a file."""
    try:
        with open(filename, "w") as f:
            f.write(content)
        return f"Successfully wrote {len(content)} chars to {filename}"
    except Exception as e:
        return f"Write failed: {str(e)}"

def execute_tool(name: str, args: dict) -> str:
    """Execute a tool by name."""
    if name == "web_search":
        return web_search(args["query"])
    elif name == "read_page":
        return read_page(args["url"])
    elif name == "write_file":
        return write_file(args["filename"], args["content"])
    else:
        return f"Unknown tool: {name}"
4

The System Prompt — Agent Behavior

The system prompt shapes how the agent thinks and acts. Include instructions for goal-oriented behavior, tool use guidelines, and error handling.

Text — Agent System Prompt
You are a Research Agent. Your goal is to gather current
information and synthesize it into a comprehensive report.

INSTRUCTIONS:
1. For any research topic, systematically search for
   current information using web_search
2. Read the most promising pages using read_page
3. Synthesize findings into a structured report
4. Save the final report using write_file

GUIDELINES:
- Search multiple sources to get diverse perspectives
- Prioritize recent information (2025 is current)
- When you have enough information, synthesize
  and write the report. Do not over-research.
- Be concise but thorough. Reports should be 800-1200 words.
- If a page fails to read, try another source

SUCCESS CRITERIA:
Your task is complete when you have:
1. Searched for relevant information
2. Read at least 2-3 quality sources
3. Written a comprehensive report and saved it

Always work towards completing the task efficiently.
💎

System Prompt Best Practices for Agents

Be explicit about the goal, tool usage guidelines, and success criteria. Tell the agent when to stop (avoid infinite loops). Provide constraints (e.g., "max 3 searches per topic"). Guide quality (e.g., "prioritize recent information").

5

Adding Guardrails — Safeguards for Production

A production agent needs safeguards: iteration limits, token budgets, tool approval, and error recovery.

Python — Agent with Guardrails
class GuardedAgent:
    def __init__(self):
        self.max_iterations = 10
        self.token_budget = 10000
        self.tokens_used = 0
        self.failed_tools = []

    def run(self, goal: str) -> str:
        messages = [{"role": "user", "content": goal}]

        for iteration in range(self.max_iterations):
            # Check token budget
            if self.tokens_used > self.token_budget:
                return "Token budget exceeded"

            # Call model
            response = client.messages.create(..., messages=messages)

            # Track token usage
            self.tokens_used += response.usage.input_tokens
            self.tokens_used += response.usage.output_tokens

            if response.stop_reason == "tool_use":
                for block in response.content:
                    if block.type == "tool_use":
                        # Check if tool is disabled
                        if block.name in self.failed_tools:
                            result = "Tool disabled due to repeated failures"
                        else:
                            try:
                                result = execute_tool(block.name, block.input)
                            except Exception as e:
                                result = f"Tool error: {str(e)}"
                                self.failed_tools.append(block.name)

                        messages.append({"role": "assistant", "content": ...})
                        messages.append({"role": "user", "content": ...})
            else:
                return response.content[0].text

        return "Max iterations reached"
⚠️

Common Pitfalls

Infinite loops: Always set max_iterations. Cost explosions: Track token usage. Tool cascades: One failing tool causes a chain of failures. Add circuit breakers. Silent failures: Log everything. You'll debug with logs, not hunches.

6

Running the Complete Agent

Here's how to run the agent on a real task:

Python — Running the Agent
if __name__ == "__main__":
    agent = GuardedAgent()

    goal = """Research AI agents in 2025.
Search for:
1. Latest agent frameworks and libraries
2. New agent applications in production
3. Recent breakthroughs or challenges

Synthesize findings into a comprehensive report
and save to 'ai_agents_report_2025.md'."""

    result = agent.run(goal)
    print(f"Agent completed. Tokens used: {agent.tokens_used}")
    print(f"\nFinal output:\n{result}")
💡

Expected Output

The agent will search for "AI agents 2025", read pages, synthesize findings, and save a markdown report. On subsequent iterations, it adds depth by searching for specific frameworks, production use cases, and recent papers. It stops when confident it has answered the research question.

7

Testing & Debugging Tips

Agents are notoriously hard to debug because behavior emerges from the LLM. Here are practical strategies:

📋

Log Everything

Print every tool call, result, and message exchange. You'll understand agent behavior through logs.

🔍

Single Tool First

Test each tool independently before integrating. Verify web_search works, then read_page, then write.

Use Smaller Models

Start with smaller, faster models (Claude 3.5 Sonnet) to iterate quickly. Switch to Opus for final runs.

📊

Measure Success

Define explicit success criteria. Did it search? Did it read pages? Did it save the file? Check each.

Python — Logging Agent Execution
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

def run_agent_with_logging(goal):
    messages = [{"role": "user", "content": goal}]
    logger.info(f"Agent goal: {goal}")

    for i in range(10):
        logger.debug(f"Iteration {i}: {len(messages)} messages in context")

        response = client.messages.create(...)
        logger.debug(f"Response stop_reason: {response.stop_reason}")

        for block in response.content:
            if block.type == "tool_use":
                logger.info(f"Tool call: {block.name} with {block.input}")
                result = execute_tool(block.name, block.input)
                logger.info(f"Tool result: {result[:200]}...")

            elif block.type == "text":
                logger.debug(f"Text response: {block.text[:100]}...")

    logger.info("Agent completed")

Check Your Understanding

Quick Quiz — 4 Questions

1. What does the agent loop do each iteration?

2. Why do tools need both a JSON schema and an implementation?

3. What's the purpose of max_iterations guardrail?

4. How does the agent know when to stop?

Topic 15 Summary

You've now built a production-quality agent from scratch. Key takeaways: The agent loop is simple: ask LLM what to do → execute tool → feed result back → repeat. Tools need two parts: a JSON schema (for the LLM) and an implementation (to do the work). System prompts guide agent behavior — be explicit about goals, tool use, and success criteria. Guardrails are essential: max_iterations, token budgets, circuit breakers, error logging. Debugging agents requires logging everything. You understand agent behavior through its trace. This foundation lets you use frameworks like LangChain or CrewAI with deep understanding.

Next up → Topic 16: Multi-Agent Systems
Now you'll orchestrate multiple specialized agents working together on complex problems.

← Topic 14 Topic 19 of 23 Topic 16 →