How to Build an AI Agent From Scratch: A Practical Walkthrough

TL;DR:

An AI agent = LLM + tool loop; LangGraph makes the loop explicit via a state machine you control
The four building blocks: State (working memory), Nodes (where work happens), Edges (routing logic), and Tools (external capabilities)
A typical research run costs roughly $0.04–$0.08 with Claude Sonnet at current pricing

An AI agent is an LLM that can take actions in a loop, not just respond once. A standard chatbot gets a prompt and returns a completion — one round trip, done. An agent receives a goal, executes an action (search the web, call an API), observes the result, and decides what to do next — until the goal is achieved.

In this walkthrough, you’ll build a working research agent using LangGraph. Given a question, it searches the web and produces a structured JSON summary with key claims and sources. Every step produces working output — no pseudocode, no placeholder functions.

Prerequisites and Stack

You’ll need Python 3.11+, an Anthropic API key (claude-sonnet-4-5 or newer) or OpenAI API key (gpt-4o), and a Tavily API key for web search (free tier: 1,000 searches/month).

We’re using LangGraph because its explicit state model makes agents easier to reason about, debug, and extend. The control flow is visible in your code — no magic happening inside a pre-built agent class.

uv init research-agent && cd research-agent
uv add langgraph langchain-anthropic tavily-python langchain-core python-dotenv

Create a .env file with your ANTHROPIC_API_KEY and TAVILY_API_KEY.

The Four Building Blocks

Before writing code, understand the structure of every LangGraph agent.

State is a typed dictionary flowing through the entire graph — the agent’s working memory for one run. Nodes are Python functions that receive the current state and return an updated state; this is where work happens. Edges define which node runs next; conditional edges implement the “loop until done” logic. Tools are Python functions the LLM can invoke; the LLM generates a structured tool call, your executor node runs it.

Step 1: Define State and the Tool

from typing import Annotated, TypedDict
from langchain_core.messages import BaseMessage
from langchain_core.tools import tool
from langgraph.graph.message import add_messages
from tavily import TavilyClient
from dotenv import load_dotenv
import os

load_dotenv()

class AgentState(TypedDict):
    messages: Annotated[list[BaseMessage], add_messages]
    iteration_count: int

tavily = TavilyClient(api_key=os.getenv("TAVILY_API_KEY"))

@tool
def web_search(query: str) -> str:
    """Search the web for information. Returns a summary and source URLs."""
    results = tavily.search(query=query, search_depth="advanced", max_results=5, include_answer=True)
    output = f"Answer: {results.get('answer', 'No direct answer')}\n\nSources:\n"
    for r in results.get("results", []):
        output += f"- [{r['title']}]({r['url']})\n  {r['content'][:300]}...\n\n"
    return output

The docstring matters — LangGraph passes it to the LLM as the tool description. Be specific about what the tool returns.

Step 2: Build the Agent Node and Routing Logic

from langchain_anthropic import ChatAnthropic
from langchain_core.messages import AIMessage, SystemMessage

model = ChatAnthropic(model="claude-sonnet-4-5-20250514", temperature=0,
                      api_key=os.getenv("ANTHROPIC_API_KEY"))
model_with_tools = model.bind_tools([web_search])

SYSTEM_PROMPT = """You are a research agent. Answer questions using web search.
When done, produce a JSON object with "summary", "key_claims" (list), and "sources" (list of URLs).
Do not guess. Only include claims supported by search results."""

def agent_node(state: AgentState) -> dict:
    messages = state["messages"]
    if not any(isinstance(m, SystemMessage) for m in messages):
        messages = [SystemMessage(content=SYSTEM_PROMPT)] + messages
    response = model_with_tools.invoke(messages)
    return {"messages": [response], "iteration_count": state.get("iteration_count", 0) + 1}

def should_continue(state: AgentState) -> str:
    if state.get("iteration_count", 0) >= 10:
        return "end"
    last = state["messages"][-1]
    if isinstance(last, AIMessage) and last.tool_calls:
        return "tools"
    return "end"

should_continue is the core control loop — it routes to the tool executor if the LLM made tool calls, or exits if it produced a final answer.

Step 3: Wire the Graph

from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
from langgraph.checkpoint.memory import MemorySaver

tool_node = ToolNode([web_search])

workflow = StateGraph(AgentState)
workflow.add_node("agent", agent_node)
workflow.add_node("tools", tool_node)
workflow.set_entry_point("agent")
workflow.add_conditional_edges("agent", should_continue, {"tools": "tools", "end": END})
workflow.add_edge("tools", "agent")

app = workflow.compile(checkpointer=MemorySaver())

The graph shape: [START] → agent → (conditional) → tools → agent → ... → END. Every LangGraph agent follows this structure.

Running It

from langchain_core.messages import HumanMessage

def run_research_agent(question: str, thread_id: str = "default") -> str:
    config = {"configurable": {"thread_id": thread_id}}
    result = app.invoke(
        {"messages": [HumanMessage(content=question)], "iteration_count": 0},
        config=config
    )
    return result["messages"][-1].content

if __name__ == "__main__":
    print(run_research_agent("What are the main AI agent frameworks in 2026?"))

A healthy run calls web_search 2–4 times before producing a JSON final answer. A broken agent calls the same query repeatedly, or hits the iteration limit every run.

Common Failure Modes

Infinite loops — the LLM never stops calling tools. Fix this with explicit guidance in the system prompt (“After 3–4 searches, produce your final answer”) and a hard iteration guard.

Tool errors crashing the graph — wrap tool logic in try/except and return the error as a string rather than raising. The LLM can then decide to retry or fall back instead of crashing the whole run.

Prompt injection via tool results — a webpage can embed instructions like “Ignore previous instructions.” Wrap tool results in clear delimiters and truncate at a safe length (2,000 chars). Treat all tool outputs as untrusted in production.

What to Add Next

Swap MemorySaver for SqliteSaver to add persistent memory that survives process restarts — each thread_id becomes a persistent conversation. Compile with interrupt_before=["tools"] to pause before every tool execution for human review. Use app.stream(...) instead of app.invoke(...) to display tool calls and reasoning in real time.

Bottom Line

You now have a working LangGraph agent. The control loop — state flows into nodes, nodes update state, edges determine what runs next — is the entire foundation. Everything else in AI agent development is an elaboration of this pattern. Real agent work is more debugging than building; the framework handles the plumbing, your job is getting the agent to reason correctly about your specific problem.