Every week a new agent framework drops. LangGraph, AutoGen, CrewAI, OpenAI Agents SDK, Claude Agent SDK, Haystack, Mastra: each one promises to help you "build agents." Tutorials walk you through tool calls, chains, and memory. All of them answer the same question: how do you run an agent?
But before you pip install anything, there is a more fundamental question nobody is helping you answer:
What is an agent, structurally?
Not "an AI system that autonomously completes tasks." That definition tells you nothing. What does the internal structure look like? How do you design one? How do you explain the agent in your head to the engineer sitting across the table?
This post offers a framework for thinking about that.
Start with graph theory
Every complex system, at its core, is a graph.
Graph theory is one of the simplest ideas in mathematics: nodes + edges = graph. Nodes represent things. Edges represent relationships between them. That is it.
A city map is a graph. A social network is a graph. Your org chart is a graph. Graph theory does not care what sits inside the nodes. It only cares about how they connect.
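That idea fits in a few lines of code. A minimal sketch, with made-up node names, using an adjacency list:

```python
# A graph is just nodes plus edges. An adjacency list captures both:
# each key is a node, each value is the set of nodes it points to.
org_chart = {
    "ceo": {"cto", "cfo"},
    "cto": {"engineer"},
    "cfo": set(),
    "engineer": set(),
}

def reachable(graph, start):
    """Every node you can reach from `start` by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node] - seen)
    return seen

print(sorted(reachable(org_chart, "ceo")))  # ['ceo', 'cfo', 'cto', 'engineer']
```

Nothing in this code cares what a "ceo" is. The graph only encodes connectivity, which is exactly the point.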
Add rules and you get a finite state machine
A finite state machine (FSM) is graph theory made concrete. It adds two constraints:
- Nodes become "states", and the number of states is finite. You must define every possible state upfront.
- Edges become "transitions", and each transition has a condition. You can only move from one state to another when that condition is met.
A traffic light is an FSM: red, green, yellow, red. Three states, three transition conditions, on repeat forever.
The value of an FSM is predictability. You can draw every possible path before the system ever runs.
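The traffic light fits in a dictionary. A sketch:

```python
# An FSM: three states, and every transition defined before the system runs.
TRANSITIONS = {
    "red": "green",
    "green": "yellow",
    "yellow": "red",
}

state = "red"
path = [state]
for _ in range(6):
    state = TRANSITIONS[state]  # the next state is fully determined
    path.append(state)

print(path)  # ['red', 'green', 'yellow', 'red', 'green', 'yellow', 'red']
```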
The relationship between these concepts is simple: an FSM is a graph with finite states and conditional transitions, and, as we will see next, an agent is an FSM with one twist.
An AI agent is an FSM where the LLM decides the transitions
Here is the key insight.
An AI agent is a finite state machine with one difference: the transition conditions are not hardcoded. They are determined by the LLM's output at runtime.
In a traditional FSM, every transition condition is written in code before the system ever runs. In an agent, the condition is evaluated by the LLM at runtime.
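The contrast fits in a few lines. A sketch, where `call_llm` is a stand-in for any model call, not a real API:

```python
STATES = {"draft", "review", "publish"}

# Traditional FSM: the transition condition is hardcoded.
def next_state_fsm(state, event):
    if state == "draft" and event == "approved":
        return "publish"
    return "draft"

# Agent: the same fixed set of states, but the LLM's output picks the edge.
def next_state_agent(state, context, call_llm):
    choice = call_llm(
        f"Current state: {state}. Context: {context}. "
        f"Reply with exactly one of: {sorted(STATES)}."
    ).strip()
    return choice if choice in STATES else state  # guard against bad output
```

The guard on the last line matters in practice: the model chooses the edge, but only from the edges you defined.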
You define what states exist. The LLM decides when and where to transition.
Anthropic describes this distinction precisely in Building Effective Agents: a workflow orchestrates LLMs through predefined code paths, while an agent lets the LLM dynamically direct its own process and tool use. In FSM terms, a workflow's edges are hardcoded and an agent's edges are decided by the LLM at runtime.
This is why agents feel more "intelligent" than traditional programs. Their decision logic is driven by natural language, not by if/else branches. But the structure is still a graph you designed.
Prompts are the first-class citizen of agent design
Once you understand that an agent is an FSM, the next question is: what sits inside each state?
The answer: a prompt.
Every state maps to a prompt. The prompt defines what that state does, how the LLM should reason, and what format the output should take.
From a design perspective, prompts are the first-class citizen: tools, memory, and output format are all expressed through the prompt.
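As a sketch, a state definition can be nothing more than a prompt string. The field names and tool names below are illustrative, not from any particular framework:

```python
# Each state in the graph is a prompt. Tool access, memory, and output
# format all live inside the prompt text itself.
STATES = {
    "triage": {
        "prompt": (
            "You are triaging a support email.\n"
            "Tools available: search_docs(query), lookup_order(order_id).\n"
            'Reply with JSON: {"next": "answer"} or {"next": "escalate"}.'
        ),
    },
    "answer": {
        "prompt": "Draft a reply using the gathered context. Plain text only.",
    },
    "escalate": {
        "prompt": "Summarize the issue for a human agent in three bullets.",
    },
}
```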
Anthropic's Writing Tools for Agents makes the same point from a different angle. How well you write tool descriptions directly affects the LLM's decision accuracy. Tool descriptions are part of the prompt. The quality of your prompt engineering determines the quality of your agent.
Designing an agent is designing a state graph of prompts.
Two fundamental patterns
With this framework, agent architecture becomes concrete. Every agent reduces to one of two patterns, or a combination of both.
Pipeline
The sequence is fixed. A's output flows into B, B's output flows into C. This works when the process is deterministic and does not require dynamic branching. For example: draft, then grammar check, then format output.
This maps to what Anthropic calls the Prompt Chaining pattern.
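In code, a pipeline is plain function composition over prompts. A sketch, with `call_llm` standing in for any model call:

```python
# Pipeline: the edges are fixed in code. Each prompt consumes the
# previous prompt's output; no step ever changes the route.
def run_pipeline(topic, call_llm):
    draft = call_llm(f"Write a short draft about: {topic}")
    checked = call_llm(f"Fix the grammar, keep the meaning:\n{draft}")
    return call_llm(f"Format as clean markdown:\n{checked}")
```

Because the route never changes, you can test each step in isolation, and the whole chain is fully predictable.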
Orchestrator-workers
The main prompt has global visibility and decides "what to do next." Each sub-prompt handles a single specialized task and returns its result to the orchestrator. This works when the process is dynamic and the next step depends on what happened in the previous one.
This maps to what Anthropic calls the Orchestrator-Workers pattern.
The fundamental difference between the two is who makes decisions. In a pipeline, each prompt only knows its immediate successor. The scope is narrow. An orchestrator has global context and coordinates across all prompts.
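The orchestrator loop can be sketched in a dozen lines; `call_llm` and the workers here are placeholders:

```python
# Orchestrator-workers: one prompt sees the full history and picks the
# next worker; each worker does one narrow job and reports back.
def run_orchestrator(task, workers, call_llm, max_steps=10):
    history = []
    for _ in range(max_steps):  # always cap the loop
        choice = call_llm(
            f"Task: {task}\nDone so far: {history}\n"
            f"Pick the next worker from {sorted(workers)}, or reply DONE."
        ).strip()
        if choice == "DONE":
            break
        if choice in workers:
            history.append((choice, workers[choice](task)))
    return history
```

The `max_steps` cap is the loop's exit logic: without it, a confused model can cycle forever.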
Real-world example: coding agents
Coding agents are one of the clearest illustrations of this mental model in production. Claude Code, OpenAI Codex, Cursor, and Windsurf have become the fastest-growing category of AI agents, and they all follow the orchestrator-workers pattern.

A coding agent's state graph is typically an orchestrator surrounded by worker states: read file, search codebase, edit file, run tests.
The orchestrator prompt holds the user's intent and the full context of what has been done so far. Worker prompts each handle a single capability: reading, searching, editing, testing. The LLM decides at each step. Do I need to read more code? Is the change ready to test? Did the test fail, and should I go back to editing?
What makes coding agents interesting from an architecture perspective is how many transitions they make. A single "fix this bug" request might involve dozens of state transitions. Reading files, searching for references, making edits, running tests, reading error output, editing again. Each transition is an LLM decision. Each decision is a point where the agent can go wrong.
This is why observability matters disproportionately for agents with long execution chains. The more transitions, the more places things can break.
Why this matters for observability
Here is where the mental model connects to practice.
If your agent is a state graph, then debugging it means tracing which states were visited, what each prompt produced, and which transition the LLM chose at each step. Without that visibility, you are guessing.
This is exactly what tracing solves. It captures the full execution path of your agent, prompt by prompt, tool call by tool call. When an agent takes a wrong turn, you can see the exact state where it happened and the LLM output that triggered the transition.
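A trace can be as simple as an append-only list of transitions. A sketch; the fields and the logged run below are illustrative, though real tracing tools capture the same shape in much more detail:

```python
import time

trace = []

def record(state, llm_output, next_state):
    """Log one transition: where we were, what the LLM said, where we went."""
    trace.append({
        "ts": time.time(),
        "state": state,
        "llm_output": llm_output,
        "next_state": next_state,
    })

# Two transitions from a hypothetical bug-fix run:
record("run_tests", "tests failed: ImportError in utils.py", "edit")
record("edit", "added the missing import", "run_tests")

for step in trace:
    print(step["state"], "->", step["next_state"])
```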
Structured evaluations let you test whether each prompt node in your state graph consistently produces the output you expect. And prompt management gives you version control over the prompts that define each state, so you can iterate without breaking the graph.
One sentence summary
An agent is a graph. The nodes are prompts. The edges are transition conditions. The LLM decides which edge to take at runtime. Your job is to design the graph.
Once you internalize this, conversations with your engineering team get much clearer. You do not need to know how to write code. But you can draw the graph. Label each state's responsibility, annotate each edge's condition, and define every loop's exit logic.
That is far more useful than saying "I want an agent that automatically replies to emails."

