OpenAI shipped Swarm in October 2024 as a deliberately small, deliberately experimental multi-agent framework. The whole thing was under 1000 lines of Python, the README said "not for production," and it picked up 20k+ stars anyway because the abstractions were genuinely good and the alternative was rolling your own handoff logic on top of Chat Completions.
In March 2025 OpenAI replaced it with the Agents SDK (openai-agents-python and a TypeScript counterpart). The Swarm repo's README now points users to the Agents SDK explicitly: "Swarm is now replaced by the OpenAI Agents SDK, which is a production-ready evolution of Swarm." Same handoff model, plus guardrails, tracing, sessions, hosted tools, Responses API support, and a real release cadence (the SDK shipped v0.17.1 on May 11, 2026).
If you wrote anything against Swarm in 2024 or 2025, you need to make a call: keep the prototype as-is, port to the Agents SDK, or jump to a different framework. This guide is for the port decision. For the broader landscape see best AI agent frameworks, and for the historical perspective see is OpenAI Swarm still worth using.
TL;DR
- Swarm status: experimental, superseded. Repo redirects to Agents SDK. No production support.
- Agents SDK status: production-ready, actively maintained, v0.17.1 in May 2026. 26k+ stars, 4k forks.
- Same primitives: Agent, function tools, handoffs. The mental model carries over.
- New in the SDK: guardrails (input/output validation), built-in tracing, sessions (auto conversation history), hosted tools (web search, file search, code interpreter), realtime voice agents, sandbox agents.
- Migration cost: low if you used Swarm's core abstractions. The Agent definition is almost identical; handoffs use a handoff() helper instead of returning an Agent from a function.
- Why migrate now: anything customer-facing needs guardrails and tracing. Swarm gives you neither out of the box.
What Swarm got right
Swarm's contribution was the conceptual model. Two ideas that stuck:
- Routines. An agent is a system prompt plus a set of tools (Python functions). The model decides which tool to call. Simple.
- Handoffs. A tool can return an Agent object. When the runtime sees that, it transfers control to the new agent. Multi-agent flow becomes "this tool hands off to another agent" instead of "build a graph and wire up routing."
That second idea was the unlock. Most multi-agent frameworks before Swarm (and many after) made you draw a graph: nodes are agents, edges are routing rules, and you spend most of your time arguing about whether a particular conditional is a router or an agent. Swarm flattened the graph into the agent's own toolbox: if you want to delegate, you call the delegate as a tool.
Swarm's limits were the rest of the production checklist. No tracing. No guardrails. No retries. No streaming-friendly architecture. No session management. Documentation oriented around demos, not deployment. It was a brilliant proof of concept that was never trying to be a product.
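The handoff mechanic is simple enough to sketch in plain Python. This is a toy illustration of the idea, not Swarm's actual code: an agent is instructions plus tools, and a tool that returns an Agent tells the runtime to transfer control.

```python
# Toy illustration of Swarm-style handoffs (not Swarm's actual code):
# a tool that returns an Agent signals the runtime to transfer control.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    name: str
    instructions: str
    tools: list[Callable] = field(default_factory=list)

def run(agent: Agent, pick_tool: Callable):
    """Minimal loop: ask a (stubbed) model to pick a tool; if the tool
    returns an Agent, hand control to it and continue from there."""
    while True:
        tool = pick_tool(agent)
        if tool is None:
            return agent  # no more tool calls; this agent answers
        result = tool()
        if isinstance(result, Agent):
            agent = result  # the handoff: control transfers

refunds = Agent(name="Refunds", instructions="Process refunds.")

def transfer_to_refunds():
    return refunds

triage = Agent(name="Triage", instructions="Route the user.",
               tools=[transfer_to_refunds])

# Stub model: triage always delegates, refunds answers directly.
final = run(triage, lambda a: a.tools[0] if a.tools else None)
print(final.name)  # Refunds
```

The entire routing topology lives in the agents' toolboxes; there is no separate graph object to maintain.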
What the Agents SDK adds
The Agents SDK keeps the routines + handoffs core and adds the production layer.
Agents. Same idea as Swarm: name, instructions, tools, model. Now also configurable with output schemas, tool use settings, and per-agent guardrails.
Handoffs. Still tools that transfer control to another agent. The SDK adds a handoff() helper that wraps the receiver agent with optional input filtering and on-handoff callbacks. Cleaner than the Swarm "function returns an Agent" trick.
Guardrails. Input and output validators that run before and after the model. Block prompt injections at the boundary, validate that responses match a schema, enforce policy. None of this existed in Swarm; you wrote it yourself or you did without.
Tracing. Every agent run produces a structured trace: which agents ran, which tools they called, which handoffs happened, full prompts and outputs at each step. There is a built-in tracing UI on OpenAI's side, plus exporters to third-party observability platforms.
Sessions. Automatic conversation history management. Pass a session_id and the SDK loads prior messages and persists new ones. Swarm made you do this yourself.
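What the SDK automates here can be pictured as a tiny session store. This is a toy, not the SDK's implementation: prior messages load before a run, new ones persist after.

```python
# Toy session store illustrating what the SDK's sessions automate
# (not the SDK's implementation): keyed message history per session.
class SessionStore:
    def __init__(self):
        self._history = {}  # session_id -> list of messages

    def load(self, session_id):
        return list(self._history.get(session_id, []))

    def append(self, session_id, messages):
        self._history.setdefault(session_id, []).extend(messages)

store = SessionStore()
store.append("user-42", [{"role": "user", "content": "I want a refund."}])
store.append("user-42", [{"role": "assistant", "content": "Routing you to Refunds."}])

print(len(store.load("user-42")))  # 2
```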
Hosted tools. First-class support for OpenAI's hosted tools (web search, file search, code interpreter) without you wrapping them as functions.
Sandbox agents and realtime agents. Two specialized variants: sandbox agents run in containerized environments for long-running tasks with filesystems; realtime agents use gpt-realtime-2 for voice with the full agent feature set.
Human-in-the-loop. Built-in patterns for pausing a run, asking a human for confirmation, and resuming.
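The shape of that pause/approve/resume pattern, as a plain-Python sketch (the Agents SDK's actual API differs): a run that hits a sensitive action returns a pending state instead of executing, and a human decision resumes or aborts it.

```python
# Toy pause/approve/resume pattern (the Agents SDK's actual API
# differs): a sensitive action yields a pending state, not a result.
from dataclasses import dataclass

@dataclass
class PendingApproval:
    action: str
    payload: dict

def run_step(action, payload, requires_approval):
    if requires_approval(action):
        return PendingApproval(action, payload)  # pause here
    return f"executed {action}"

def resume(pending, approved):
    if not approved:
        return "aborted"
    return f"executed {pending.action}"

step = run_step("refund", {"amount": 120}, lambda a: a == "refund")
assert isinstance(step, PendingApproval)  # run paused, awaiting a human
print(resume(step, approved=True))  # executed refund
```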
Side-by-side code
Same task in both frameworks: a triage agent that can hand off to a refund agent or a billing agent.
Swarm
from swarm import Swarm, Agent

client = Swarm()

def transfer_to_refunds():
    return refund_agent

def transfer_to_billing():
    return billing_agent

triage_agent = Agent(
    name="Triage",
    instructions="Route the user to the right specialist.",
    functions=[transfer_to_refunds, transfer_to_billing],
)

refund_agent = Agent(
    name="Refunds",
    instructions="Process refund requests.",
    functions=[],
)

billing_agent = Agent(
    name="Billing",
    instructions="Answer billing questions.",
    functions=[],
)

response = client.run(
    agent=triage_agent,
    messages=[{"role": "user", "content": "I want a refund."}],
)

print(response.messages[-1]["content"])

Agents SDK
from agents import Agent, Runner, handoff

refund_agent = Agent(
    name="Refunds",
    instructions="Process refund requests.",
)

billing_agent = Agent(
    name="Billing",
    instructions="Answer billing questions.",
)

triage_agent = Agent(
    name="Triage",
    instructions="Route the user to the right specialist.",
    handoffs=[handoff(refund_agent), handoff(billing_agent)],
)

result = await Runner.run(
    triage_agent,
    "I want a refund.",
)

print(result.final_output)

A few things to notice.
The Agents SDK does not require you to write transfer_to_* wrapper functions. The handoff() helper is the abstraction; the SDK turns each handoff into a tool the LLM can call.
Handoffs are declared on the agent, not buried inside a function. You can read the routing topology directly from the agent definitions.
Runner.run is async by default. There is a Runner.run_sync if you want the Swarm-style synchronous behavior, but production code should use the async path.
The result is a RunResult with structured access to outputs, tool calls, and the trace. Swarm just gave you the message list.
TypeScript
The Agents SDK is also available in TypeScript with the same primitives.
import { Agent, run, handoff } from "@openai/agents";

const refundAgent = new Agent({
  name: "Refunds",
  instructions: "Process refund requests.",
});

const billingAgent = new Agent({
  name: "Billing",
  instructions: "Answer billing questions.",
});

const triageAgent = new Agent({
  name: "Triage",
  instructions: "Route the user to the right specialist.",
  handoffs: [handoff(refundAgent), handoff(billingAgent)],
});

const result = await run(triageAgent, "I want a refund.");
console.log(result.finalOutput);

Swarm never had an official TypeScript port. If your stack is Node, the Agents SDK is a strict upgrade.
Migration checklist
If you have a working Swarm setup, here is the actual port:
- Install openai-agents. New package, separate from the OpenAI SDK proper.
- Convert Agents. swarm.Agent(functions=[...]) becomes agents.Agent(tools=[...]). The functions parameter is renamed to tools.
- Convert handoffs. Drop the transfer_to_* wrappers. Use handoffs=[handoff(other_agent)] on the parent agent.
- Move to async. client.run(...) becomes await Runner.run(...). Wrap your top-level code with asyncio.run if you do not already.
- Convert tool functions. Same Python signatures, but you can now annotate with @function_tool for richer schemas, and tools can return structured objects.
- Add guardrails. This is the actual upgrade. Add an input guardrail to reject malicious prompts and an output guardrail to enforce schemas.
- Wire up tracing. Either use OpenAI's hosted tracing or export to your observability platform (see below).
- Add sessions. If you were maintaining message history manually, pass a session_id and let the SDK do it.
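The @function_tool step is worth dwelling on, because the upgrade is about schemas: the decorator derives a tool description from the function's signature and docstring. Conceptually, it does something like the following (this is an illustration of the idea, not the SDK's code):

```python
# Conceptual illustration of what a @function_tool-style decorator
# derives from a signature (not the SDK's implementation): name,
# description, and a parameter schema pulled from type hints.
import inspect
from typing import get_type_hints

PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def describe_tool(fn):
    hints = get_type_hints(fn)
    hints.pop("return", None)
    params = {name: {"type": PY_TO_JSON.get(tp, "object")}
              for name, tp in hints.items()}
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": params,
    }

def lookup_refund(order_id: str, amount: float) -> str:
    """Look up the refund status for an order."""
    return f"Refund of {amount} for {order_id} is pending."

schema = describe_tool(lookup_refund)
print(schema["parameters"])  # {'order_id': {'type': 'string'}, 'amount': {'type': 'number'}}
```

The practical consequence: richer type hints and docstrings on your Swarm-era tool functions translate directly into better tool schemas after the port.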
Most ports I have done land at "30 to 60 minutes of work plus an evening to add guardrails." The conceptual model is unchanged, so you are mostly doing renames and adding the production hardening Swarm never had.
Guardrails and tracing in practice
The two features that justify the migration on their own.
Guardrails
from agents import Agent, Runner, InputGuardrail, GuardrailFunctionOutput
from pydantic import BaseModel

class PIICheck(BaseModel):
    contains_pii: bool
    reasoning: str

pii_checker = Agent(
    name="PII Checker",
    instructions="Determine if the input contains PII.",
    output_type=PIICheck,
)

async def pii_guardrail(ctx, agent, input_text):
    result = await Runner.run(pii_checker, input_text)
    return GuardrailFunctionOutput(
        output_info=result.final_output,
        tripwire_triggered=result.final_output.contains_pii,
    )

triage_agent = Agent(
    name="Triage",
    instructions="...",
    input_guardrails=[InputGuardrail(guardrail_function=pii_guardrail)],
)

A separate small agent checks the input. If it trips the wire, the main agent never runs. This pattern (a guardrail agent in front of the work agent) is the right shape for safety and policy enforcement, and Swarm gave you no way to express it.
Tracing
Out of the box, runs are traced to OpenAI's hosted UI. You can also export traces to third-party platforms. Example: routing traces to Respan's tracing.
import os

import respan

respan.init(api_key=os.environ["RESPAN_API_KEY"])
# Agents SDK traces are exported automatically via OpenTelemetry

Once traces flow into a real observability platform, you get the full debugging surface: filter by agent, by tool call, by error, replay specific runs, attach evals. This is the part that turns Swarm-style "did the demo work" into "is the multi-agent system actually performant in production." See LLM Observability and LLM Tracing for the architecture.
How the Agents SDK compares to other frameworks
Quick read-out for the common alternatives in 2026.
LangGraph. Graph-based, explicit nodes and edges, very flexible, more verbose. Pick LangGraph when your topology is complex enough that the routines + handoffs model feels constraining; pick the Agents SDK when handoffs cover your needs and you want less boilerplate.
Claude Agent SDK. Anthropic's equivalent: agents, tools, sub-agents, built-in tracing through the Anthropic API. Tighter to Claude models; the Agents SDK is provider-agnostic (it works with 100+ LLMs via litellm under the hood). If you are committed to Claude, the Claude Agent SDK is the cleaner path; if you want optionality, Agents SDK.
CrewAI. Higher-level abstractions around roles and tasks. Friendlier for non-engineers, more opinionated about how agents collaborate, less ergonomic for low-level control. Different audience.
LlamaIndex Agents. Strong on RAG-heavy agents because of the LlamaIndex retrieval primitives. Use when the agent's main job is retrieval over your data.
For most teams already on the OpenAI stack, the Agents SDK is the default in 2026. The framework lock-in is low: provider-agnostic, the Agent + handoff primitives port reasonably well to other frameworks if you change your mind.
When NOT to migrate
A few cases where keeping Swarm is reasonable:
- The Swarm code is a demo that works. If it is internal-only, throwaway, and you have no plan to extend it, leave it alone.
- You vendored Swarm. Some teams forked Swarm and added their own abstractions. The port is harder; budget more time.
- You are about to leave the OpenAI ecosystem anyway. If you are planning a move to LangGraph or Claude Agent SDK, do that migration once, not twice.
For anything customer-facing, anything you need to debug in production, anything that needs guardrails: migrate.
FAQ
Is Swarm still maintained? No. The repo redirects to the Agents SDK. There are no commits to the Swarm codebase that you should rely on.
Is the Agents SDK production-ready? Yes. As of May 2026 it is on v0.17.1, has 26k+ GitHub stars, and is the framework OpenAI uses in its own examples and customer references. The API surface is stable enough to build on.
Can I use the Agents SDK with non-OpenAI models? Yes. It is provider-agnostic and supports 100+ models via litellm. Anthropic, Google, AWS Bedrock, Azure, local models, all callable.
Does the Agents SDK support streaming?
Yes. Runner.run_streamed yields incremental events as the agents run, including token deltas and tool call events.
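The consumption pattern looks like this. The events below are stubbed so the sketch is self-contained; in the real SDK you iterate the streamed result's events rather than a hand-rolled generator.

```python
# Toy async event stream mirroring the streamed-run consumption
# pattern (events are stubbed; the real SDK yields richer objects).
import asyncio

async def stream_events():
    # Stand-in for the SDK's incremental events: token deltas, then
    # a tool call, then the final output.
    for event in [("delta", "I"), ("delta", " can"),
                  ("tool_call", "lookup_refund"),
                  ("final", "Refund issued.")]:
        yield event

async def consume():
    chunks, final = [], None
    async for kind, data in stream_events():
        if kind == "delta":
            chunks.append(data)  # render tokens as they arrive
        elif kind == "final":
            final = data
    return "".join(chunks), final

partial, final = asyncio.run(consume())
print(final)  # Refund issued.
```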
How do I trace runs to my own observability platform? The SDK exports OpenTelemetry-compatible traces. Point your collector at it or use a vendor SDK that picks up OpenTelemetry traces natively.
Is there a port path for LangGraph users? Yes, though it is more work than Swarm to Agents SDK. The LangGraph "node is an agent" pattern maps to "agent is a handoff target" in the SDK, but explicit edge logic (conditional routing based on state) does not have a one-to-one analog and usually moves into the agent's instructions plus tools.
What is gpt-realtime-2 and how does it fit?
The voice model that powers the Agents SDK's realtime agents. Full agent feature set (tools, handoffs, guardrails) over a voice modality. Pair it with the rest of the SDK for voice agents that can hand off to text agents and back.