Getting Started with OpenAI Agents SDK

Alejandro AO

In the previous tutorials, we built an agent from scratch and then explored smolagents, Hugging Face’s minimalist framework. Now let’s look at another popular option: OpenAI’s Agents SDK.

The OpenAI Agents SDK is a lightweight framework for building agentic applications. Despite the name, it works with any LLM provider through LiteLLM integration. It offers a clean API for tools, multi-agent orchestration, and built-in features like streaming and sessions.

By the end of this tutorial, you’ll be able to:

  • Create agents with custom tools
  • Use non-OpenAI models (including Hugging Face models)
  • Get structured outputs with Pydantic
  • Build multi-agent systems with handoffs
  • Persist conversations with sessions
  • Stream responses in real-time

Let’s get started.

Setup

Install the SDK:

pip install openai-agents

For non-OpenAI models, install the LiteLLM extension:

pip install "openai-agents[litellm]"

Set up your API key:

import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

Your First Agent

Creating an agent is straightforward. (The examples in this tutorial use top-level await, which works in a notebook; in a regular script, wrap the calls in asyncio.run.)

from agents import Agent, function_tool, Runner

@function_tool
def get_weather(city: str) -> str:
    """Returns weather info for the specified city."""
    return f"The weather in {city} is sunny"

agent = Agent(
    name="Haiku agent",
    instructions="Always respond in haiku form",
    model="gpt-4o-mini",
    tools=[get_weather],
)

result = await Runner.run(agent, "What's the weather in New York?")
print(result.final_output)

Output:

Sun warms glass and stone,
Blue sky folds the city bright—
Sunny streets hum life.

The @function_tool decorator works similarly to smolagents’ @tool. It parses the docstring and type hints to generate the tool schema automatically.
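To see what that schema generation involves, here is a minimal stdlib-only sketch (not the SDK's actual code) of deriving a tool schema from a function's type hints and docstring:

```python
import inspect
from typing import get_type_hints

def make_tool_schema(fn):
    """Build a JSON-schema-like description of fn from its signature."""
    hints = get_type_hints(fn)
    hints.pop("return", None)  # only parameters belong in the schema
    # Map common Python types to JSON-schema type names
    type_names = {str: "string", int: "integer", float: "number", bool: "boolean"}
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            "type": "object",
            "properties": {p: {"type": type_names.get(t, "string")} for p, t in hints.items()},
            "required": list(hints),
        },
    }

def get_weather(city: str) -> str:
    """Returns weather info for the specified city."""
    return f"The weather in {city} is sunny"

schema = make_tool_schema(get_weather)
print(schema["name"])         # get_weather
print(schema["description"])  # Returns weather info for the specified city.
```

The real decorator does considerably more (async support, context injection, Pydantic validation), but the core idea is the same: the function's signature and docstring are the single source of truth for the tool's interface.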

Using Non-OpenAI Models

One of the SDK’s strengths is model flexibility. You can use Hugging Face models through the LiteLLM integration:

import os
import getpass

os.environ["HF_TOKEN"] = getpass.getpass("Enter your Hugging Face token: ")
from agents import Agent, Runner, ModelSettings
from agents.extensions.models.litellm_model import LitellmModel

model = LitellmModel(
    model="huggingface/novita/MiniMaxAI/MiniMax-M2.1",
    api_key=os.environ["HF_TOKEN"],
)

agent = Agent(
    name="HF agent",
    instructions="Always respond in haiku form",
    tools=[get_weather],
    model=model,
    model_settings=ModelSettings(include_usage=True),
)

result = await Runner.run(agent, "What's the weather in New York?")
print(result.final_output)

The model string follows the pattern: huggingface/<provider>/<org>/<model>. This lets you use any model available through Hugging Face’s Inference Providers.
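The pattern is easy to verify with a small helper (parse_hf_model_string is hypothetical, not part of the SDK):

```python
def parse_hf_model_string(s: str) -> dict:
    """Split a huggingface/<provider>/<org>/<model> string into its parts."""
    prefix, provider, org, model = s.split("/", 3)
    if prefix != "huggingface":
        raise ValueError(f"expected a huggingface/ model string, got {s!r}")
    return {"provider": provider, "org": org, "model": model}

info = parse_hf_model_string("huggingface/novita/MiniMaxAI/MiniMax-M2.1")
print(info)  # {'provider': 'novita', 'org': 'MiniMaxAI', 'model': 'MiniMax-M2.1'}
```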

Structured Output

Need typed responses? Use Pydantic models with output_type:

from pydantic import BaseModel
from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

model = LitellmModel(
    model="huggingface/novita/zai-org/GLM-4.7",
    api_key=os.environ["HF_TOKEN"],
)

agent = Agent(
    name="Calendar extractor",
    instructions="Extract calendar events from text",
    output_type=CalendarEvent,
    model=model,
)

result = await Runner.run(
    agent,
    "Extract the event: 'Meeting with Alice and Bob on July 5th.'",
)

print(result.final_output)

Output:

name='Meeting with Alice and Bob' date='July 5th' participants=['Alice', 'Bob']

When using non-OpenAI models, make sure they support both structured output AND tool calling.
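Under the hood, output_type constrains the model to emit JSON matching the schema, which is then validated into the typed object. A stdlib-only sketch of that validation step, with a dataclass standing in for the Pydantic model:

```python
import json
from dataclasses import dataclass

@dataclass
class CalendarEvent:
    name: str
    date: str
    participants: list[str]

# What the model would return when constrained to the CalendarEvent schema
raw = '{"name": "Meeting with Alice and Bob", "date": "July 5th", "participants": ["Alice", "Bob"]}'

# Parse the JSON and build the typed object from it
event = CalendarEvent(**json.loads(raw))
print(event.name)          # Meeting with Alice and Bob
print(event.participants)  # ['Alice', 'Bob']
```

Pydantic adds type coercion and rich error messages on top of this, which is why the SDK uses it rather than plain dataclasses.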

Multi-Agent Systems

The SDK supports two main multi-agent architectures:

Handoffs (Decentralized)

Agents hand off control to specialized peers:

from agents import Agent

history_tutor = Agent(
    name="History Tutor",
    handoff_description="Specialist agent for historical questions",
    instructions="You provide assistance with historical queries.",
    model=model,
)

math_tutor = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You help with math problems. Explain your reasoning step by step.",
    model=model,
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="Determine which agent to use based on the user's question",
    handoffs=[history_tutor, math_tutor],
    model=model,
)

result = await Runner.run(
    triage_agent,
    "Can you explain the causes of World War II?",
)

print(result.final_output)

The triage agent analyzes the query and hands off to the history tutor, which then takes over the conversation.

Agents as Tools (Centralized)

A manager orchestrates sub-agents as tools:

manager = Agent(
    name="Manager Agent",
    instructions="Manage a team of agents to answer questions effectively.",
    tools=[
        history_tutor.as_tool(
            tool_name="history_tutor",
            tool_description="Handles historical queries",
        ),
        math_tutor.as_tool(
            tool_name="math_tutor",
            tool_description="Handles math questions",
        ),
    ],
    model=model,
)

result = await Runner.run(
    manager,
    "Can you explain the causes of World War II?",
)

print(result.final_output)

The difference? With handoffs, the specialist agent takes over completely. With tools, the manager stays in control and integrates the sub-agent’s response.
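The two control flows can be sketched without the SDK; the function names below are purely illustrative. triage_respond returns the specialist's answer directly (handoff), while manager_respond calls the specialist like a tool and wraps its result:

```python
def history_specialist(question: str) -> str:
    """Stand-in for the history tutor agent."""
    return f"History answer to: {question}"

# Handoff: triage returns the specialist's reply as-is;
# the specialist owns the rest of the conversation.
def triage_respond(question: str) -> str:
    if "war" in question.lower():
        return history_specialist(question)
    return "I can only route history questions."

# Agent-as-tool: the manager calls the specialist like a tool,
# then stays in control and integrates the result.
def manager_respond(question: str) -> str:
    sub_answer = history_specialist(question)
    return f"Based on my specialist: {sub_answer}"

print(triage_respond("Can you explain the causes of World War II?"))
print(manager_respond("Can you explain the causes of World War II?"))
```

In the handoff case the manager's voice disappears from the output; in the tool case the final answer is always composed by the manager.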

Built-in Tools

The SDK includes several pre-built tools when using OpenAI models:

from agents import Agent, Runner, WebSearchTool

agent = Agent(
    name="Assistant",
    tools=[WebSearchTool()],
)

result = await Runner.run(
    agent,
    "Who is the current president of the United States?"
)
print(result.final_output)

Available built-in tools:

  • WebSearchTool - Search the web
  • FileSearchTool - Search OpenAI Vector Stores
  • ComputerTool - Automate computer use tasks
  • CodeInterpreterTool - Execute code in a sandbox
  • ImageGenerationTool - Generate images from prompts
  • LocalShellTool - Run shell commands locally

Custom Tools

Creating custom tools is simple with the @function_tool decorator:

from typing_extensions import TypedDict, Any
from agents import Agent, FunctionTool, RunContextWrapper, function_tool


class Location(TypedDict):
    lat: float
    long: float

@function_tool
async def fetch_weather(location: Location) -> str:
    """Fetch the weather for a given location.

    Args:
        location: The location to fetch the weather for.
    """
    return f"The weather at {location['lat']}, {location['long']} is sunny"


@function_tool(name_override="fetch_data")
def read_file(ctx: RunContextWrapper[Any], path: str, directory: str | None = None) -> str:
    """Read the contents of a file.

    Args:
        path: The path to the file to read.
        directory: The directory to read the file from.
    """
    return "Hello, World!"


agent = Agent(
    name="Assistant",
    tools=[fetch_weather, read_file],
    model=model,
)

Key features:

  • Type hints define the parameter schema
  • Docstrings become the tool description
  • name_override lets you customize the tool name
  • Tools can be sync or async
  • Access runtime context via RunContextWrapper
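As a toy illustration of how name_override can be threaded through such a decorator (this is not the SDK's implementation, which also builds the schema and handles context):

```python
# Toy decorator that works both bare (@function_tool) and with
# keyword arguments (@function_tool(name_override=...)).
def function_tool(fn=None, *, name_override=None):
    def wrap(f):
        f.tool_name = name_override or f.__name__
        return f
    # Bare use passes the function directly; keyword use returns the wrapper
    return wrap(fn) if fn is not None else wrap

@function_tool
def fetch_weather(location: dict) -> str:
    """Fetch the weather for a given location."""
    return "sunny"

@function_tool(name_override="fetch_data")
def read_file(path: str) -> str:
    """Read the contents of a file."""
    return "Hello, World!"

print(fetch_weather.tool_name)  # fetch_weather
print(read_file.tool_name)      # fetch_data
```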

Sessions

Persist conversations across multiple interactions:

from agents import Agent, Runner, SQLiteSession

agent = Agent(
    name="Assistant",
    instructions="Reply very concisely.",
    model=model,
)

# Create a session
session = SQLiteSession(session_id="conv_123")

# First message
result = await Runner.run(
    agent,
    "What city is the Golden Gate Bridge in?",
    session=session
)
print(result.final_output)  # "San Francisco"

# Continue the conversation
result = await Runner.run(
    agent,
    "What state is it in?",
    session=session
)
print(result.final_output)  # "California"

The session stores the conversation history, so the agent remembers context between calls.
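Conceptually, SQLiteSession is a message log keyed by session ID. A minimal stdlib sqlite3 sketch of that idea (not the SDK's actual schema):

```python
import sqlite3

# In-memory store; the SDK can also persist to a file
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (session_id TEXT, role TEXT, content TEXT)")

def add_message(session_id: str, role: str, content: str) -> None:
    conn.execute("INSERT INTO messages VALUES (?, ?, ?)", (session_id, role, content))

def history(session_id: str) -> list[tuple[str, str]]:
    """Everything previously said in this session, in insertion order."""
    return conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ?", (session_id,)
    ).fetchall()

add_message("conv_123", "user", "What city is the Golden Gate Bridge in?")
add_message("conv_123", "assistant", "San Francisco")
add_message("conv_123", "user", "What state is it in?")
print(history("conv_123"))
```

On each run, the stored history is prepended to the new input, which is how the agent can resolve "it" in the follow-up question.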

Streaming

For real-time responses, use run_streamed:

import asyncio
from openai.types.responses import ResponseTextDeltaEvent
from agents import Agent, Runner

async def main():
    agent = Agent(
        name="Joker",
        instructions="You are a helpful assistant.",
        model=model,
    )

    result = Runner.run_streamed(agent, input="Tell me 5 jokes.")
    async for event in result.stream_events():
        if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
            print(event.data.delta, end="", flush=True)

asyncio.run(main())

This prints each token as it arrives, giving users immediate feedback.
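The same consumption pattern works with any async generator; here is a self-contained sketch with a fake stream standing in for stream_events():

```python
import asyncio

async def fake_stream(text: str):
    """Yield the text word by word, as a model would yield token deltas."""
    for token in text.split():
        await asyncio.sleep(0)  # yield control, as a network read would
        yield token + " "

async def main() -> str:
    chunks = []
    async for delta in fake_stream("Why did the agent cross the road?"):
        chunks.append(delta)  # in a UI you would render each delta immediately
    return "".join(chunks)

result = asyncio.run(main())
print(result)
```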

Comparison with smolagents

Both frameworks do similar things, but have different philosophies:

| Feature         | OpenAI Agents SDK | smolagents       |
|-----------------|-------------------|------------------|
| Tool decorator  | @function_tool    | @tool            |
| Multi-agent     | Handoffs + tools  | Manager patterns |
| Sessions        | Built-in SQLite   | Manual           |
| Hub integration | None              | Hugging Face Hub |
| UI              | None              | Built-in Gradio  |

Choose smolagents if you want tight Hugging Face integration and instant UIs. Choose OpenAI Agents SDK if you want a clean async API with built-in session management.

Recap

The OpenAI Agents SDK provides:

  • Clean API: Simple decorators and async/await patterns
  • Model flexibility: Works with any LLM through LiteLLM
  • Structured output: Native Pydantic support
  • Multi-agent: Both handoff and tool-based orchestration
  • Sessions: Built-in conversation persistence
  • Streaming: Real-time response streaming

Combined with what you learned in the previous tutorials, you now have a solid toolkit for building AI agents with different frameworks.

Full Code

import os
import getpass
from agents import Agent, Runner, function_tool, ModelSettings, SQLiteSession
from agents.extensions.models.litellm_model import LitellmModel

# Setup
os.environ["HF_TOKEN"] = getpass.getpass("Enter your Hugging Face token: ")

# Initialize model
model = LitellmModel(
    model="huggingface/novita/zai-org/GLM-4.7",
    api_key=os.environ["HF_TOKEN"],
)

# Custom tool
@function_tool
def get_weather(city: str) -> str:
    """Returns weather info for the specified city."""
    return f"The weather in {city} is sunny"

# Multi-agent setup
history_tutor = Agent(
    name="History Tutor",
    handoff_description="Specialist for historical questions",
    instructions="Provide assistance with historical queries.",
    model=model,
)

math_tutor = Agent(
    name="Math Tutor",
    handoff_description="Specialist for math questions",
    instructions="Help with math problems. Explain reasoning step by step.",
    model=model,
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="Determine which agent to use based on the question",
    handoffs=[history_tutor, math_tutor],
    model=model,
)

# Run with session
import asyncio

session = SQLiteSession(session_id="demo_session")

async def main():
    result = await Runner.run(
        triage_agent,
        "What were the main causes of World War II?",
        session=session,
    )
    print(result.final_output)

asyncio.run(main())
