  1. Sign up — Create an account at platform.respan.ai
  2. Create an API key — Generate one on the API keys page
  3. Add credits or a provider key — Add credits on the Credits page or connect your own provider key on the Integrations page
Add the Docs MCP to your AI coding tool to get help building with Respan. No API key needed.
{
  "mcpServers": {
    "respan-docs": {
      "url": "https://docs.respan.ai/mcp"
    }
  }
}
This integration is for the Respan gateway.

What is the OpenAI SDK?

The OpenAI SDK is the most robust integration path for accessing multiple model providers. Because most AI providers prioritize OpenAI SDK compatibility, you can seamlessly call all 250+ models available through the Respan gateway.

Quickstart

Step 1: Install OpenAI SDK

  • Get a Respan API key
  • Add your provider credentials
  • Install packages
pip install openai

Step 2: Initialize Client

from openai import OpenAI

client = OpenAI(
    base_url="https://api.respan.ai/api/",
    api_key="YOUR_RESPAN_API_KEY",  # Get from Respan dashboard
)

Step 3: Make Your First Request

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello, world!"}],
)
print(response.choices[0].message.content)

Step 4: See your logs on the platform

Switch models

# Switch providers by changing only the model string
model = "gpt-4o"                        # OpenAI
# model = "claude-3-5-sonnet-20241022"  # Anthropic
# model = "gemini-1.5-pro"              # Google

response = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Your message"}],
)
See the full model list for all available models.

Supported parameters

OpenAI parameters

All OpenAI parameters are supported; pass them directly in the request body.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a story"}],
    temperature=0.7,        # Control randomness
    max_tokens=1000,        # Limit response length
    top_p=0.9,              # Nucleus sampling
    frequency_penalty=0.1,  # Reduce repetition
    presence_penalty=0.1,   # Encourage topic diversity
    stream=True,            # Enable streaming
)
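With stream=True the call returns an iterator of chunks rather than a single response object. A minimal sketch of collecting the streamed deltas into full text; the helper and the stand-in chunk objects below are illustrative, not part of the SDK:

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Concatenate the content deltas from a chat.completions stream."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta is not None:  # the final chunk carries no content
            parts.append(delta)
    return "".join(parts)

# Stand-in chunks shaped like the SDK's streaming objects, for illustration:
def _chunk(text):
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

fake_stream = [_chunk("Once"), _chunk(" upon"), _chunk(" a time."), _chunk(None)]
story = collect_stream(fake_stream)
```

With a real streaming response, the same loop applies: pass the object returned by the SDK call directly to `collect_stream`.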

Respan Parameters

Respan-specific parameters can be passed via extra_body for request tracking and customization.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a story"}],
    extra_body={
        "customer_identifier": "user_123",        # Track specific users
        "fallback_models": ["gpt-3.5-turbo"],     # Automatic fallbacks
        "metadata": {"session_id": "abc123"},     # Custom metadata
        "thread_identifier": "conversation_456",  # Group related messages
        "group_identifier": "team_alpha",         # Organize by groups
    }
)

Prompt composition

A variable in one prompt can reference another prompt. The child prompt is rendered first and injected into the parent. See Prompt composition for setup details.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[],
    extra_body={
        "prompt": {
            "prompt_id": "PARENT_PROMPT_ID",
            "override": True,
            "variables": {
                "request": "dispute a charge from last month",
                "conversation": {
                    "_type": "prompt",
                    "prompt_id": "CHILD_PROMPT_ID",
                    "version": 2,
                    "variables": {
                        "customer_name": "Sarah",
                        "department": "billing"
                    }
                }
            }
        }
    },
)

Prompt schema (v1 vs v2)

The prompt object supports a schema_version field that controls how prompt configuration and request parameters are merged. See the full guide for details.
  • Prompt schema v1 (default, legacy): override flag controls which side wins for conflicts.
  • Prompt schema v2 (recommended, schema_version=2): prompt config always wins. Supports a patch field for non-message parameter overrides.
OpenAI SDKs strip fields like schema_version, patch, and prompt_slug during validation. Prompt schema v2 requires raw HTTP requests instead of the OpenAI SDK. See the Standard API examples.
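As a sketch of the raw-HTTP route for schema v2: the payload mirrors the prompt fields shown in this guide, but the exact endpoint path and the `patch` contents below are assumptions to verify against the Standard API examples.

```python
import json
import urllib.request

# Hypothetical schema-v2 payload; prompt_id and patch values are placeholders.
payload = {
    "model": "gpt-4o-mini",
    "messages": [],
    "prompt": {
        "prompt_id": "PARENT_PROMPT_ID",
        "schema_version": 2,  # stripped by OpenAI SDKs, so send raw HTTP
        "variables": {"request": "dispute a charge from last month"},
        "patch": {"temperature": 0.2},  # non-message parameter override
    },
}

def send_request(api_key):
    # Assumed endpoint based on the base_url used elsewhere in this guide;
    # confirm the path against the Standard API examples.
    req = urllib.request.Request(
        "https://api.respan.ai/api/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Call `send_request("YOUR_RESPAN_API_KEY")` to issue the request.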

Responses API

The Responses API is OpenAI’s most advanced interface for generating model responses. It supports text and image inputs, text outputs, and stateful interactions using the output of previous responses as input. Extend the model’s capabilities with built-in tools for file search, web search, computer use, and more.
This works exclusively with OpenAI models and cannot be used with models from other providers.
Pass-through Integration Limitations: This is a pass-through integration. Some Respan features are not available, including:
  • User Rate Limits: You cannot enforce rate limits on your users.
  • Fallbacks: You cannot set up fallback models.
  • Load Balancing: You cannot distribute traffic across multiple models or credentials.
  • Prompt Management: You cannot use prompts stored in Respan directly.
Pass Respan parameters via a base64-encoded header:
from base64 import b64encode
import json

respan_params = {
    "metadata": {
        "paid_user": "true",
    }
}

respan_params_header = {
    "X-Data-Respan-Params": b64encode(json.dumps(respan_params).encode()).decode(),
}
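To sanity-check the header value, decoding it should round-trip back to the original parameters:

```python
from base64 import b64encode, b64decode
import json

respan_params = {"metadata": {"paid_user": "true"}}
encoded = b64encode(json.dumps(respan_params).encode()).decode()

# Decoding the header value recovers the original parameters.
decoded = json.loads(b64decode(encoded))
```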

Text input

response = client.responses.create(
    model="gpt-4o",
    input="Tell me a three sentence bedtime story about a unicorn.",
    extra_headers=respan_params_header,
)
print(response)
File search

response = client.responses.create(
    model="gpt-4o",
    tools=[
        {
            "type": "file_search",
            "vector_store_ids": ["vs_67d3bdd0c8888191adfa890a9e829480"],
            "max_num_results": 20,
        }
    ],
    input="What are the attributes of an ancient brown dragon?",
    extra_headers=respan_params_header,
)

Reasoning

response = client.responses.create(
    model="o3-mini",
    input="How much wood would a woodchuck chuck?",
    reasoning={"effort": "high"},
    extra_headers=respan_params_header,
)
print(response)

Streaming

response = client.responses.create(
    model="gpt-4o",
    instructions="You are a helpful assistant.",
    input="Hello!",
    stream=True,
    extra_headers=respan_params_header,
)

for chunk in response:
    print(chunk)

Functions

tools = [
    {
        "type": "function",
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location", "unit"],
        },
    }
]

response = client.responses.create(
    model="gpt-4o",
    tools=tools,
    input="What is the weather like in Boston today?",
    tool_choice="auto",
    extra_headers=respan_params_header,
)
print(response)
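When the model decides to call a function, the Responses API returns function-call entries in the response's output list, with the arguments serialized as a JSON string. A minimal sketch of pulling them out; the helper and the stand-in output items below are illustrative, not part of the SDK:

```python
import json
from types import SimpleNamespace

def extract_function_calls(output_items):
    """Collect (name, parsed_arguments) pairs from Responses API output items."""
    calls = []
    for item in output_items:
        if getattr(item, "type", None) == "function_call":
            calls.append((item.name, json.loads(item.arguments)))
    return calls

# Stand-in output item shaped like the API's function_call entries:
fake_output = [
    SimpleNamespace(
        type="function_call",
        name="get_current_weather",
        arguments='{"location": "Boston, MA", "unit": "celsius"}',
    )
]
calls = extract_function_calls(fake_output)
```

With a real response, pass `response.output` to `extract_function_calls`, run the named function with the parsed arguments, and send the result back in a follow-up request.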

Azure OpenAI

To call Azure OpenAI models, you don't need Azure's dedicated client; the easier path is to keep using the standard OpenAI client through Respan.
1. Go to [Respan Providers](https://platform.respan.ai/platform/api/providers)
2. Add your Azure OpenAI credentials
3. Configure your Azure deployment settings
4. Use Azure models through the same Respan endpoint

View your analytics

Access your Respan dashboard to see detailed analytics.

Next Steps