  1. Sign up — Create an account at platform.respan.ai
  2. Create an API key — Generate one on the API keys page
  3. Add credits or a provider key — Add credits on the Credits page or connect your own provider key on the Integrations page
Add the Docs MCP to your AI coding tool to get help building with Respan. No API key needed.
{
  "mcpServers": {
    "respan-docs": {
      "url": "https://docs.respan.ai/mcp"
    }
  }
}

Overview

The Respan gateway provides an OpenAI-compatible API endpoint that gives you access to 250+ models from all major providers through a single API key and base URL.
| Endpoint | Base URL |
| --- | --- |
| OpenAI-compatible | https://api.respan.ai/api/ |
| Anthropic proxy | https://api.respan.ai/api/anthropic/ |
| Google Gemini proxy | https://api.respan.ai/api/google/gemini |
Environment Switching: Respan doesn’t support an env parameter in API calls. To switch between environments (test/production), use different API keys — one for your test environment and another for production. Manage keys in API Keys settings.
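The key-per-environment approach can be sketched in a few lines. The RESPAN_TEST_API_KEY and RESPAN_PROD_API_KEY variable names below are illustrative, not something Respan itself requires:

```python
import os

def respan_api_key(env=None):
    """Return the Respan API key for the given environment.

    Assumes two keys are stored in RESPAN_TEST_API_KEY and
    RESPAN_PROD_API_KEY (names are this example's convention).
    """
    env = env or os.environ.get("APP_ENV", "test")
    name = "RESPAN_PROD_API_KEY" if env == "production" else "RESPAN_TEST_API_KEY"
    return os.environ[name]
```

Because the two keys point at separate environments, requests are kept apart in logs and billing without any extra request parameter.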

Quickstart

Step 1: Set environment variables

export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"

Step 2: Make a request

import os
import requests

def demo_call(prompt,
              model="gpt-4o-mini",
              token=None):
    # Falls back to the RESPAN_API_KEY environment variable set in Step 1.
    token = token or os.environ["RESPAN_API_KEY"]
    headers = {
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {token}',
    }

    data = {
        'model': model,
        'messages': [{'role': 'user', 'content': prompt}],
    }

    response = requests.post(
        'https://api.respan.ai/api/chat/completions',
        headers=headers,
        json=data,
    )
    return response

print(demo_call("Say 'Hello World'").json())

Step 3: Verify

Open the Logs page to see your gateway requests.

Switch models

Change the model parameter to use any supported provider through the same endpoint:
# OpenAI
model = "gpt-4o"
# Anthropic
# model = "claude-sonnet-4-5-20250929"
# Google
# model = "gemini-1.5-pro"
# DeepSeek
# model = "deepseek-chat"

response = requests.post(
    'https://api.respan.ai/api/chat/completions',
    headers=headers,
    json={'model': model, 'messages': [{'role': 'user', 'content': 'Hello'}]},
)
Browse the full model list to see all available models.
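The provider-agnostic shape of the request can be made explicit with a small helper; only the model string changes, never the endpoint or message format:

```python
def chat_payload(model, prompt):
    """Build an OpenAI-style request body; only the model string
    differs between providers on the Respan gateway."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# The same payload shape works for every provider:
for m in ("gpt-4o", "claude-sonnet-4-5-20250929", "gemini-1.5-pro", "deepseek-chat"):
    body = chat_payload(m, "Hello")
```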

OpenAI-compatible parameters

All standard OpenAI chat completion parameters are supported:
| Parameter | Type | Description |
| --- | --- | --- |
| messages | array | List of messages in OpenAI format (role + content). |
| model | string | Model to use (e.g. gpt-4o-mini, claude-sonnet-4-5-20250929). |
| stream | boolean | Stream back partial progress token by token. |
| temperature | number | Controls randomness (0-2). |
| max_tokens | number | Maximum tokens to generate. |
| top_p | number | Nucleus sampling threshold. |
| frequency_penalty | number | Penalize tokens by existing frequency. |
| presence_penalty | number | Penalize tokens by whether they appear in the text so far. |
| stop | array | Stop sequences. |
| tools | array | List of tools/functions the model may call. |
| tool_choice | string or object | Controls tool selection (none, auto, or a specific tool). |
| response_format | object | Force JSON output (json_object, json_schema, or text). |
| n | number | Number of completions to generate. |
| logprobs | boolean | Return log probabilities of output tokens. |
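As an illustration, a request that pins down sampling and forces JSON output could combine several of these parameters (the values here are only examples):

```json
{
  "model": "gpt-4o-mini",
  "messages": [{"role": "user", "content": "Reply with a JSON object"}],
  "temperature": 0.2,
  "max_tokens": 256,
  "response_format": {"type": "json_object"}
}
```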

Respan parameters

Pass Respan-specific parameters in the request body alongside OpenAI parameters. When using the OpenAI SDK, pass them via extra_body.
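A minimal sketch of how the two parameter sets combine over plain HTTP: the Respan fields are merged into the same JSON object as the OpenAI parameters. With the OpenAI SDK you would pass the same dict as extra_body= instead of merging it yourself. The helper name is this example's own, not part of any SDK:

```python
def with_respan_params(body, **respan_params):
    """Merge Respan-specific fields into an OpenAI-style request body.

    Over plain HTTP the fields sit alongside the OpenAI parameters
    in one JSON object; with the OpenAI SDK the same dict goes in
    extra_body= instead.
    """
    merged = dict(body)
    merged.update(respan_params)
    return merged

body = with_respan_params(
    {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]},
    customer_identifier="user_123",
)
```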

Observability

| Parameter | Type | Description |
| --- | --- | --- |
| customer_identifier | string | Tag to identify the user. See customer identifier. |
| metadata | object | Custom key-value pairs for filtering and search. See custom properties. |
| custom_identifier | string | Extra indexed tag (shows as "Custom ID" in spans). |
| disable_log | boolean | When true, only metrics are recorded; input/output messages are omitted. |
| request_breakdown | boolean | Include a breakdown of the response (tokens, cost, latency). |
{
  "model": "gpt-4o-mini",
  "messages": [{"role": "user", "content": "Hello"}],
  "customer_identifier": "user_123",
  "metadata": {"session_id": "abc123", "team": "ml"},
  "custom_identifier": "feature-x"
}

Reliability

| Parameter | Type | Description |
| --- | --- | --- |
| fallback_models | array | Backup models ranked by priority. See fallback models. |
| load_balance_group | object | Balance requests across models. See load balancing. |
| retry_params | object | Configure retries (retry_enabled, num_retries, retry_after). See retries. |
{
  "model": "gpt-4o-mini",
  "messages": [{"role": "user", "content": "Hello"}],
  "fallback_models": ["gemini-1.5-pro", "claude-sonnet-4-5-20250929"],
  "retry_params": {
    "retry_enabled": true,
    "num_retries": 3,
    "retry_after": 1
  }
}

Caching

| Parameter | Type | Description |
| --- | --- | --- |
| cache_enabled | boolean | Enable response caching. See caches. |
| cache_ttl | number | Cache time-to-live in seconds (default: 30 days). |
| cache_options | object | Set cache_by_customer: true to scope the cache per customer. |
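A request enabling per-customer caching might look like the sketch below; the 600-second TTL and identifier values are illustrative only:

```json
{
  "model": "gpt-4o-mini",
  "messages": [{"role": "user", "content": "Hello"}],
  "cache_enabled": true,
  "cache_ttl": 600,
  "cache_options": {"cache_by_customer": true},
  "customer_identifier": "user_123"
}
```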

Credentials

| Parameter | Type | Description |
| --- | --- | --- |
| customer_credentials | object | Pass your customer's provider API keys. See provider keys. |
| credential_override | object | One-off credential overrides for specific models (e.g. Azure deployments). |
| model_name_map | object | Map default model names to custom Azure deployment names. |
{
  "model": "azure/gpt-4o",
  "credential_override": {
    "azure/gpt-4o": {
      "api_key": "your-azure-key",
      "api_base": "your-azure-base-url",
      "api_version": "2024-02-01"
    }
  }
}

Prompt management

| Parameter | Type | Description |
| --- | --- | --- |
| prompt | object | Use a Respan-managed prompt template. See prompt management. |
{
  "model": "gpt-4o-mini",
  "messages": [],
  "prompt": {
    "prompt_id": "your-prompt-id",
    "variables": {
      "user_name": "Sarah"
    }
  }
}

Response format

{
  "id": "chatcmpl-e1b9665b-c354-41c5-bbe5-178bd0b69773",
  "object": "chat.completion",
  "created": 1761546960,
  "model": "claude-sonnet-4-5-20250929",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "I'm doing well, thank you for asking! How can I help you today?"
      }
    }
  ],
  "usage": {
    "completion_tokens": 20,
    "prompt_tokens": 2619,
    "total_tokens": 2639,
    "completion_tokens_details": {
      "accepted_prediction_tokens": 0,
      "audio_tokens": 0,
      "reasoning_tokens": 0,
      "rejected_prediction_tokens": 0
    },
    "prompt_tokens_details": {
      "audio_tokens": 0,
      "cached_tokens": 2601,
      "cache_creation_tokens": 0
    },
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 2601
  }
}
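Because the response follows the OpenAI chat.completion shape, the fields of interest can be pulled out with plain dictionary access. A small sketch (the helper name is this example's own):

```python
def extract_reply(completion):
    """Pull the assistant text and token usage out of a
    chat.completion response shaped like the sample above."""
    text = completion["choices"][0]["message"]["content"]
    usage = completion["usage"]
    return text, usage["total_tokens"], usage["prompt_tokens_details"]["cached_tokens"]
```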

Next Steps