
Create chat completion

POST https://api.respan.ai/api/chat/completions
curl -X POST https://api.respan.ai/api/chat/completions \
  -H "Authorization: Bearer sk_live_xxxxx" \
  -H "X-Respan-Route-Provider: vertex_ai" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {}
    ],
    "model": "gpt-4o"
  }'
{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "What's in this image?"
    },
    {
      "type": "image_url",
      "image_url": {
        "url": "https://as1.ftcdn.net/v2/jpg/01/34/53/74/1000_F_134537443_VendrqyXIWyHrZgxdIsfyKUost734JDP.jpg"
      }
    }
  ]
}
Send a chat completion request through the Respan gateway. Supports 250+ models across OpenAI, Anthropic, Google, Azure, and more with automatic logging, fallbacks, caching, and prompt management.

Accepts all [OpenAI chat completion parameters](https://platform.openai.com/docs/apis/chat). Respan-specific parameters can be passed three ways:

  1. Top-level body fields - add directly to the request body
  2. Nested under respan_params - explicit namespacing to avoid conflicts
  3. Header X-Data-Respan-Params - base64-encoded JSON header

Merge order: top-level body fields > respan_params > header.
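
The three passing styles and the merge order above can be sketched in Python as follows. This is a minimal illustration, not the gateway's implementation: the parameter values are hypothetical, standard (not URL-safe) base64 is assumed for the header, and `merge_respan_params` is a made-up helper that only mirrors the documented precedence.

```python
import base64
import json

# Hypothetical Respan parameters, used only for illustration.
respan_params = {"customer_identifier": "user_123", "cache_enabled": True}

base_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
}

# 1. Top-level body fields: merged directly into the request body.
body_top_level = {**base_body, **respan_params}

# 2. Nested under respan_params: explicit namespacing.
body_nested = {**base_body, "respan_params": respan_params}

# 3. Header: JSON serialized, then base64-encoded (standard alphabet assumed).
encoded = base64.b64encode(json.dumps(respan_params).encode("utf-8")).decode("ascii")
headers = {"X-Data-Respan-Params": encoded}


def merge_respan_params(top_level, nested, header):
    """Mirror the documented precedence: top-level > respan_params > header."""
    merged = dict(header)     # lowest priority
    merged.update(nested)     # overrides header values
    merged.update(top_level)  # highest priority
    return merged
```

If the same key is sent in more than one place, the value from the highest-priority source is the one that applies.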

Legacy compatibility:

  • keywordsai_params is still accepted and merged into respan_params
  • X-Data-Keywordsai-Params is still accepted and auto-renamed internally

When using the OpenAI SDK, pass Respan parameters via extra_body.
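
A minimal sketch of the `extra_body` approach with the OpenAI Python SDK. The `base_url` is an assumption derived from the endpoint URL above (the SDK appends `/chat/completions` to it), and the helper names and parameter values are hypothetical.

```python
def build_create_kwargs(user_text, respan_params):
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": user_text}],
        # The SDK merges extra_body into the JSON request body,
        # so these arrive as top-level body fields.
        "extra_body": respan_params,
    }


def send(api_key, user_text, respan_params):
    """Send the request through the gateway (requires `pip install openai`)."""
    from openai import OpenAI  # imported lazily; third-party dependency

    # Assumed base URL: the documented endpoint minus /chat/completions.
    client = OpenAI(api_key=api_key, base_url="https://api.respan.ai/api")
    return client.chat.completions.create(**build_create_kwargs(user_text, respan_params))
```

For example, `send("YOUR_API_KEY", "Hello", {"customer_identifier": "user_123"})` would attach the customer identifier to the request.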

Headers

Authorization (string, required)

Bearer token. Use Bearer YOUR_API_KEY.

X-Data-Respan-Params (string, optional)

Base64-encoded JSON object of Respan parameters. The legacy X-Data-Keywordsai-Params header is still accepted.

X-Respan-Route-Provider (string, optional)

Pin the request to a specific provider without changing the model slug. Example: vertex_ai routes a claude-sonnet-4-5-20250929 request to Vertex AI Claude.

X-Respan-Beta (string, optional)

Comma-separated beta feature flags. Available: token-breakdown-2026-03-26, env-scoped-integrations-2026-03-28.
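
The headers above could be assembled along these lines. `build_headers` is a hypothetical helper, and joining the beta flags with a bare comma (no space) is an assumption about the expected formatting.

```python
def build_headers(api_key, route_provider=None, beta_flags=()):
    """Assemble request headers; optional headers are added only when set."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if route_provider:
        # Pins the request to one provider without changing the model slug.
        headers["X-Respan-Route-Provider"] = route_provider
    if beta_flags:
        # Beta flags are sent as a single comma-separated value.
        headers["X-Respan-Beta"] = ",".join(beta_flags)
    return headers


example_headers = build_headers(
    "YOUR_API_KEY",
    route_provider="vertex_ai",
    beta_flags=["token-breakdown-2026-03-26", "env-scoped-integrations-2026-03-28"],
)
```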

Request

This endpoint expects an object.
messages (list of objects, required)

Array of messages in the conversation. Each message has role (system, user, assistant, tool) and content.

model (string, required)

Model to use. See [Models](https://platform.respan.ai/platform/models) for available options.

stream (boolean, optional)

Stream back partial progress token by token as server-sent events.
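
A rough sketch of consuming a streamed response. It assumes the stream follows the OpenAI-compatible server-sent event layout (`data: {json}` lines, a `data: [DONE]` terminator, and text under `choices[].delta.content`), which this page does not spell out. The sample lines are invented for illustration.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Invented sample events, not captured gateway output.
sample_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
streamed_text = "".join(iter_stream_text(sample_lines))
```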

tools (list of objects, optional)

Tools the model may call. Currently only functions are supported.

tool_choice (object, optional)

Controls tool selection. "none" = no tools, "auto" = model decides, or specify a tool object.
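
As an illustration, a request body that declares one function tool and forces the model to call it might look like this. The tool definition follows the OpenAI function-calling shape, and `get_weather` and its schema are hypothetical.

```python
import json

# Hypothetical function tool in the OpenAI function-calling format.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

tool_request_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [weather_tool],
    # A tool object forces this specific function; use "auto" to let the
    # model decide or "none" to disable tool calls.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}

tool_request_json = json.dumps(tool_request_body)
```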

frequency_penalty (double, optional)

Penalizes tokens based on frequency in text so far (-2 to 2).

max_tokens (double, optional)

Maximum tokens to generate.

temperature (double, optional, defaults to 1)

Sampling temperature (0-2). Higher = more random.

n (double, optional, defaults to 1)

Number of completions to generate. Note: costs multiply with n.

logprobs (boolean, optional)

Return log probabilities of output tokens.

echo (boolean, optional)

Echo back the prompt in addition to the completion.

stop (list of strings, optional)

Stop sequences where generation halts.

presence_penalty (double, optional)

Penalizes tokens already present in text (-2 to 2).

logit_bias (object, optional)

Modifies the probability of specific tokens appearing in the response.

response_format (object, optional)

Output format. Set {"type": "json_schema", "json_schema": {...}} for structured output, or {"type": "json_object"} for JSON mode.
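
The two `response_format` variants could be built as below. The inner `json_schema` layout (`name`, `strict`, `schema`) follows the OpenAI structured-output convention, and the `city_info` schema is a hypothetical example.

```python
# JSON mode: the model returns a JSON object with no enforced schema.
json_mode = {"type": "json_object"}

# Structured output: the response is constrained to a JSON Schema.
structured_output = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_info",  # hypothetical schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["city", "population"],
            "additionalProperties": False,
        },
    },
}


def with_response_format(body, response_format):
    """Return a copy of the request body with response_format set."""
    return {**body, "response_format": response_format}


format_body = with_response_format(
    {"model": "gpt-4o", "messages": [{"role": "user", "content": "Describe Paris as JSON."}]},
    structured_output,
)
```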

parallel_tool_calls (boolean, optional)

Enable parallel function calling during tool use.

load_balance_group (object, optional)

Load balance group selection. Use {"group_id": "..."} to route through a configured group.

fallback_models (list of strings, optional)

Backup models (ranked by priority) if the primary model fails.

customer_credentials (object, optional)

Per-customer LLM provider credentials. Keys are provider names, values are API keys.

credential_override (object, optional)

One-off credential overrides per provider. Overrides uploaded provider keys for this request only.

cache_enabled (boolean, optional)

Enable response caching. See Caching.

cache_ttl (double, optional)

Cache time-to-live in seconds. Default: 30 days.

cache_options (object, optional)

Cache behavior options. Properties: cache_by_customer, is_cached_by_model, omit_log.
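
A small sketch of the caching fields on a request body. Treating `cache_by_customer` as a boolean is an assumption, since the property types are not listed here, and the identifier value is hypothetical.

```python
# The documented default TTL, 30 days, expressed in seconds.
DEFAULT_CACHE_TTL_SECONDS = 30 * 24 * 60 * 60

ONE_HOUR_SECONDS = 60 * 60

cached_request_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "customer_identifier": "user_123",  # hypothetical end user
    "cache_enabled": True,
    "cache_ttl": ONE_HOUR_SECONDS,  # expire after one hour instead of 30 days
    # Assumed boolean flag: keep a separate cache per customer.
    "cache_options": {"cache_by_customer": True},
}
```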

prompt (object, optional)

Prompt template config. Properties: `prompt_id` (required), `variables` (template variables), `version` (number, or `"latest"` for draft), `echo` (return rendered prompt), `override` (use override_params), `override_params` (OpenAI params to override), `schema_version` (`1` = legacy, `2` = prompt config wins). See [Prompt management](/docs/documentation/features/prompt-management/advanced).
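
One way to assemble the `prompt` object is sketched below. `build_prompt_config` is a hypothetical helper, and the prompt ID and variable names are placeholders. How `messages` interacts with a prompt template is not covered here, so the sketch only builds the `prompt` field itself.

```python
def build_prompt_config(prompt_id, variables=None, version=None, schema_version=2):
    """Build the prompt object; prompt_id is the only required property."""
    if not prompt_id:
        raise ValueError("prompt_id is required")
    config = {"prompt_id": prompt_id, "schema_version": schema_version}
    if variables:
        config["variables"] = dict(variables)  # values for template variables
    if version is not None:
        config["version"] = version  # a number, or "latest" for the draft
    return config


prompt_config = build_prompt_config(
    "YOUR_PROMPT_ID",                  # placeholder ID
    variables={"customer_name": "Ada"},  # hypothetical template variable
    version="latest",
)
```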
retry_params (object, optional)

Retry config. Properties: retry_enabled (boolean, required), num_retries (number), retry_after (seconds to wait).
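
Retries and fallback models can be combined on one request, roughly as follows. `with_resilience` is a hypothetical helper, the fallback model names are examples only, and the order in which the gateway applies retries versus fallbacks is not specified here.

```python
def with_resilience(body, fallback_models, num_retries=3, retry_after=1):
    """Return a copy of the body with retry and fallback settings added."""
    return {
        **body,
        # Backup models, highest priority first.
        "fallback_models": list(fallback_models),
        "retry_params": {
            "retry_enabled": True,       # required property
            "num_retries": num_retries,
            "retry_after": retry_after,  # seconds to wait between attempts
        },
    }


resilient_body = with_resilience(
    {"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello"}]},
    fallback_models=["gpt-4o-mini", "claude-sonnet-4-5-20250929"],
)
```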

disable_log (boolean, optional)

When true, omits input/output from the log. Metrics (tokens, cost, latency) are still recorded.

model_name_map (object, optional)

Azure deployment name mapping. Maps your custom Azure deployment names to standard model names.

models (list of strings, optional)

Model list for LLM router selection.

exclude_providers (list of strings, optional)

Providers to exclude from routing. All models under excluded providers are skipped.

exclude_models (list of strings, optional)

Specific models to exclude from routing.

metadata (object, optional)

Custom key-value metadata attached to the span.

custom_identifier (string, optional)

Indexed custom tag for fast querying.

customer_identifier (string, optional, <=254 characters)

End user identifier for analytics and budgets.

customer_params (object, optional)

Extended customer info. Properties: customer_identifier (required), group_identifier, name, email, period_budget, budget_duration (daily/weekly/monthly), total_budget, markup_percentage.
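
A hedged sketch of building `customer_params` with light client-side validation. `build_customer_params` is a hypothetical helper. Applying the 254-character limit to the nested `customer_identifier` is an assumption carried over from the top-level field, and the budget value is illustrative (its unit is not stated here).

```python
ALLOWED_BUDGET_DURATIONS = {"daily", "weekly", "monthly"}
MAX_IDENTIFIER_LENGTH = 254  # limit documented for the top-level field


def build_customer_params(customer_identifier, **extra):
    """Build customer_params, checking the documented constraints."""
    if not customer_identifier:
        raise ValueError("customer_identifier is required")
    if len(customer_identifier) > MAX_IDENTIFIER_LENGTH:
        raise ValueError("customer_identifier exceeds 254 characters")
    duration = extra.get("budget_duration")
    if duration is not None and duration not in ALLOWED_BUDGET_DURATIONS:
        raise ValueError("budget_duration must be daily, weekly, or monthly")
    return {"customer_identifier": customer_identifier, **extra}


customer_params = build_customer_params(
    "user_123",
    name="Ada",
    period_budget=10,  # illustrative value
    budget_duration="monthly",
)
```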

request_breakdown (boolean, optional)

Return response metrics summary in the response body. For streaming, metrics appear in the final chunk.

positive_feedback (boolean, optional)

User feedback. true = liked, false = disliked.

load_balance_models (list of objects, optional)

Inline load balancing options. Each item can include model, weight, and optional credentials.
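
For illustration, an inline load-balancing list and the traffic split it would imply if weights are treated as relative shares. That interpretation is an assumption, `expected_shares` is a hypothetical helper, and the model names are examples.

```python
load_balance_models = [
    {"model": "gpt-4o", "weight": 3},
    {"model": "gpt-4o-mini", "weight": 1},
]


def expected_shares(entries):
    """Normalize weights into fractions, assuming they are relative shares."""
    total = sum(entry["weight"] for entry in entries)
    return {entry["model"]: entry["weight"] / total for entry in entries}


balanced_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "load_balance_models": load_balance_models,
}

shares = expected_shares(load_balance_models)
```

Under that assumption, weights of 3 and 1 would send about three quarters of requests to the first model.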

thread_identifier (string, optional)

Conversation thread ID. Spans with the same thread_identifier are grouped together.

properties (object, optional)

Typed metadata that preserves native types (numbers, booleans, nested objects), unlike metadata, which coerces values to strings.

retries (integer, optional, defaults to 0)

Number of retries on failure.

weight (double, optional)

Load balancing weight.

span_name (string, optional)

Custom span name for tracing.

respan_params (object, optional)

Namespaced container for all Respan parameters. Alternative to passing them at top level.

Response

Successful response for Create chat completion.

role (string)

content (list of objects)

Errors

401 Unauthorized Error
