Provider: Together AI

This page is for Respan LLM Gateway users.

Use Respan Gateway to call Together AI models (meta-llama/Llama-3.3-70B-Instruct-Turbo, Qwen/Qwen2.5-72B-Instruct-Turbo, deepseek-ai/DeepSeek-V3, and more) while keeping unified observability (logs, cost, latency, reliability) in Respan.

Quick setup

Get a Respan API key

Add credits (recommended)

Top up credits to pay through Respan. No Together AI key is required; Respan handles provider auth and billing.

Prefer to route through your own Together AI account? See "Use your own Together AI key (BYOK)" below.

Send your first request

Pick the integration that matches your stack. The base URL is https://api.respan.ai/api and the only key needed is your RESPAN_API_KEY.

Together SDK (Python)

OpenAI SDK

Respan API

The Together Python SDK accepts a base_url override. Point it at the Respan gateway to keep your existing Together code.

Python

from together import Together

# Authenticate with your Respan key and send traffic to the Respan gateway.
client = Together(
    api_key="YOUR_RESPAN_API_KEY",
    base_url="https://api.respan.ai/api",
)

# The together_ai/ prefix selects Together AI as the provider.
response = client.chat.completions.create(
    model="together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Hello, Together!"}],
)
print(response.choices[0].message.content)
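If you would rather call the gateway over plain HTTP, the same request can be sketched with the Python standard library. This is a minimal sketch, not the documented Respan API: the /chat/completions path is an assumption (an OpenAI-compatible route under the base URL above), and build_chat_request and send_chat_request are hypothetical helper names. The Bearer header mirrors the one used for the logging endpoint later on this page.

```python
import json
import urllib.request

RESPAN_BASE_URL = "https://api.respan.ai/api"  # base URL from this page


def build_chat_request(api_key, model, messages):
    """Build (but do not send) a chat completion request for the gateway."""
    body = {"model": model, "messages": messages}
    return urllib.request.Request(
        # Assumption: an OpenAI-compatible path under the base URL.
        url=f"{RESPAN_BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_chat_request(
    api_key="YOUR_RESPAN_API_KEY",
    model="together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Hello, Together!"}],
)
# Uncomment to make the network call with a real key:
# reply = send_chat_request(request)
# print(reply["choices"][0]["message"]["content"])
```

Building the request separately from sending it lets you inspect the URL, headers, and body before any traffic leaves your machine.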

More integrations

Together AI models work with every Respan gateway integration.

Switch models

Change the model parameter to call any supported model through the same client. Use the together_ai/ prefix to disambiguate when routing across providers. Browse the full list on the Models page.

client.chat.completions.create(model="together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo", messages=messages)
client.chat.completions.create(model="together_ai/Qwen/Qwen2.5-72B-Instruct-Turbo", messages=messages)
client.chat.completions.create(model="together_ai/deepseek-ai/DeepSeek-V3", messages=messages)
client.chat.completions.create(model="openai/gpt-5.5", messages=messages)
client.chat.completions.create(model="anthropic/claude-sonnet-4-5", messages=messages)
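Because Together AI model names already contain a slash (organization/model), it is easy to forget the provider prefix. The sketch below adds it once, idempotently. with_provider is a hypothetical helper for illustration, not part of any SDK.

```python
def with_provider(model, provider="together_ai"):
    """Return `model` prefixed with `provider/` unless it already has that prefix."""
    prefix = f"{provider}/"
    return model if model.startswith(prefix) else prefix + model


# Together AI model names as listed on this page, without the gateway prefix.
TOGETHER_MODELS = [
    "meta-llama/Llama-3.3-70B-Instruct-Turbo",
    "Qwen/Qwen2.5-72B-Instruct-Turbo",
    "deepseek-ai/DeepSeek-V3",
]

gateway_models = [with_provider(name) for name in TOGETHER_MODELS]
for name in gateway_models:
    print(name)
```

Each resulting string can be passed as the model parameter in the calls above.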

Use your own Together AI key (BYOK)

Credits are the default path. If you'd rather be billed by Together AI directly, attach your own provider key.

Global (UI)

Per-request (Code)

Open Providers

Go to the Providers page.

Add Together AI

Select Together AI and paste your togetherai.api_key. Grab one from the Together API keys page.

Load balancing (Optional)

Add multiple credential sets and use Load balancing weight to distribute traffic across them.
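To build intuition for what a weight means, here is a small illustration of weighted selection. It assumes traffic is split in proportion to the weights; it does not describe how Respan actually implements load balancing. The labels key_a and key_b and the helper pick_credential are hypothetical.

```python
import random
from collections import Counter


def pick_credential(weighted_keys, rng=random):
    """Pick one credential label with probability proportional to its weight."""
    labels = list(weighted_keys)
    weights = [weighted_keys[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


# Hypothetical credential sets: key_a has three times the weight of key_b.
weighted_keys = {"key_a": 3, "key_b": 1}

rng = random.Random(0)  # seeded so the simulation is repeatable
counts = Counter(pick_credential(weighted_keys, rng) for _ in range(1000))
print(counts)
```

Over many requests, key_a should receive roughly three quarters of the traffic under this assumption.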

Override credentials per model (Optional)

Use credential_override when one model on a request should use a different Together AI key than the default.

{
  "customer_credentials": {
    "togetherai": { "api_key": "YOUR_TOGETHERAI_API_KEY" }
  },
  "credential_override": {
    "together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo": { "api_key": "ANOTHER_TOGETHERAI_API_KEY" }
  }
}
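The fields above can be assembled in code before a request is sent. The following is a sketch under one assumption: that customer_credentials and credential_override sit at the top level of the request body next to model and messages. How you attach them depends on the client you use. credential_fields and with_credentials are hypothetical helper names.

```python
import json


def credential_fields(default_key, overrides=None):
    """Build the BYOK fields: a default Together AI key plus optional per-model keys."""
    fields = {"customer_credentials": {"togetherai": {"api_key": default_key}}}
    if overrides:
        fields["credential_override"] = {
            model: {"api_key": key} for model, key in overrides.items()
        }
    return fields


def with_credentials(body, fields):
    """Return a copy of the request body with the credential fields merged in."""
    # Assumption: the credential fields are top-level keys of the request body.
    return {**body, **fields}


body = {
    "model": "together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo",
    "messages": [{"role": "user", "content": "Hello, Together!"}],
}
fields = credential_fields(
    default_key="YOUR_TOGETHERAI_API_KEY",
    overrides={
        "together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo": "ANOTHER_TOGETHERAI_API_KEY",
    },
)
payload = with_credentials(body, fields)
print(json.dumps(payload, indent=2))
```

In real code, read the keys from environment variables or a secrets manager rather than hard-coding them.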

Log without proxying (Optional)

Already calling Together AI directly? Send logs to Respan asynchronously to track cost, latency, and performance for those external calls.

import requests

# Report a call made directly to Together AI so it shows up in Respan.
response = requests.post(
    "https://api.respan.ai/api/request-logs/create/",
    headers={
        "Authorization": "Bearer YOUR_RESPAN_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo",
        "prompt_messages": [{"role": "user", "content": "Hello, how are you?"}],
        "completion_message": {"role": "assistant", "content": "Hello from Together AI through Respan."},
        "cost": 0.001,
        "generation_time": 1.2,
        "customer_params": {"customer_identifier": "user_123"},
    },
    timeout=10,  # avoid hanging if the logging endpoint is slow
)
response.raise_for_status()  # surface failed log submissions

See the logging guide for the full setup.
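One way to keep logging off the request path is to time the direct call, build the log payload, and hand it to a background worker. This is a sketch, not the logging guide's method. The field names come from the example above; treating generation_time as seconds is an assumption, and timed, build_log_payload, and log_in_background are hypothetical helpers. The send argument is any callable that posts the payload, for example a small wrapper around the requests.post call shown earlier.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(call):
    """Run `call()` and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = call()
    return result, time.perf_counter() - start


def build_log_payload(model, prompt_messages, completion_text,
                      generation_time, customer_identifier=None, cost=None):
    """Assemble a log payload using the field names from the example above."""
    payload = {
        "model": model,
        "prompt_messages": prompt_messages,
        "completion_message": {"role": "assistant", "content": completion_text},
        # Assumption: generation_time is expressed in seconds.
        "generation_time": round(generation_time, 3),
    }
    if cost is not None:
        payload["cost"] = cost
    if customer_identifier is not None:
        payload["customer_params"] = {"customer_identifier": customer_identifier}
    return payload


_executor = ThreadPoolExecutor(max_workers=2)


def log_in_background(send, payload):
    """Submit `send(payload)` to a worker thread so the caller is not blocked."""
    return _executor.submit(send, payload)


# Demo with stand-ins: a fake model call and a sender that records payloads.
sent = []
text, elapsed = timed(lambda: "Hello from Together AI.")
payload = build_log_payload(
    model="together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo",
    prompt_messages=[{"role": "user", "content": "Hello, how are you?"}],
    completion_text=text,
    generation_time=elapsed,
    customer_identifier="user_123",
)
future = log_in_background(sent.append, payload)
future.result()  # wait here only so the demo can inspect the result
```

In production you would usually not wait on the future; check it later or attach a callback to record failures.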