Provider: Fireworks AI

Call Fireworks AI models through Respan Gateway with unified logs, cost, and latency.
This page is for Respan LLM Gateway users.

Use Respan Gateway to call Fireworks AI models (deepseek-v3-0324, gpt-oss-120b, llama-v3p3-70b-instruct, and more) while keeping unified observability (logs, cost, latency, reliability) in Respan.

Quick setup

1. Get a Respan API key

Sign up and create a key on the API keys page.

2. Send your first request

Pick the integration that matches your stack. The base URL is https://api.respan.ai/api and the only key needed is your RESPAN_API_KEY.

The Fireworks Python SDK follows the OpenAI client interface and accepts a base_url override. Point it at the Respan Gateway to keep your existing Fireworks code.

Python
from fireworks.client import Fireworks

client = Fireworks(
    api_key="YOUR_RESPAN_API_KEY",
    base_url="https://api.respan.ai/api",
)

response = client.chat.completions.create(
    model="fireworks_ai/llama-v3p3-70b-instruct",
    messages=[{"role": "user", "content": "Hello, Fireworks!"}],
)
print(response.choices[0].message.content)
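
To avoid hard-coding the key, the client arguments can be read from the environment. This is a minimal sketch, assuming the key is exported as the RESPAN_API_KEY environment variable; the helper name respan_client_kwargs is hypothetical and not part of any SDK.

```python
import os

# Gateway base URL from the setup step above.
RESPAN_BASE_URL = "https://api.respan.ai/api"


def respan_client_kwargs(env=None):
    """Return api_key/base_url keyword arguments for a gateway client.

    Hypothetical helper: reads RESPAN_API_KEY from `env` (defaults to os.environ).
    """
    env = os.environ if env is None else env
    api_key = env.get("RESPAN_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("Set RESPAN_API_KEY before creating the client.")
    return {"api_key": api_key, "base_url": RESPAN_BASE_URL}
```

With this in place, the client above could be created as Fireworks(**respan_client_kwargs()).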

More integrations

Fireworks AI models work with every Respan Gateway integration.

Switch models

Change the model parameter to call any supported model through the same client. Use the fireworks_ai/ prefix to disambiguate when routing across providers. Browse the full list on the Models page.

messages = [{"role": "user", "content": "Hello, Fireworks!"}]

# Fireworks AI models
client.chat.completions.create(model="fireworks_ai/deepseek-v3-0324", messages=messages)
client.chat.completions.create(model="fireworks_ai/gpt-oss-120b", messages=messages)
client.chat.completions.create(model="fireworks_ai/llama-v3p3-70b-instruct", messages=messages)
# Models from other providers, through the same client
client.chat.completions.create(model="openai/gpt-5.5", messages=messages)
client.chat.completions.create(model="anthropic/claude-sonnet-4-5", messages=messages)
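
Because only the model string changes, sending the same prompt to several models is a short loop. The sketch below is one way to do that; the function name ask_each is hypothetical, and it assumes the client exposes chat.completions.create as in the examples above.

```python
def ask_each(client, models, messages):
    """Send the same messages to each model and collect the replies by model name."""
    replies = {}
    for model in models:
        response = client.chat.completions.create(model=model, messages=messages)
        replies[model] = response.choices[0].message.content
    return replies
```

For example, ask_each(client, ["fireworks_ai/gpt-oss-120b", "fireworks_ai/deepseek-v3-0324"], messages) would return a dict keyed by model name, which makes side-by-side comparison straightforward.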

Use your own Fireworks AI key (BYOK)

By default, requests are billed against your Respan credits. If you would rather be billed by Fireworks AI directly, attach your own provider key.

1. Open Providers

Go to the Providers page.

2. Add Fireworks AI

Select Fireworks AI and paste your Fireworks AI API key (fireworks.api_key). You can create one on the Fireworks API keys page.

3. Load balancing (Optional)

Add multiple credential sets and use Load balancing weight to distribute traffic across them.
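
To build intuition for how weights translate into traffic share, the sketch below picks a credential set with probability proportional to its weight. It illustrates weighted random selection in general and is not Respan's actual routing logic; the credential names and weights are made up.

```python
import random
from collections import Counter


def pick_credential(weights, rng=random):
    """Pick one credential name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def expected_share(weights):
    """Return the long-run fraction of traffic each credential should receive."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical credential sets: key_a should see about three times the traffic of key_b.
weights = {"key_a": 3, "key_b": 1}
rng = random.Random(0)  # seeded so repeated runs pick the same sequence
counts = Counter(pick_credential(weights, rng) for _ in range(10_000))
print(expected_share(weights))  # {'key_a': 0.75, 'key_b': 0.25}
print(counts)
```

Under this model, a credential set's share is its weight divided by the sum of all weights, so doubling every weight leaves the distribution unchanged.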

Override credentials per model (Optional)

Use credential_override when one model on a request should use a different Fireworks AI key than the default.

{
  "customer_credentials": {
    "fireworks": { "api_key": "YOUR_FIREWORKS_API_KEY" }
  },
  "credential_override": {
    "fireworks_ai/llama-v3p3-70b-instruct": { "api_key": "ANOTHER_FIREWORKS_API_KEY" }
  }
}
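
When the request body is assembled in Python, the same fields can be built as a dict and merged into the payload. This is a sketch under the assumption that the gateway reads customer_credentials and credential_override as top-level body fields, as the JSON above suggests. The helper name is hypothetical, and how you attach extra body fields depends on the client you use.

```python
import json


def with_fireworks_credentials(body, default_key, overrides=None):
    """Return a copy of `body` with BYOK credential fields added.

    `overrides` maps a model name to the API key that model should use.
    """
    merged = dict(body)
    merged["customer_credentials"] = {"fireworks": {"api_key": default_key}}
    if overrides:
        merged["credential_override"] = {
            model: {"api_key": key} for model, key in overrides.items()
        }
    return merged


body = with_fireworks_credentials(
    {
        "model": "fireworks_ai/llama-v3p3-70b-instruct",
        "messages": [{"role": "user", "content": "Hello, Fireworks!"}],
    },
    default_key="YOUR_FIREWORKS_API_KEY",
    overrides={"fireworks_ai/llama-v3p3-70b-instruct": "ANOTHER_FIREWORKS_API_KEY"},
)
print(json.dumps(body, indent=2))
```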

Log without proxying (Optional)

Already calling Fireworks AI directly? Send logs to Respan asynchronously to track cost, latency, and performance for those external calls.

import requests

# Report a call made directly to Fireworks AI so it shows up in Respan logs.
response = requests.post(
    "https://api.respan.ai/api/request-logs/create/",
    headers={
        "Authorization": "Bearer YOUR_RESPAN_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "fireworks_ai/llama-v3p3-70b-instruct",
        "prompt_messages": [{"role": "user", "content": "Hello, how are you?"}],
        "completion_message": {"role": "assistant", "content": "Hello from Fireworks AI through Respan."},
        "cost": 0.001,
        "generation_time": 1.2,
        "customer_params": {"customer_identifier": "user_123"},
    },
    timeout=10,  # avoid hanging if the logging endpoint is slow
)
response.raise_for_status()  # surface HTTP errors instead of failing silently
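
In practice the logged values come from the real call rather than constants. The sketch below times a call and assembles the payload using the field names from the example above. The helper names are hypothetical, the lambda is a stand-in for your direct Fireworks AI call, and it assumes generation_time is measured in seconds.

```python
import time


def build_log_payload(model, prompt_messages, completion_text, generation_time,
                      cost=None, customer_identifier=None):
    """Assemble a request-log payload using the field names from the example above."""
    payload = {
        "model": model,
        "prompt_messages": prompt_messages,
        "completion_message": {"role": "assistant", "content": completion_text},
        "generation_time": round(generation_time, 3),  # assumed to be seconds
    }
    if cost is not None:
        payload["cost"] = cost
    if customer_identifier is not None:
        payload["customer_params"] = {"customer_identifier": customer_identifier}
    return payload


def timed(call):
    """Run `call()` and return (result, elapsed_seconds)."""
    started = time.perf_counter()
    result = call()
    return result, time.perf_counter() - started


# Stand-in for a direct Fireworks AI call; replace with your real client call.
reply, elapsed = timed(lambda: "Hello from Fireworks AI.")
payload = build_log_payload(
    "fireworks_ai/llama-v3p3-70b-instruct",
    [{"role": "user", "content": "Hello, how are you?"}],
    reply,
    elapsed,
    customer_identifier="user_123",
)
```

The resulting dict can be passed as the json argument of the requests.post call above.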

See the logging guide for the full setup.