Provider: Nextbit

Call Nextbit models through Respan Gateway with unified logs, cost, and latency.
This page is for Respan LLM Gateway users.

Use Respan Gateway to call Nextbit-hosted models (open-source LLMs served through Nextbit’s inference platform) while keeping unified observability (logs, cost, latency, reliability) in Respan.

Quick setup

1. Get a Respan API key

Sign up and create a key on the API keys page.

2. Send your first request

Pick the integration that matches your stack. The base URL is https://api.respan.ai/api and the only key needed is your RESPAN_API_KEY.

Nextbit is OpenAI-compatible. Point the OpenAI SDK at the Respan gateway and call any Nextbit model.

from openai import OpenAI

# Point the OpenAI SDK at the Respan gateway.
client = OpenAI(
    api_key="YOUR_RESPAN_API_KEY",
    base_url="https://api.respan.ai/api",
)

# The nextbit256/ prefix selects a Nextbit-hosted model.
response = client.chat.completions.create(
    model="nextbit256/cydonia-2b",
    messages=[{"role": "user", "content": "Hello, Nextbit!"}],
)
print(response.choices[0].message.content)
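
If you would rather not hard-code the key, the sketch below reads it from the environment. The helper name gateway_client_kwargs and the choice of an environment variable named RESPAN_API_KEY are assumptions made for illustration; only the base URL comes from this page.

```python
import os

RESPAN_BASE_URL = "https://api.respan.ai/api"  # base URL quoted on this page


def gateway_client_kwargs(env=None):
    """Return keyword arguments for OpenAI(...) that target the Respan gateway.

    Hypothetical helper: reads the key from RESPAN_API_KEY (an assumed
    environment variable name) instead of embedding it in source code.
    """
    env = os.environ if env is None else env
    api_key = env.get("RESPAN_API_KEY")
    if not api_key:
        raise RuntimeError("Set RESPAN_API_KEY before creating the client.")
    return {"api_key": api_key, "base_url": RESPAN_BASE_URL}
```

With this in place, the client can be created as OpenAI(**gateway_client_kwargs()).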

More integrations

Nextbit models work with every Respan gateway integration.

Switch models

Change the model parameter to call any supported model through the same client. Use the nextbit256/ prefix to disambiguate when routing across providers. Browse the full list on the Models page.

# Reuses the client created above; only the model string changes.
messages = [{"role": "user", "content": "Hello!"}]

client.chat.completions.create(model="nextbit256/cydonia-2b", messages=messages)
client.chat.completions.create(model="openai/gpt-5.5", messages=messages)
client.chat.completions.create(model="anthropic/claude-sonnet-4-5", messages=messages)
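
Because only the model string changes, sending one prompt to several models is a short loop. The sketch below assumes a client configured as in the first request; compare_models is a hypothetical helper name, not part of any SDK.

```python
def compare_models(client, models, messages):
    """Send the same messages to each model and collect the reply text.

    `client` is expected to be an OpenAI SDK client pointed at the gateway.
    Returns a dict mapping each model ID to its first reply.
    """
    replies = {}
    for model in models:
        response = client.chat.completions.create(model=model, messages=messages)
        replies[model] = response.choices[0].message.content
    return replies
```

Calling compare_models(client, ["nextbit256/cydonia-2b", "openai/gpt-5.5"], messages) then returns one reply per model ID.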

Use your own Nextbit key (BYOK)

Credits are the default path. If you’d rather bill Nextbit directly, attach your own provider key.

1. Open Providers

Go to the Providers page.

2. Add Nextbit

Select Nextbit and paste your nextbit256.api_key.

3. Load balancing (Optional)

Add multiple credential sets and use Load balancing weight to distribute traffic across them.
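
To reason about a set of weights before saving them, the sketch below computes the share of traffic each credential set would receive. It assumes traffic is split in proportion to the weights, which this page does not state explicitly; traffic_shares and the credential names are hypothetical.

```python
def traffic_shares(weights):
    """Convert load-balancing weights into expected traffic fractions.

    Assumes a proportional split (weight / sum of weights). This is an
    illustration, not the gateway's actual selection logic.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("At least one weight must be positive.")
    return {name: weight / total for name, weight in weights.items()}
```

Under that assumption, weights of 3 and 1 would send three quarters of requests to the first credential set and one quarter to the second.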

Override credentials per model (Optional)

Use credential_override when one model on a request should use a different Nextbit key than the default.

{
  "customer_credentials": {
    "nextbit256": { "api_key": "YOUR_NEXTBIT256_API_KEY" }
  },
  "credential_override": {
    "nextbit256/cydonia-2b": { "api_key": "ANOTHER_NEXTBIT256_API_KEY" }
  }
}
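
The precedence described above can be sketched as a lookup: a per-model entry in credential_override wins, otherwise the provider default in customer_credentials applies. The function below only illustrates that rule and is not the gateway's code; resolve_api_key is a hypothetical name.

```python
def resolve_api_key(body, model):
    """Return the provider key that would apply to `model`.

    Illustrative only: checks credential_override for the exact model ID
    first, then falls back to the provider entry in customer_credentials.
    """
    override = body.get("credential_override", {}).get(model)
    if override:
        return override["api_key"]
    provider = model.split("/", 1)[0]  # "nextbit256/cydonia-2b" -> "nextbit256"
    return body["customer_credentials"][provider]["api_key"]
```

With the JSON above, nextbit256/cydonia-2b resolves to the override key, and any other nextbit256/ model resolves to the default key.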

Log without proxying (Optional)

Already calling Nextbit directly? Send logs to Respan asynchronously to track cost, latency, and performance for those external calls.

import requests

# Report a call made directly to Nextbit so it appears in Respan logs.
requests.post(
    "https://api.respan.ai/api/request-logs/create/",
    headers={
        "Authorization": "Bearer YOUR_RESPAN_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "nextbit256/cydonia-2b",
        "prompt_messages": [{"role": "user", "content": "Hello, how are you?"}],
        "completion_message": {"role": "assistant", "content": "Hello from Nextbit through Respan."},
        "cost": 0.001,
        "generation_time": 1.2,
        "customer_params": {"customer_identifier": "user_123"},
    },
)
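
In practice, the values in that payload come from the call you just made. The sketch below builds the same body from measured timestamps. build_log_payload is a hypothetical helper, and treating generation_time as seconds is an assumption based on the example value above.

```python
def build_log_payload(model, prompt_messages, completion_text,
                      started, finished, cost, customer_identifier):
    """Assemble a request-log body with the fields shown in the example above.

    `started` and `finished` are timestamps in seconds, for example from
    time.perf_counter(); their difference is sent as generation_time
    (assumed here to be in seconds).
    """
    return {
        "model": model,
        "prompt_messages": prompt_messages,
        "completion_message": {"role": "assistant", "content": completion_text},
        "cost": cost,
        "generation_time": finished - started,
        "customer_params": {"customer_identifier": customer_identifier},
    }
```

Take one timestamp before the direct Nextbit call and one after it, then pass the returned dict as the json argument of the requests.post call.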

See the logging guide for the full setup.