Provider: Replicate
Provider: Replicate
Use Respan Gateway to call Replicate-hosted models (meta/llama-3-70b-instruct, meta/llama-3-8b-instruct, mistralai/mixtral-8x7b-instruct-v0.1, and the rest) while keeping unified observability (logs, cost, latency, reliability) in Respan.
Quick setup
Add credits (recommended)
Top up credits to pay through Respan. No Replicate token required, Respan handles provider auth and billing.
Prefer to route through your own Replicate account? See Use your own Replicate key.
Send your first request
Pick the integration that matches your stack. The base URL is https://api.respan.ai/api and the only key needed is your RESPAN_API_KEY.
Replicate SDK
OpenAI SDK
Respan API
The Replicate SDK uses its own non-OpenAI protocol, so the cleanest way to log Replicate calls through Respan is the OpenAI-compatible gateway shown below. If you want to keep using the native Replicate client, see Log without proxying to forward usage to Respan asynchronously.
More integrations
Replicate-hosted chat models work with every Respan gateway integration:
Switch models
Change the model parameter to call any supported model through the same client. Use the replicate/ prefix to disambiguate when routing across providers. Browse the full list on the Models page.
Use your own Replicate key (BYOK)
Credits are the default path. If you’d rather bill Replicate directly, attach your own provider key.
Global (UI)
Per-request (Code)
Override credentials per model (Optional)
Use credential_override when one model on a request should use a different Replicate key than the default.
Log without proxying (Optional)
Already calling Replicate directly? Send logs to Respan asynchronously to track cost, latency, and performance for those external calls.
See the logging guide for the full setup.