LlamaIndex (gateway)

Route LlamaIndex LLM and embedding calls through the Respan gateway to access 250+ models from multiple providers. You need only your Respan API key; no separate provider key is required once the provider is configured in Respan.

Setup

1

Install packages

$ pip install llama-index llama-index-llms-openai llama-index-embeddings-openai

2

Set environment variables

$ export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"
$ export RESPAN_BASE_URL="https://api.respan.ai/api"

3

Point LlamaIndex to the Respan gateway

import os

from llama_index.core import Document, Settings, SummaryIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

# Read the gateway credentials set in the previous step.
respan_api_key = os.environ["RESPAN_API_KEY"]
respan_base_url = os.getenv("RESPAN_BASE_URL", "https://api.respan.ai/api")

# Send LLM calls to the Respan gateway instead of the provider directly.
Settings.llm = OpenAI(
    api_key=respan_api_key,
    api_base=respan_base_url,
    model="gpt-5-mini",
)
# Send embedding calls through the same gateway endpoint.
Settings.embed_model = OpenAIEmbedding(
    api_key=respan_api_key,
    api_base=respan_base_url,
    model="text-embedding-3-small",
)

# Build a small index and run one query to confirm the setup works.
index = SummaryIndex.from_documents([
    Document(text="The Respan gateway routes LlamaIndex calls to hosted models.")
])
response = index.as_query_engine().query("What does the gateway do?")
print(response)
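
The example above passes the same api_key and api_base to both adapters. As a minimal sketch, those two values could be read and checked in one place so that a missing key fails early with a clear message. The helper name gateway_kwargs and the DEFAULT_BASE_URL constant are hypothetical and not part of LlamaIndex or Respan; the sketch uses only the Python standard library.

```python
import os

# Same default as the os.getenv fallback in the example above.
DEFAULT_BASE_URL = "https://api.respan.ai/api"


def gateway_kwargs(env=None):
    """Return the api_key/api_base keyword arguments shared by both adapters.

    Hypothetical helper for illustration; not a LlamaIndex or Respan API.
    """
    env = os.environ if env is None else env
    api_key = (env.get("RESPAN_API_KEY") or "").strip()
    if not api_key:
        raise RuntimeError("RESPAN_API_KEY is not set; export it before running.")
    # Fall back to the default URL when the variable is unset or empty.
    api_base = env.get("RESPAN_BASE_URL") or DEFAULT_BASE_URL
    return {"api_key": api_key, "api_base": api_base}


# Possible usage with the adapters from the example above:
#   Settings.llm = OpenAI(model="gpt-5-mini", **gateway_kwargs())
#   Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small", **gateway_kwargs())
```

Passing a plain dictionary as env makes the helper easy to exercise without touching the real environment.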

Switch models

Change the model parameter to use another OpenAI model through the same gateway-backed endpoint.

Settings.llm = OpenAI(
    api_key=respan_api_key,
    api_base=respan_base_url,
    model="gpt-5.5",
)
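
If the model changes often, one option is to keep the model IDs in a single lookup table and build the adapter arguments from it. The sketch below is illustrative only: the alias names "fast" and "strong" and the helper llm_kwargs are hypothetical, and the two model IDs are simply the ones already used on this page.

```python
# Hypothetical aliases mapped to the two model IDs shown on this page.
MODEL_ALIASES = {"fast": "gpt-5-mini", "strong": "gpt-5.5"}


def llm_kwargs(alias, api_key, api_base):
    """Build keyword arguments for the OpenAI adapter from a model alias.

    Hypothetical helper for illustration; not a LlamaIndex or Respan API.
    """
    if alias not in MODEL_ALIASES:
        known = ", ".join(sorted(MODEL_ALIASES))
        raise ValueError(f"unknown alias {alias!r}; expected one of: {known}")
    return {"api_key": api_key, "api_base": api_base, "model": MODEL_ALIASES[alias]}


# Possible usage with the variables defined in the setup example:
#   Settings.llm = OpenAI(**llm_kwargs("strong", respan_api_key, respan_base_url))
```

Rejecting unknown aliases up front keeps a typo from reaching the gateway as an invalid model name.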

The OpenAI class used here is LlamaIndex’s adapter for OpenAI-style endpoints. Because the adapter is OpenAI-named, this page shows only OpenAI models with it; for provider-neutral Claude and Gemini examples, see the Respan API or OpenAI SDK gateway pages.

See Respan params & metadata for the full list of Respan parameters and metadata fields.