Provider: Google Vertex AI

This page is for Respan LLM Gateway users.

Use Respan Gateway to call Google Vertex AI models (such as gemini-3.5-flash, gemini-3-pro, and Anthropic Claude on Vertex) while keeping unified observability (logs, cost, latency, reliability) in Respan.

Quick setup

Get a Respan API key

Add credits (recommended)

Top up credits to pay through Respan. No Google Cloud project is required; Respan handles provider auth and billing.

Prefer to route through your own Google Cloud project? See Use your own Google Vertex AI key.

Send your first request

Pick the integration that matches your stack. The base URL is https://api.respan.ai/api and the only key needed is your RESPAN_API_KEY.

Vercel AI SDK

Use @ai-sdk/openai and point createOpenAI at the Respan gateway. Pass any Vertex model with the vertex_ai/ prefix.

TypeScript

import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

const respan = createOpenAI({
  apiKey: process.env.RESPAN_API_KEY!,
  baseURL: "https://api.respan.ai/api",
});

const result = await generateText({
  model: respan("vertex_ai/gemini-3.5-flash"),
  prompt: "Hello, Gemini on Vertex!",
});
console.log(result.text);

More integrations

Vertex AI models work with every Respan gateway integration.

Switch models

Change the model parameter to call any supported model through the same client. Use the vertex_ai/ prefix to disambiguate when routing across providers. Browse the full list on the Models page.

client.chat.completions.create(model="vertex_ai/gemini-3.5-flash", messages=messages)
client.chat.completions.create(model="vertex_ai/gemini-3-pro", messages=messages)
client.chat.completions.create(model="vertex_ai/claude-sonnet-4-5@20250929", messages=messages)
client.chat.completions.create(model="openai/gpt-5.5", messages=messages)
client.chat.completions.create(model="anthropic/claude-sonnet-4-5-20250929", messages=messages)
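
The calls above assume an existing client and a messages list. As a minimal sketch of the same idea without any SDK, the helper below builds one OpenAI-style chat request per model using only the Python standard library. The /chat/completions path appended to the base URL, and the build_chat_request and send names, are assumptions made for illustration; check the gateway's API reference for the exact endpoint.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.respan.ai/api"  # base URL given on this page
CHAT_PATH = "/chat/completions"  # assumption: OpenAI-compatible route


def build_chat_request(model: str, messages: list, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for one model."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + CHAT_PATH,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (makes a network call)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


messages = [{"role": "user", "content": "Hello, Gemini on Vertex!"}]
api_key = os.environ.get("RESPAN_API_KEY", "YOUR_RESPAN_API_KEY")

# Same request-building code for every provider: only the model string changes.
requests_by_model = {
    model: build_chat_request(model, messages, api_key)
    for model in ("vertex_ai/gemini-3.5-flash", "openai/gpt-5.5")
}
```

Nothing is sent until send(...) is called on one of the prepared requests, so the model strings and payloads can be inspected first.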

Use your own Google Vertex AI key (BYOK)

Credits are the default path. If you’d rather bill Google Cloud directly, attach your own service-account credentials.

Global (UI)

Per-request (Code)

Google credentials can be tricky. Follow this walkthrough if you need help finding the required fields:

Open Providers

Go to the Providers page.

Add Google Vertex AI

Select Google Vertex AI and fill in the required credential fields:

vertex_ai_project (your Google Cloud project ID)
vertex_ai_location (the Vertex AI region to use)
vertex_ai_credentials (your Google service-account or application-default credential JSON object)
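
To make the three fields concrete, here is a small sketch that assembles them from a credential file on disk. The load_vertex_credentials helper and the file path are hypothetical. The check on type, project_id, private_key, and client_email reflects the usual layout of a Google service-account key file; it is a local sanity check, not something the gateway is known to enforce.

```python
import json
from pathlib import Path

# Keys normally present in a Google service-account key file.
SERVICE_ACCOUNT_KEYS = {"type", "project_id", "private_key", "client_email"}


def load_vertex_credentials(key_path: str, project: str, location: str) -> dict:
    """Assemble the three Vertex AI credential fields from a JSON key file."""
    credentials = json.loads(Path(key_path).read_text(encoding="utf-8"))
    # Only service-account files are checked; other credential types pass through.
    if credentials.get("type") == "service_account":
        missing = SERVICE_ACCOUNT_KEYS - credentials.keys()
        if missing:
            raise ValueError(f"key file is missing: {sorted(missing)}")
    return {
        "vertex_ai_project": project,
        "vertex_ai_location": location,
        "vertex_ai_credentials": credentials,  # the full JSON object
    }
```

A call such as load_vertex_credentials("key.json", "your-project", "us-central1") returns a dictionary whose keys match the field names listed above.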

Load balancing (Optional)

Add multiple credential sets and use Load balancing weight to distribute traffic across them.
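
This page does not describe how the gateway applies the weights. One common reading is that each credential set receives traffic in proportion to its weight, and the simulation below illustrates that reading only. The credential-set names and the pick_credential_set helper are hypothetical.

```python
import random
from collections import Counter

# Hypothetical credential sets and their load-balancing weights.
credential_sets = {"project-a": 3, "project-b": 1}


def pick_credential_set(weights: dict, rng: random.Random) -> str:
    """Pick one credential set with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


rng = random.Random(0)  # seeded so the simulation is repeatable
counts = Counter(pick_credential_set(credential_sets, rng) for _ in range(10_000))
share_a = counts["project-a"] / sum(counts.values())
print(f"project-a share: {share_a:.2f}")  # roughly 0.75 for weights 3:1
```

Under this interpretation, weights of 3 and 1 send about three quarters of requests through the first credential set.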

Override credentials per model (Optional)

Use credential_override when one model on a request should use a different Vertex project than the default.

{
  "customer_credentials": {
    "google_vertex_ai": {
      "vertex_ai_project": "your-project",
      "vertex_ai_location": "us-central1",
      "vertex_ai_credentials": {
        "type": "service_account",
        "project_id": "your-project"
      }
    }
  },
  "credential_override": {
    "vertex_ai/gemini-3.5-flash": {
      "vertex_ai_project": "ANOTHER_VERTEX_AI_PROJECT",
      "vertex_ai_location": "ANOTHER_VERTEX_AI_LOCATION",
      "vertex_ai_credentials": {
        "type": "service_account",
        "project_id": "another-project"
      }
    }
  }
}
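
As a sketch of how the JSON above might be produced in code, the helper below attaches a default credential set and optional per-model overrides to a request body. It assumes that customer_credentials and credential_override sit at the top level of the body next to model and messages, which this page does not state explicitly. The with_vertex_credentials name is hypothetical, and the credential values are the same placeholders used above.

```python
import json
from typing import Optional

# Placeholder credential sets; real ones carry the full key JSON.
default_creds = {
    "vertex_ai_project": "your-project",
    "vertex_ai_location": "us-central1",
    "vertex_ai_credentials": {"type": "service_account", "project_id": "your-project"},
}
other_creds = {
    "vertex_ai_project": "ANOTHER_VERTEX_AI_PROJECT",
    "vertex_ai_location": "ANOTHER_VERTEX_AI_LOCATION",
    "vertex_ai_credentials": {"type": "service_account", "project_id": "another-project"},
}


def with_vertex_credentials(body: dict, default: dict, overrides: Optional[dict] = None) -> dict:
    """Return a copy of the request body with credential fields attached."""
    merged = dict(body)  # shallow copy so the caller's dict is not modified
    merged["customer_credentials"] = {"google_vertex_ai": default}
    if overrides:
        # Keys are model names; values are full credential sets for that model.
        merged["credential_override"] = dict(overrides)
    return merged


body = with_vertex_credentials(
    {
        "model": "vertex_ai/gemini-3.5-flash",
        "messages": [{"role": "user", "content": "Hello, Gemini on Vertex!"}],
    },
    default_creds,
    overrides={"vertex_ai/gemini-3.5-flash": other_creds},
)
print(json.dumps(body, indent=2))
```

Keeping the override keyed by model name means requests for any other model fall back to the default credential set.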

Log without proxying (Optional)

Already calling Vertex AI directly? Send logs to Respan asynchronously to track cost, latency, and performance for those external calls.

import requests

requests.post(
    "https://api.respan.ai/api/request-logs/create/",
    headers={
        "Authorization": "Bearer YOUR_RESPAN_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "vertex_ai/gemini-3.5-flash",
        "prompt_messages": [{"role": "user", "content": "Hello, how are you?"}],
        "completion_message": {"role": "assistant", "content": "Hello from Vertex AI through Respan."},
        "cost": 0.001,
        "generation_time": 1.2,
        "customer_params": {"customer_identifier": "user_123"},
    },
)
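
The payload above uses fixed values. A minimal sketch of filling it from a real call is shown below: it times an arbitrary function that talks to Vertex AI and builds the log body from the result. It assumes generation_time is measured in seconds, which the example value suggests but this page does not confirm. The timed_log_payload helper and the fake_vertex_call stand-in are hypothetical, and cost is left out because computing it depends on your pricing data.

```python
import time
from typing import Callable


def timed_log_payload(
    model: str,
    prompt_messages: list,
    call: Callable[[list], str],
    customer_identifier: str,
) -> dict:
    """Run `call`, time it, and build a log payload from the result."""
    started = time.perf_counter()
    completion_text = call(prompt_messages)
    elapsed = time.perf_counter() - started
    return {
        "model": model,
        "prompt_messages": prompt_messages,
        "completion_message": {"role": "assistant", "content": completion_text},
        "generation_time": round(elapsed, 3),  # assumption: seconds
        "customer_params": {"customer_identifier": customer_identifier},
    }


def fake_vertex_call(messages: list) -> str:
    """Stand-in for a direct Vertex AI call; replace with your own client."""
    return "Hello from Vertex AI."


payload = timed_log_payload(
    "vertex_ai/gemini-3.5-flash",
    [{"role": "user", "content": "Hello, how are you?"}],
    fake_vertex_call,
    "user_123",
)
```

The resulting payload can be passed as the json argument of the requests.post call shown above.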

See the logging guide for the full setup.