If you've been shipping LLM features in production for more than six months, you need a gateway. Provider outages happen. Cost guardrails matter. Switching models without redeploying app code is the difference between a 30-minute decision and a 30-day project. This is our honest list of the LLM gateways that can do that work in 2026, including our own.
A note on bias: we ship Respan, so we'd rank ourselves favorably. We've tried hard to be specific about what each tool is good at, including our own weaknesses.
Quick comparison
| Gateway | Best for | Self-host | Free tier | Pricing |
|---|---|---|---|---|
| Respan | Gateway + observability + evals + prompts in one | Enterprise | Yes | $$ |
| OpenRouter | Widest model catalog, simplest integration | No | Yes | $$ |
| LiteLLM | Open-source self-host, broad model support | Yes (OSS) | Yes | $ |
| Portkey | Managed gateway with strong governance | Enterprise only | Yes | $$$ |
| Cloudflare AI Gateway | Edge-routed, low-latency | No | Yes | $ |
| Helicone | Lightweight proxy with cost analytics | Yes (OSS) | Yes | $ |
| Bifrost | Lightweight self-hostable | Yes (OSS) | Yes | Free |
| Vercel AI Gateway | Vercel-native AI applications | No | Yes | $$ |
What to evaluate
Criteria that matter:
- Models supported: count + how fast new models are added
- OpenAI-compatible drop-in: most teams already have OpenAI-format code
- Provider fallback: automatic failover between providers (e.g., Anthropic → Bedrock)
- Caching: exact-match and/or semantic
- Rate limiting / budgets: per user, per feature, per dollar
- Cost guardrails and analytics: alerting on cost spikes, attribution by feature
- Observability integration: traces, eval scores attached
- Self-host: data residency requirements
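Several of these criteria, provider fallback especially, come down to a small control-flow pattern rather than anything exotic. Here is a minimal, stdlib-only sketch of ordered fallback; the provider callables are placeholders standing in for real SDK calls (a gateway does this for you server-side, with provider-specific error classification):

```python
from typing import Callable, Sequence


def call_with_fallback(providers: Sequence[Callable[[str], str]], prompt: str) -> str:
    """Try each provider in order; return the first successful response.

    `providers` are placeholder callables standing in for real provider SDK
    calls (e.g. Anthropic direct, then Bedrock). Production code would catch
    provider-specific error types and add timeouts, not bare Exception.
    """
    errors: list[Exception] = []
    for call in providers:
        try:
            return call(prompt)
        except Exception as exc:
            errors.append(exc)
    raise RuntimeError(f"all {len(providers)} providers failed: {errors}")


# Demo with stand-in providers: the primary fails, the fallback answers.
def primary(prompt: str) -> str:
    raise TimeoutError("primary provider: 529 overloaded")


def bedrock_fallback(prompt: str) -> str:
    return f"echo: {prompt}"


print(call_with_fallback([primary, bedrock_fallback], "ping"))  # echo: ping
```

The point of putting this behind a gateway instead of in app code is that the provider list becomes configuration, not a redeploy.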
1. Respan
Best for: Teams that want gateway + observability + evals + prompts in one platform.
The story: Most gateways listed below do gateway well. The structural question is whether your stack ends up with a gateway + a separate observability tool + a separate eval tool + a separate prompt management tool — four products, four invoices, four integrations. Respan is one platform that owns all four primitives.
Pros:
- 500+ models routable through unified OpenAI-compatible API
- Provider fallback configured per request
- Exact-match + semantic caching with TTL config
- Per-user / per-feature budgets and rate limits
- Full observability + evals + prompt management built in
- ~10ms added P95 latency overhead (measured)
Cons:
- Smaller community than OpenRouter on the gateway dimension specifically
- Less battle-tested at the "10-year incumbent" scale of Cloudflare
- Self-host on Enterprise only
Pricing: Free tier with generous limits. Pro and Enterprise tiers.
→ See Respan's gateway in product
2. OpenRouter
Best for: Widest model catalog and the simplest integration.
The story: OpenRouter's distinctive value is breadth: it supports more models than any other gateway, including obscure and experimental ones. The integration is dead simple: change your base URL and you have access to 300+ models.
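The base-URL swap looks like this in practice. A stdlib-only sketch that builds (but deliberately does not send) an OpenAI-format chat request against OpenRouter's endpoint; the model slug is illustrative:

```python
import json
import urllib.request

# The only change from calling OpenAI directly is this base URL.
OPENROUTER_BASE = "https://openrouter.ai/api/v1"


def build_chat_request(model: str, user_message: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-format chat completion request pointed at OpenRouter.

    Sending it (urllib.request.urlopen) is left out so the sketch stays
    offline; in real code you'd use the OpenAI SDK with base_url swapped.
    """
    body = json.dumps({
        "model": model,  # OpenRouter uses provider-prefixed slugs
        "messages": [{"role": "user", "content": user_message}],
    }).encode()
    return urllib.request.Request(
        f"{OPENROUTER_BASE}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("anthropic/claude-3.5-sonnet", "Hello", "sk-or-...")
print(req.full_url)  # https://openrouter.ai/api/v1/chat/completions
```

Because the request shape is the OpenAI format, the same function body works against any OpenAI-compatible gateway by changing the constant.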
Pros:
- Largest model catalog in the gateway market
- Simple OpenAI-compatible drop-in
- Provider fallback supported
- Strong community
Cons:
- No bundled observability or evals
- No semantic caching
- No prompt management
- No self-host
Pricing: Pay-per-use plus a small markup. Free tier exists.
3. LiteLLM
Best for: Open-source self-host with broad model support.
The story: LiteLLM is the OSS LLM gateway that supports 100+ models behind a single OpenAI-compatible API. Self-hostable, configurable, popular in environments where a cloud gateway isn't an option.
Pros:
- Open source, MIT-licensed, self-hostable
- Broad model support (100+)
- Active community, fast pace of new model integration
- Free if self-hosted
Cons:
- Self-hosting is real work
- No built-in observability beyond basic logging
- No managed cloud option without third-party hosting
Pricing: Open source free. Cloud-managed offerings via partners.
4. Portkey
Best for: Managed gateway with strong governance features.
The story: Portkey is the gateway pitched at enterprises with strict governance requirements. Audit logs, role-based access control, request signing, advanced policy enforcement.
Pros:
- Enterprise governance features (audit, RBAC, SSO)
- 250+ models supported
- Provider fallback, caching, cost guardrails
- Observability built in
Cons:
- Self-host on Enterprise only
- Pricing is opaque at the upper tiers
- Less developer-friendly than OpenRouter at the entry tier
Pricing: Tiered, with Enterprise pricing custom.
5. Cloudflare AI Gateway
Best for: Edge-routed AI applications with low latency.
The story: Cloudflare's gateway runs on their edge network, so requests are routed close to the user. Tight integration with the broader Cloudflare stack (Workers, R2, D1) makes it the obvious choice if you're already on Cloudflare.
Pros:
- Edge-routed for low latency
- Tight Cloudflare integration
- Caching is mature
- Generous free tier
Cons:
- Smaller model catalog than OpenRouter or Respan (50+ models)
- Most useful inside the Cloudflare ecosystem
- Less depth on observability than dedicated tools
- No self-host (Cloudflare-managed only)
Pricing: Pay-per-request with generous free tier.
6. Helicone
Best for: Teams that want a lightweight gateway with cost analytics and a proxy-style install.
The story: Helicone started as an observability tool and added gateway capabilities. The proxy mode is the easiest install of any gateway on this list (one base URL change).
Pros:
- Easiest install — proxy mode requires no SDK changes
- Strong cost analytics
- Open source self-host available
- Good free tier
Cons:
- Less depth on agent tracing (proxy can't see agent state)
- Smaller model catalog than OpenRouter or LiteLLM
- Prompt management is basic
Pricing: Generous free tier. Pro and Enterprise tiers are reasonably priced.
7. Bifrost
Best for: Teams wanting a lightweight self-hostable open-source gateway.
The story: Bifrost is a newer open-source gateway focused on simplicity and self-hostability. Smaller in scope than LiteLLM, easier to deploy, less feature surface.
Pros:
- Open source
- Lightweight, easy to deploy
- Active development
Cons:
- Smaller community
- Less feature-rich than LiteLLM or Respan
- Newer / less battle-tested
Pricing: Free open source.
8. Vercel AI Gateway
Best for: Teams already deeply on Vercel building AI-native applications.
The story: Vercel's gateway is designed for AI applications running on Vercel's platform. Tight integration with the Vercel AI SDK and Vercel's broader infra.
Pros:
- First-party for Vercel-deployed apps
- Tight Vercel AI SDK integration
- Edge-routed via Vercel network
Cons:
- Best inside the Vercel ecosystem; awkward outside
- Smaller model catalog than OpenRouter or Respan
- Newer product
Pricing: Tiered, bundled into Vercel's broader platform pricing.
How to choose
Quick decision framework:
- Want gateway + observability + evals + prompts in one? → Respan
- Need maximum model variety, OpenAI-compatible drop-in? → OpenRouter
- Need open-source self-host? → LiteLLM (most mature) or Bifrost (lighter)
- Need enterprise governance? → Portkey
- Already on Cloudflare? → Cloudflare AI Gateway
- Want fastest install with proxy? → Helicone
- Already on Vercel? → Vercel AI Gateway
Common mistakes
- Skipping a gateway "for now" — you'll need one within 6 months and migrating production traffic later is painful.
- No fallback configured — half a gateway. Configure provider fallback on day one.
- Semantic cache enabled by default — wrong cache hit ships stale answers. Start with exact-match, validate semantic before enabling.
- No per-feature cost guardrails — the first runaway agent drains your monthly budget in 2 hours.
- Treating the gateway as the place for routing logic — keep model-routing rules in your app, not buried in gateway config.
FAQ
Why do I need a gateway? Provider outages happen, cost guardrails matter, and switching models without app code changes is the difference between a 30-minute decision and a 30-day project. See our LLM Gateway pillar.
Can I just call providers directly? For toy projects, yes. For production, no — you'll lose to the first 25-minute Anthropic outage that takes your customer support agent offline.
Does a gateway add latency? A well-designed gateway adds 5-15ms P95 overhead. With caching enabled, the gateway often reduces median latency because cache hits return in single-digit milliseconds.
Which has the largest model catalog? Going by published counts: Respan routes to 500+ models, OpenRouter lists 300+, and Portkey supports 250+. OpenRouter is the go-to for obscure and experimental models.
Which has the best free tier? Cloudflare AI Gateway and Respan both have generous free tiers. Helicone is competitive.
Should I self-host or use cloud? Cloud for most teams (less ops burden). Self-host if you have data residency or compliance requirements that block cloud.
Can I switch gateways later? Yes — if the gateway is OpenAI-compatible (most are), you change a base URL. Lock-in risk is highest with proprietary SDKs and lowest with OpenAI-compatible interfaces.