RubyLLM (gateway)

RubyLLM does not have a Ruby-side tracing instrumentor. Route all calls through the Respan gateway to capture every request as a trace.

Setup

1. Install RubyLLM

$ gem install ruby_llm

Or add it to your Gemfile.

gem "ruby_llm"

2. Set environment variables

$ export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"

No provider key is needed: the Respan gateway handles provider authentication.

3. Configure RubyLLM with Respan

RubyLLM.configure do |config|
  config.openai_api_key = ENV["RESPAN_API_KEY"]
  config.openai_api_base = "https://api.respan.ai/api"
end

4. Make your first request

chat = RubyLLM.chat(model: "gpt-4.1-nano")
response = chat.ask("Hello, world!")
puts response.content

5. View your trace

Open the Traces page to see your gateway-routed calls with prompts, tokens, and cost.
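
If RESPAN_API_KEY is unset, the configuration in step 3 passes nil as the key, and the problem would likely only surface on the first request. A minimal sketch of a fail-fast lookup follows. The helper name fetch_respan_api_key is hypothetical (it is not part of RubyLLM or Respan), and the configure call only runs when the ruby_llm gem has already been loaded.

```ruby
# Hypothetical helper (not part of RubyLLM or Respan): read the gateway key
# from an environment-like object and fail early with a readable message.
def fetch_respan_api_key(env = ENV)
  key = env["RESPAN_API_KEY"].to_s.strip
  raise KeyError, "RESPAN_API_KEY is not set; export it before configuring RubyLLM" if key.empty?

  key
end

# Only configure when the ruby_llm gem has been loaded (require "ruby_llm").
if defined?(RubyLLM)
  RubyLLM.configure do |config|
    config.openai_api_key = fetch_respan_api_key
    config.openai_api_base = "https://api.respan.ai/api"
  end
end
```

Passing the environment as an argument keeps the helper easy to exercise with a plain Hash in tests.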

Switch models

OpenAI models work without extra options. For non-OpenAI models (Claude, Gemini, etc.), pass provider: :openai and assume_model_exists: true so the request is routed through the Respan gateway.

# OpenAI model: no extra options needed
chat = RubyLLM.chat(model: "gpt-4.1-nano")
# Non-OpenAI models: send over the OpenAI protocol and skip the registry check
chat = RubyLLM.chat(model: "claude-3-5-haiku-20241022", provider: :openai, assume_model_exists: true)
chat = RubyLLM.chat(model: "gemini-2.0-flash", provider: :openai, assume_model_exists: true)

response = chat.ask("Tell me about artificial intelligence")
puts response.content

provider: :openai doesn't mean the model is from OpenAI. It tells RubyLLM to send the request using the OpenAI API protocol, which is what the Respan gateway endpoint configured above expects. Without it, RubyLLM would call the provider directly, bypassing Respan. assume_model_exists: true skips RubyLLM's local model registry check.

See the full model list.
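
The option rules described above can be wrapped in a small helper so call sites don't repeat them. This is only a sketch: gateway_chat_options is a hypothetical name, and treating model IDs that start with "gpt-" as OpenAI models is an assumption made for illustration, so adjust the prefix list to the models you actually use.

```ruby
# Hypothetical helper (not part of RubyLLM): builds the keyword arguments for
# RubyLLM.chat so that every model is routed through the gateway.
# Assumption: IDs matching one of `openai_prefixes` are OpenAI models.
def gateway_chat_options(model, openai_prefixes: %w[gpt-])
  options = { model: model }
  return options if openai_prefixes.any? { |prefix| model.start_with?(prefix) }

  # Non-OpenAI model: use the OpenAI protocol and skip the registry check.
  options.merge(provider: :openai, assume_model_exists: true)
end

# With the ruby_llm gem loaded and configured as in Setup:
#   chat = RubyLLM.chat(**gateway_chat_options("claude-3-5-haiku-20241022"))
```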

Streaming

chat = RubyLLM.chat(model: "gpt-4.1-nano")
chat.ask("Explain quantum computing") do |chunk|
  print chunk.content
end
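
If you also need the full text once streaming finishes, one option is to accumulate chunks as they arrive. The sketch below assumes only that each chunk responds to content, as in the example above. streaming_collector is a hypothetical helper, and the Struct is a stand-in for real chunks so the snippet runs without the gem.

```ruby
require "stringio"

# Hypothetical helper (not a RubyLLM API): returns a proc that prints each
# chunk's text to `io` and appends it to `buffer`.
def streaming_collector(buffer, io: $stdout)
  proc do |chunk|
    text = chunk.content.to_s # a chunk without text contributes nothing
    io.print(text)
    buffer << text
  end
end

# Stand-in for the chunk objects yielded by chat.ask; only #content is used.
fake_chunk = Struct.new(:content)

full_text = +""
handler = streaming_collector(full_text, io: StringIO.new)
[fake_chunk.new("Quantum "), fake_chunk.new(nil), fake_chunk.new("computing")].each(&handler)

# With the ruby_llm gem loaded:
#   chat.ask("Explain quantum computing", &streaming_collector(full_text))
```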

Multi-tenancy with contexts

Use RubyLLM contexts to isolate per-tenant configuration.

tenant_ctx = RubyLLM.context do |config|
  config.openai_api_key = tenant.respan_api_key
  config.openai_api_base = "https://api.respan.ai/api"
end

chat = tenant_ctx.chat(model: "gpt-4.1-nano")
response = chat.ask("Hello!")
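
When many requests arrive for the same tenant, you may prefer to build each context once and reuse it. The following is a rough sketch, not a RubyLLM feature: TenantContextCache is a hypothetical class, and it assumes each tenant object responds to id in addition to the respan_api_key used above.

```ruby
# Hypothetical helper (not part of RubyLLM): builds one context per tenant on
# first use and reuses it afterwards. The block decides how a context is built.
class TenantContextCache
  def initialize(&builder)
    @builder = builder
    @contexts = {}
    @mutex = Mutex.new
  end

  # Returns the cached context for the tenant, building it if needed.
  # Assumes the tenant responds to #id.
  def fetch(tenant)
    @mutex.synchronize { @contexts[tenant.id] ||= @builder.call(tenant) }
  end
end

# With the ruby_llm gem loaded, the builder could wrap RubyLLM.context:
#   contexts = TenantContextCache.new do |tenant|
#     RubyLLM.context do |config|
#       config.openai_api_key = tenant.respan_api_key
#       config.openai_api_base = "https://api.respan.ai/api"
#     end
#   end
#   chat = contexts.fetch(tenant).chat(model: "gpt-4.1-nano")
```

A cached context keeps the key it was built with, so a rotated tenant key would need the cache entry to be rebuilt.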

Rails integration

Set your Respan config in an initializer.

# config/initializers/ruby_llm.rb
RubyLLM.configure do |config|
  config.openai_api_key = ENV["RESPAN_API_KEY"]
  config.openai_api_base = "https://api.respan.ai/api"
end

Use acts_as_chat as normal; all LLM calls are routed through Respan.