Version, deploy, and A/B test prompts without touching code - so your team can iterate on AI behavior at the speed of a product change, not an engineering deployment.
No deploy
required to update a prompt
Full history
with author and timestamp
<5ms
runtime fetch with caching
A/B test
any two prompt variants
Most teams start with prompts as strings in source files. It works until it doesn't - and then it breaks in ways that are hard to diagnose and even harder to fix quickly.
✗ Prompts are buried in code with no change history
When a prompt is a string in a Python file, there's no record of what it was last week, who changed it, or why. Auditing what actually ran in production is impossible.
✗ Every prompt tweak requires a code deployment
Iterating on a system prompt means opening a PR, waiting for CI, and deploying - just to test a one-sentence change. The feedback loop is days long when it should be minutes.
✗ Rolling back a bad prompt means rolling back code
If a prompt change degrades quality in production, reverting it means reverting the entire deployment - potentially undoing unrelated changes that were safe to ship.
✗ There's no way to A/B test prompt variants in production
Without traffic splitting at the prompt layer, you can't run a controlled experiment. You ship one version at a time and compare results across different time periods - not a fair test.
✗ Multiple people editing prompts leads to conflicts and lost work
Prompts in code repos get overwritten during merges. There's no awareness of who's editing what, no conflict detection, and no review process before a change goes live.
A prompt registry that your whole team can use - with the versioning, deployment controls, and runtime SDK your engineering team expects.
Prompts are stored in Respan as named, versioned objects. Your application fetches the active version at runtime via the SDK - a cached call under 5ms. When you publish a new version in the dashboard, the cache TTL expires and the new version is served automatically. No code change, no deployment, no downtime.
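The cached runtime fetch can be sketched roughly like this. This is a minimal illustration of TTL-based caching, not the actual SDK internals; `TTLPromptCache` and `fetch_fn` are hypothetical names introduced for the sketch:

```python
import time

class TTLPromptCache:
    """Minimal sketch of a TTL cache for prompt versions.

    Not the real Respan SDK internals - just an illustration of how a
    sub-5ms cached read with a TTL could behave: reads within the TTL
    are served from memory; the first read after expiry refetches.
    """

    def __init__(self, fetch_fn, ttl_seconds=60):
        self.fetch_fn = fetch_fn        # network call to the registry
        self.ttl = ttl_seconds
        self._cache = {}                # name -> (value, fetched_at)

    def get(self, name):
        entry = self._cache.get(name)
        now = time.monotonic()
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]             # fresh: serve from memory
        value = self.fetch_fn(name)     # stale or missing: refetch
        self._cache[name] = (value, now)
        return value

# Publishing a new version is picked up on the next cache miss:
versions = {"customer-support-system": "v1"}
cache = TTLPromptCache(lambda name: versions[name], ttl_seconds=60)
cache.get("customer-support-system")          # fetches and caches "v1"
versions["customer-support-system"] = "v2"    # publish in the dashboard
cache.get("customer-support-system")          # still "v1" until the TTL expires
```

The key property is that the application never blocks on a publish: it keeps serving the cached version until the TTL lapses, then picks up the new one on the next fetch.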
Store the prompt
Create a prompt in the Respan dashboard. Give it a name, define variables, and publish the first version.
Pull at runtime
Replace the hardcoded string in your code with a one-line SDK call. The active version is fetched and cached automatically.
Edit in the UI
Update the prompt in the dashboard, preview it with test values in the playground, and publish - no code change required.
A/B test or roll back
Split traffic between versions to compare quality, or roll back to any previous version with one click if something breaks.
import openai
from respan import RespanClient
respan = RespanClient(api_key="YOUR_RESPAN_KEY")
openai_client = openai.OpenAI(api_key="sk-...")
# Before: hardcoded prompt in your code
# system_prompt = "You are a helpful customer support agent..."
# After: pull the active version from Respan at runtime
prompt = respan.prompts.get("customer-support-system")
# Variables are filled in at call time
rendered = prompt.render(
product_name="Acme Pro",
tone="friendly"
)
# user_message comes from your application (e.g. the end user's request)
response = openai_client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "system", "content": rendered},
{"role": "user", "content": user_message},
]
)
# The prompt is cached locally with a 60s TTL.
# When you publish a new version in the dashboard,
# the next cache miss fetches it automatically.
If your product team wants to improve the AI's tone without waiting for engineering: they edit the prompt in the dashboard and publish - no PR, no deploy, live in seconds.
If a system prompt is causing bad outputs in production: roll back to the previous version immediately, stabilize, then investigate and fix without pressure.
If you're testing whether a more structured system prompt improves response quality: split 20% of traffic to the new version, collect logs, run evals, and decide with data.
If you run dev, staging, and production environments: deploy different prompt versions to each environment independently - test in staging before promoting to production.
If your organization needs to audit AI behavior: every prompt version has an immutable record of who authored it, when it was deployed, and what traffic it served.
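One common way to implement the kind of traffic split described above - not necessarily how Respan does it internally - is a deterministic hash on a stable user ID, so each user consistently sees the same variant for the life of the experiment. `pick_variant` and the version names here are illustrative, not part of any SDK:

```python
import hashlib

def pick_variant(user_id: str, experiment: str, new_share: float = 0.20) -> str:
    """Deterministically assign a user to a prompt variant.

    Hashing (experiment, user_id) yields a stable bucket in [0, 1],
    so the same user always gets the same variant and the split
    holds at roughly `new_share` across the user population.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF   # first 4 bytes -> [0, 1]
    return "v2-structured" if bucket < new_share else "v1-baseline"

# The same user always lands in the same bucket, so their experience
# is consistent across requests:
assert pick_variant("user-123", "support-tone") == pick_variant("user-123", "support-tone")
```

Hashing on the experiment name as well as the user ID means assignments are independent across experiments: a user in the treatment group of one test is not automatically in the treatment group of the next.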
SDK support
Works with any model
Deployment targets
Environment variables solve the deployment coupling problem but nothing else. There's no version history, no diff view, no playground, no A/B testing, no eval linkage, and no approval workflow. You also can't update an env var in production without restarting the process or redeploying. Respan gives you all of the above, plus runtime fetch so changes are live without any restart.