Disclosure up front: I run developer relations at Respan, so this article is written by someone who is not neutral. Langfuse is the strongest open-source LLM observability project in the ecosystem and I use it on side projects. I will try to be specific about what they do well and where Respan loses to them. The honest answer to "which should you pick" is "it depends on your stack and your tolerance for self-hosting," and I will give you a framework for figuring that out.
We see roughly 80 million LLM requests per day flow through Respan from customer workloads. I have also operated a self-hosted Langfuse instance for a side project for about a year. Both are good products built by serious teams. They aim at overlapping but meaningfully different problems.
TL;DR: when to pick each
| Pick Langfuse if... | Pick Respan if... |
|---|---|
| You want a free, MIT-licensed, self-host-first stack | You want one platform for obs + evals + prompts + gateway |
| Your infra/security team requires source-available observability | You want a built-in LLM gateway with provider fallback (Langfuse does not ship one) |
| You are already on the OSS LLM stack (LiteLLM, vLLM, Ollama) and want a project that matches that worldview | You want managed online evals and prompt A/B testing wired into traces by default |
| You only need observability plus light evals, not a full platform | You want 100% trace capture by default without sampling math |
| You have an engineer who likes operating Postgres and ClickHouse | You want to move fast without owning the data plane |
If you want one sentence: Langfuse is the open-source observability backbone you can self-host for free; Respan is the unified managed platform that bundles observability, evals, prompts, and a gateway in one place. The fork in the road is mostly about self-host posture and whether you want a gateway in the same product.
The two companies, briefly
Langfuse was founded in 2022 by Marc Klingen, Clemens Rawert, and Max Deichmann. They are YC W23 alumni, fully MIT-licensed, and have built one of the strongest OSS communities in the LLM observability space. The product is genuinely free to self-host, the codebase is active, and the docs are good. They sell a managed cloud version with paid tiers on top of the open-source core.
Respan was founded in 2023 by Andy Li, Raymond Huang, and Hendrix Liu. YC W24. The company shipped under the name Keywords AI through 2025 and rebranded to Respan in early 2026 to better reflect the product's scope, which is wider than pure observability. The platform combines LLM observability, evals, prompt management, and an LLM gateway in a single product. Self-hosting is offered only on Enterprise.
The cultural difference matters. Langfuse is "Postgres for LLMOps": an open primitive you can run wherever you want, with a community that values openness above all. Respan is "Vercel for LLMOps": a managed platform that trades self-host openness for breadth, integration, and speed of iteration.
Quick comparison
| Dimension | Respan | Langfuse |
|---|---|---|
| Instrumentation | OpenTelemetry-native + SDK + proxy (3 modes) | OpenTelemetry, SDK (Python/JS), and LiteLLM proxy |
| Tracing | 100% capture by default, agent-trace UI | Full trace tree, session view, strong span detail |
| Evals | Online (LLM-judge + rule) + offline, wired into traces | Offline experiments + scores, online via SDK hooks |
| Prompt management | Versioning, A/B testing, rollback, eval-linked | Versioning, caching, playground, experiments |
| Gateway | Built-in: 500+ models, provider fallback, OpenAI-compatible | Not included (delegated to LiteLLM proxy) |
| Self-host | Enterprise tier only | Free, MIT, Docker Compose / Kubernetes |
| Free tier | Yes, generous for most production starts | Hobby: 50k units/month, 30-day retention |
| Paid entry | Pro tier (usage-based) | Core $29/mo, Pro $199/mo |
| Enterprise | Custom; self-host available | $2,499/mo + Teams; self-host always free |
| License | Closed source (managed) | MIT (core) |
| Community size | Smaller than Langfuse | Largest OSS LLM observability community |
Instrumentation: how you get data in
Both products converge on OpenTelemetry as the primary path. This is the right thing to do; OTel is the only standard worth betting on long-term. The differences are about what wraps that primitive.
Langfuse supports three main entry points. The first is direct OTel via the @langfuse/otel exporter or any OpenTelemetry SDK pointing at their endpoint. The second is the Langfuse Python and JS SDKs, which give you decorators and context managers for manual tracing. The third is logging through LiteLLM's proxy, which is the closest thing the OSS stack has to a gateway. If you already run LiteLLM, Langfuse falls into place naturally.
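To make the SDK path concrete, here is a minimal sketch using the Langfuse Python SDK's decorator together with their OpenAI drop-in wrapper (imports follow the v2 SDK; v3 moves observe to the top-level package, and credentials come from the LANGFUSE_* environment variables):

```python
from langfuse.decorators import observe   # v3 SDK: from langfuse import observe
from langfuse.openai import OpenAI        # drop-in wrapper that logs each LLM call

client = OpenAI()

@observe()  # opens a trace for this function; nested calls become child spans
def summarize(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content
```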
Respan supports three modes as well, deliberately mirroring how teams actually instrument. The first is OpenTelemetry-native: point any OTel exporter at Respan and it works. The second is the Respan SDK, a unified entry point that wraps OpenInference, OTel, and provider-specific instrumentation. The third is proxy mode: route requests through Respan's gateway and traces are captured automatically with zero code changes. The proxy path is the fastest "from zero to traces" experience because you just swap the base URL.
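The base-URL swap looks like this (the gateway endpoint below is illustrative, not a documented URL; use the one from Respan's docs):

```python
from openai import OpenAI

# Same OpenAI SDK, same application code; only the base URL and key change.
client = OpenAI(
    base_url="https://api.respan.example/v1",  # hypothetical gateway endpoint
    api_key="YOUR_RESPAN_API_KEY",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
# The request is proxied to the provider; the trace is captured server-side.
```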
The practical difference: if you are LangChain-heavy or running a custom Python agent loop, both are easy. If you want the fastest possible path to traces with no code changes, Respan's gateway-as-instrumentation is hard to beat. If you want vendor neutrality at the instrumentation layer, Langfuse plus OTel is the cleanest story because the receiver is open source.
Tracing UI and data model
Both products show you a trace tree, span detail, token usage, cost, latency, errors, and user/session aggregation. The trace model is the table-stakes layer. Where they diverge is in defaults and in agent-specific features.
Langfuse's trace UI is mature and clean. Sessions, users, traces, observations (their term for spans), scores, datasets. You can pivot from a trace to its scores to its dataset entry without losing context. For pure observability the experience is genuinely good.
Respan's trace UI is built specifically around agent workflows. Multi-step agent runs with tool calls, sub-agent handoffs, retrieval steps, and evals attached at every level. The UI emphasizes the cross-cutting view: a trace shows you the model calls, the retrieval, the tool execution, and the online eval scores all in one place. For straight inference workloads this is overkill. For agent workloads it is what you want.
The biggest defaults difference: Respan captures 100% of traces by default with no sampling math. Langfuse on the managed cloud meters on "units," and you have to think about sampling once you grow past the free tier. Self-hosted Langfuse has no metering, so this only matters on cloud.
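For reference, the "sampling math" is standard OTel head sampling; a sketch with the OpenTelemetry Python SDK:

```python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

# Keep roughly 20% of traces to stay under a metered unit budget; child spans
# follow their parent's decision. With 100% capture this knob does not exist.
provider = TracerProvider(sampler=ParentBased(TraceIdRatioBased(0.2)))
```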
Evals: depth vs integration
Langfuse's eval model is dataset-and-experiment first. You build datasets, run experiments against prompts or models, attach custom scores (human or LLM-judge or rule-based), and compare experiment runs. It is a clean offline workflow and similar in shape to what experienced teams already do. Online evals are supported through the SDK; you log scores against traces as they happen.
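The online path, roughly, in code (the score call matches the v2 Python SDK; newer versions rename some of these methods):

```python
from langfuse import Langfuse

langfuse = Langfuse()  # reads LANGFUSE_PUBLIC_KEY / SECRET_KEY / HOST from env

trace_id = "abc-123"   # id of a trace your application produced earlier

# Attach a score to the live trace -- this is the SDK-hook online workflow.
langfuse.score(
    trace_id=trace_id,
    name="answer_relevance",
    value=0.87,
    comment="LLM-judge: answer cites the retrieved passage",
)
```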
Respan's eval model is online-first and integrated. Every trace can be scored by configurable LLM-judges or rule-based checks the moment it lands. The judges are versioned, scoped, and can be triggered against subsets of traffic. Offline evals exist too (datasets, runs, comparisons), but the unique thing is that production traffic is constantly being scored, so you see quality regressions in your dashboards rather than only after running an experiment.
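To show what that configuration expresses, here is a conceptual sketch; the field names are invented for illustration, not Respan's actual schema:

```python
# Conceptual illustration only -- field names are invented, not Respan's schema.
online_eval = {
    "name": "support-answer-quality",
    "sample_rate": 0.25,                  # score a quarter of matching traffic
    "filter": {"prompt": "support-v3"},   # only traces from this prompt version
    "checks": [
        {"type": "rule", "assert": "len(output) < 2000"},
        {"type": "llm_judge", "model": "gpt-4o-mini",
         "criteria": "Does the answer resolve the user's issue politely?"},
    ],
}
```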
If your team runs offline evals as a release gate (load dataset, run prompts, compare scores, ship), Langfuse covers this well. If you want continuous quality measurement on live traffic without writing scoring code yourself, Respan is shaped for that. Most teams want both; the question is which workflow your evals leader prefers as the default.
Honest caveat: Respan is less specialized than Braintrust on the deep offline eval workflow. If you are running rigorous before-and-after model comparisons with bespoke scoring functions every week, neither Respan nor Langfuse will match Braintrust's depth. For most teams, this is fine; the offline workflow you need is more pedestrian than the eval-tool marketing pages suggest.
See LLM evals and how to evaluate an LLM for more.
Prompt management
Both products ship real prompt management, which is more than most observability tools do.
Langfuse has a prompt registry with versioning, caching (so production reads are fast), a playground for iteration, and experiments that tie prompts to datasets and scores. It is well thought out and the caching layer is genuinely useful.
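Fetching a versioned prompt at runtime looks like this (get_prompt and compile are from the Langfuse Python SDK; the client-side cache is what keeps this read fast):

```python
from langfuse import Langfuse

langfuse = Langfuse()

# Served from the local cache after the first fetch, so production reads
# do not block on the network.
prompt = langfuse.get_prompt("movie-critic")
compiled = prompt.compile(movie="Dune 2")  # fills the {{movie}} variable
```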
Respan has prompt versioning, A/B testing on live traffic, rollback, and tight coupling to the eval system (you can route a percentage of traffic to a new prompt version and watch online eval scores diverge in real time). The A/B testing piece is where Respan pulls ahead for teams that ship prompt changes weekly.
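Mechanically, a prompt A/B test is a weighted split over versions with eval scores joined back per variant. A conceptual sketch of the split itself (this is what the platform does for you server-side, not Respan's API):

```python
import random

# Conceptual sketch: 90/10 split between the live prompt and a candidate.
PROMPT_VARIANTS = [("support-v3", 0.9), ("support-v4-candidate", 0.1)]

def pick_prompt_version() -> str:
    versions, weights = zip(*PROMPT_VARIANTS)
    return random.choices(versions, weights=weights, k=1)[0]
```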
For a deeper look, see best prompt management tools and what is prompt versioning.
Gateway: the biggest structural difference
This is the cleanest distinction between the two products.
Langfuse does not ship an LLM gateway. They delegate that to LiteLLM, which is a separate open-source project. The integration is solid (LiteLLM logs into Langfuse out of the box), but you operate two products: LiteLLM as the proxy and Langfuse as the observability backend. If you want provider fallback, rate limiting, key management, or model routing, you configure it in LiteLLM.
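The wiring is genuinely minimal; this is LiteLLM's documented Langfuse callback, with credentials read from the LANGFUSE_* environment variables:

```python
import litellm

# LiteLLM's built-in Langfuse integration: every successful completion is
# logged as a trace, using LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY from env.
litellm.success_callback = ["langfuse"]

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
```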
Respan ships a built-in LLM gateway. 500+ models behind a single OpenAI-compatible endpoint, provider fallback, key management, caching, rate limiting, and load balancing. Because the gateway and the observability live in the same product, traces are populated automatically and routing decisions are visible in the trace itself.
If you already love LiteLLM, the Langfuse setup is fine. If you do not want to operate two products, Respan's bundled gateway is the simpler answer. See what is an LLM gateway and best LLM gateways for the broader landscape.
Self-host story
This is where Langfuse is meaningfully ahead.
Langfuse is MIT-licensed and free to self-host on any plan tier. Docker Compose for a single box. Kubernetes templates for production. The full feature set (traces, prompts, evals, datasets, playground) ships in the OSS image. The happy side effect is that you can evaluate the product fully without ever talking to sales, and you can move to the managed cloud later if you want to stop operating Postgres and ClickHouse yourself.
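The single-box quickstart really is this short (per the Langfuse self-hosting docs; the repo layout may drift over time):

```bash
git clone https://github.com/langfuse/langfuse.git
cd langfuse
docker compose up
```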
Respan offers self-hosting only on the Enterprise tier. Pro is managed cloud only. The Enterprise self-host package includes the full stack (gateway, observability, evals, prompts) in a Kubernetes deployment, but it is not the path for a solo developer or an early-stage team.
For regulated industries, air-gapped environments, or teams with a strict "no SaaS for observability" policy, Langfuse is the obvious answer below Enterprise scale. Respan competes on Enterprise self-host when you also want the gateway and unified platform; Langfuse competes everywhere on free self-host.
Pricing
Both publish pricing; the numbers below were verified against their pricing pages at the time of writing.
Langfuse:
- Self-hosted: free, forever, full feature set (MIT)
- Hobby cloud: free, 50k units/month, 30-day retention
- Core: $29/month, 100k units included, $8 per 100k after, 90-day retention
- Pro: $199/month, 100k units included, same overage, 3-year retention, SOC2/HIPAA
- Teams add-on: $300/month (SSO, RBAC)
- Enterprise: $2,499/month base
Respan:
- Free tier: a generous trace allowance, enough for most production starts
- Pro: usage-based, includes the full platform (observability + evals + prompts + gateway) without per-feature unbundling
- Enterprise: custom, includes self-host
The way to think about this honestly: at small volumes, Langfuse self-hosted is free if you discount your time to zero. At small managed volumes ($29-$200/month range), Langfuse is the cheapest path to a hosted observability product if you do not need a gateway. Respan's unified pricing makes more sense once you stop wanting to glue four products together.
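To make the Langfuse unit math concrete: a hypothetical month at 500k units costs $61 on Core ($29 base plus four extra 100k blocks at $8 each) and $231 on Pro ($199 plus the same $32 overage). Past the included units, the tier choice is about retention and compliance, not volume.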
Community and ecosystem
Langfuse wins this dimension clearly. They have the largest open-source LLM observability community in the world, the most GitHub stars in the category, and the broadest set of third-party integrations from the OSS LLM ecosystem (LiteLLM, vLLM, Ollama, LangChain, LlamaIndex, etc.). If you ask in a Discord for "anyone running observability on a local Llama deployment," the answer half the time is Langfuse.
Respan's community is smaller. The Discord is active and the team is responsive (we are still small enough that engineers answer in there directly), but we do not have Langfuse's GitHub footprint and we do not pretend otherwise. If "vibrant OSS community" is a core requirement, Langfuse is the right pick.
How to choose
A decision framework that holds up across the teams I have talked to:
Pick Langfuse if:
- You want a free, self-host-first observability product and your team is happy operating Postgres + ClickHouse
- Your security or compliance team requires source-available observability
- You are already deep on the OSS LLM stack (LiteLLM as gateway, Ollama or vLLM for inference) and want a backbone that matches
- You only need observability and light evals, not a full unified platform
- You want the largest community and the most third-party integrations
Pick Respan if:
- You want one product for observability, evals, prompt management, and an LLM gateway
- You want a managed LLM gateway with provider fallback and don't want to operate LiteLLM separately
- You want continuous online evals on production traffic by default, not as an SDK opt-in
- You want 100% trace capture without thinking about sampling
- You want prompt A/B testing wired into evals out of the box
Pick both if:
- You want to validate an OSS posture with self-hosted Langfuse while running Respan in production for the gateway and unified platform
This is rare, but it happens; the two products do not interfere.
Frank's take
If I were starting a side project and wanted free observability, I would deploy Langfuse on a $10/month VPS and never look back for the first six months. The product is good and the price is right.
If I were starting a production application at a real company, I would default to Respan because the gateway alone saves a meaningful amount of operational complexity, and the integrated evals + prompts mean my AI engineer is not stitching four products together. The trade is real: I lose the option of fully owning my data plane until Enterprise, and I am betting on a smaller community.
This is the honest pattern I see in conversations: developers from large eng orgs tend to pick Respan because they want fewer products, fewer SLAs to manage, and the unified data model. Developers from OSS-first cultures and from teams that need self-host for compliance tend to pick Langfuse. Neither group is wrong.
FAQ
Is Respan a fork of Langfuse? No. Respan is a separate codebase built around a different architecture (unified platform including the gateway). Both products use OpenTelemetry as the wire format, which is why their instrumentation looks similar.
Can I migrate from Langfuse to Respan? Yes. Both products accept OpenTelemetry, so the SDK changes are minimal. The harder work is migrating historical traces (no automated tool ships today) and re-creating prompts and datasets. Most teams that switch do so for the gateway and the unified eval workflow.
Is Langfuse really free if I self-host? Yes. The full product is MIT-licensed. The cost is operational: you run Postgres, ClickHouse, Redis, and the Langfuse web/worker pods. Budget an engineer-day per month to keep it healthy, more if you run at high volume.
Does Respan have an open-source SDK? The Respan SDK is open source. The platform itself is closed source, except for the Enterprise self-host distribution, which is delivered as a Kubernetes package.
Does Langfuse have an LLM gateway? Not directly. They integrate with LiteLLM, which is a separate open-source project that acts as a gateway. If you want a built-in gateway in the same product, Respan ships one; Langfuse does not.
Which has better evals? Different shapes. Langfuse leans offline experiments and datasets. Respan leans online evals on production traffic with offline workflows attached. Braintrust beats both on rigorous offline eval depth. For most production teams the difference is workflow preference, not capability.
Can I use both? Technically yes (point your OTel exporters at both endpoints), but very few teams do this in practice. Pick one and commit.
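If you do run both, the fan-out is one tracer provider with two span processors (the endpoint URLs below are placeholders; both products document their OTLP paths, and auth headers are omitted here):

```python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# One provider, two exporters: every span ships to both backends.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(
    OTLPSpanExporter(endpoint="https://cloud.langfuse.com/api/public/otel/v1/traces")))
provider.add_span_processor(BatchSpanProcessor(
    OTLPSpanExporter(endpoint="https://api.respan.example/otel/v1/traces")))
```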