Open Source
Free
- Full platform access
- Self-hosted
- Apache 2.0 license
- All GenAI features included
MLflow is the leading open-source platform for managing the end-to-end machine learning lifecycle, now expanded into a comprehensive GenAI engineering platform. Created by Matei Zaharia (also the creator of Apache Spark) at Databricks in 2018 and donated to the Linux Foundation in 2020, MLflow has grown to over 20,000 GitHub stars and 60 million monthly downloads, making it one of the most widely adopted ML tools in the world.
With the release of MLflow 3.0 in June 2025, the platform was repositioned as a unified AI engineering platform for agents, LLMs, and ML models. Its GenAI capabilities include OpenTelemetry-compatible tracing for LLM observability, 50+ built-in evaluation metrics with LLM-as-judge support, prompt versioning and optimization, and a built-in AI Gateway that provides unified API access to major LLM providers with rate limiting and cost control. The platform auto-traces 50+ AI frameworks, including OpenAI, Anthropic, LangChain, LlamaIndex, and DSPy.
MLflow is used by over 19,000 companies globally, including large enterprises such as Amazon, Microsoft, Google, and BNP Paribas. While it is 100% free and open source under the Apache 2.0 license, Databricks offers a fully managed MLflow experience integrated into its cloud data platform. MLflow's distinguishing strength is that it combines traditional MLOps capabilities (experiment tracking, model registry, deployment) with modern GenAI observability in a single tool.
What's included in each plan, and how the tiers compare.
- Free
- $0.40/DBU (usage-based)
- $0.55/DBU (usage-based)
- $0.65/DBU (usage-based)
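As a rough illustration of how per-DBU, usage-based rates like these turn into a bill, the sketch below multiplies a DBU count by each listed rate. The 1,000-DBU monthly workload and the helper name `estimate_cost` are assumptions made for illustration only. The source does not say which tier each rate belongs to, so the rates are listed by price alone, and a flat rate is assumed even though a real invoice may depend on other factors.

```python
from decimal import Decimal

# Per-DBU rates as listed above; the source does not name the tiers.
LISTED_RATES_USD_PER_DBU = [Decimal("0.40"), Decimal("0.55"), Decimal("0.65")]


def estimate_cost(dbus: int, rate_usd_per_dbu: Decimal) -> Decimal:
    """Estimate the charge in USD for `dbus` units at a flat per-DBU rate."""
    if dbus < 0:
        raise ValueError("dbus must be non-negative")
    # Round to whole cents so the result reads like a price.
    return (Decimal(dbus) * rate_usd_per_dbu).quantize(Decimal("0.01"))


if __name__ == "__main__":
    hypothetical_monthly_dbus = 1_000  # assumed workload, not from the source
    for rate in LISTED_RATES_USD_PER_DBU:
        cost = estimate_cost(hypothetical_monthly_dbus, rate)
        print(f"${rate}/DBU x {hypothetical_monthly_dbus} DBUs = ${cost}")
```

`Decimal` is used instead of `float` so that rates such as 0.55 are represented exactly and the rounded totals stay predictable.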
Best for: ML engineers and AI teams, especially those in the Databricks ecosystem
MLflow and Respan complement each other in the AI observability stack. While MLflow provides experiment tracking, model registry, and GenAI tracing for development workflows, Respan adds production-grade LLM gateway management, cost optimization, and real-time monitoring. Teams can use MLflow for development-time evaluation and Respan for production observability.
Top companies in Observability, Prompts & Evals you can use instead of MLflow.
- Respan: LLM tracing, evals, and gateway
- LangSmith: Trace visualization for LLM chains
- Weights & Biases: ML experiment tracking
- Langfuse: Open-source LLM observability
- Arize AI: ML observability with LLM support
- Helicone
- Traceloop: OpenTelemetry
- Datadog LLM: LLM monitoring within Datadog platform
- Braintrust: Real-time LLM logging and tracing
- HoneyHive: Prompt management
- Patronus AI: Automated LLM evaluation platform
- Phoenix: OpenTelemetry-based LLM and agent tracing
- Promptfoo
- Portkey
- Humanloop
- Sentry
- DeepEval
- Ragas: RAG-specific evaluation framework
- LangWatch: Multi-turn agent simulation testing
- Galileo AI: LLM output quality evaluation
- PromptLayer
- Maxim AI: Distributed tracing for LLM and agent apps
- Confident AI: DeepEval open-source evaluation framework
- Opik
- Agenta
- Future AGI: Multimodal evaluation (text, image, audio, video)
- Lunary
- Parea AI
- Moda: Hallucination detection
- Ashr: Multi-modal synthetic testing
- Sentrial: Agent failure root cause analysis
- Athina AI
- Chamber: ML infrastructure automation
Last verified: March 27, 2026