Ollama

Trace Ollama local LLM calls with Respan. Before you begin:

  1. Sign up — Create an account at platform.respan.ai
  2. Create an API key — Generate one on the API keys page
  3. Add credits or a provider key — Add credits on the Credits page or connect your own provider key on the Integrations page

Add the Docs MCP to your AI coding tool to get help building with Respan. No API key needed.

{
  "mcpServers": {
    "respan-docs": {
      "url": "https://docs.respan.ai/mcp"
    }
  }
}

What is Ollama?

Ollama lets you run large language models locally. It provides a simple CLI and API for downloading, running, and managing models like Llama, Mistral, and Gemma on your own hardware.
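For a quick sanity check before wiring up tracing, you can talk to a local model with the ollama Python client directly. A minimal sketch, assuming the Ollama server is running on its default port (localhost:11434) and you want the llama3.1 model used in the example below:

from ollama import Client

client = Client()  # defaults to http://localhost:11434

# Download the model if it is not already present locally
client.pull("llama3.1")

response = client.generate(model="llama3.1", prompt="Why is the sky blue?")
print(response["response"])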

Setup

1. Install packages

pip install respan-ai opentelemetry-instrumentation-ollama ollama python-dotenv

2. Set environment variables

export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"
export OTEL_EXPORTER_OTLP_ENDPOINT="https://api.respan.ai/api"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer $RESPAN_API_KEY"
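Because the example below calls load_dotenv(), you can instead keep these values in a .env file next to your script. A sketch, with the bearer token written out literally since python-dotenv is not guaranteed to expand $RESPAN_API_KEY the way a shell would:

RESPAN_API_KEY=YOUR_RESPAN_API_KEY
OTEL_EXPORTER_OTLP_ENDPOINT=https://api.respan.ai/api
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer YOUR_RESPAN_API_KEY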
3. Initialize and run

from dotenv import load_dotenv

# Load RESPAN_API_KEY and the OTEL_* exporter settings before importing Respan
load_dotenv()

from ollama import Client
from respan import Respan

# Auto-discover and activate all installed instrumentors
respan = Respan(is_auto_instrument=True)

# Calls to Ollama are auto-traced by Respan
client = Client()

response = client.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "Say hello in three languages."}],
)
print(response["message"]["content"])

# Export any buffered spans before the script exits
respan.flush()
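Streaming responses are captured as well (see "Streaming chunks" under What gets traced). A minimal sketch, reusing the client and respan objects from the script above and the ollama client's documented stream=True option:

# Stream the reply chunk by chunk; spans still flow through Respan
stream = client.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "Count to five."}],
    stream=True,
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)
print()
respan.flush()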
4. View your trace

Open the Traces page to see your auto-instrumented LLM spans.

What gets traced

  • Model name and version
  • Prompt and completion tokens
  • Input/output content
  • Response latency
  • Streaming chunks

Traces appear in the Traces dashboard.

Learn more