Ollama (tracing)

Ollama lets you run large language models locally. respan-instrumentation-ollama instruments the official Ollama Python client and emits chat, generation, and embedding spans into the Respan tracing pipeline.

  1. Sign up - Create an account at platform.respan.ai
  2. Create an API key - Generate one on the API keys page

See Ollama gateway setup to route Ollama model calls through the Respan gateway.

Setup

1. Install packages

$ pip install respan-ai respan-instrumentation-ollama ollama python-dotenv

ollama is the official Ollama Python client. A running Ollama server is required for real model calls.

2. Set environment variables

$ export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"

Optional:

$ export RESPAN_BASE_URL="https://api.respan.ai/api"
$ export OLLAMA_HOST="http://localhost:11434"
$ export OLLAMA_MODEL="llama3.2"

3. Initialize and run

import os

from dotenv import load_dotenv
from ollama import Client
from respan import Respan, workflow
from respan_instrumentation_ollama import OllamaInstrumentor

load_dotenv()

respan = Respan(
    api_key=os.environ["RESPAN_API_KEY"],
    base_url=os.getenv("RESPAN_BASE_URL", "https://api.respan.ai/api"),
    instrumentations=[OllamaInstrumentor()],
)
client = Client(host=os.getenv("OLLAMA_HOST"))


@workflow(name="ollama_chat")
def run_chat() -> str:
    response = client.chat(
        model=os.getenv("OLLAMA_MODEL", "llama3.2"),
        messages=[{"role": "user", "content": "Reply with one concise sentence."}],
    )
    return response["message"]["content"]


try:
    print(run_chat())
finally:
    respan.flush()
    respan.shutdown()

4. View your trace

Open the Traces page and search for the workflow name ollama_chat.
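If no trace appears, it can help to confirm that the API key is set and the Ollama server is reachable before running the script. The following is a minimal sketch using only the Python standard library. It is not part of Respan or the Ollama client: missing_env_vars and ollama_is_reachable are hypothetical helper names, and the check assumes the server answers a plain HTTP GET on its root URL with status 200.

```python
import os
import urllib.error
import urllib.request


def missing_env_vars(required=("RESPAN_API_KEY",), env=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def ollama_is_reachable(host="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP GET to the Ollama host succeeds (assumed health check)."""
    try:
        with urllib.request.urlopen(host, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or malformed URL.
        return False
```

A script could call missing_env_vars() and ollama_is_reachable(os.getenv("OLLAMA_HOST", "http://localhost:11434")) at startup and exit with a clear message if either check fails.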

Configuration

OllamaInstrumentor() takes no arguments. It patches the chat, generate, embed, and embeddings methods of the official Ollama Client and AsyncClient.
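Because AsyncClient is patched as well, asynchronous calls should be traced the same way as synchronous ones. The sketch below shows one way to issue an async chat call. It assumes Respan has already been initialized with OllamaInstrumentor() as in the setup script; run_async_chat and main are hypothetical names used only for illustration.

```python
import asyncio
import os


async def run_async_chat(client, model, prompt):
    """Send one user message through an async client and return the reply text."""
    response = await client.chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response["message"]["content"]


async def main():
    # Imported here so the helper above does not depend on the package at import time.
    from ollama import AsyncClient

    client = AsyncClient(host=os.getenv("OLLAMA_HOST"))
    reply = await run_async_chat(
        client,
        os.getenv("OLLAMA_MODEL", "llama3.2"),
        "Reply with one concise sentence.",
    )
    print(reply)
```

Run it with asyncio.run(main()) against a running Ollama server, and call respan.flush() afterwards as in the synchronous example.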

Attributes

Attach customer identifiers, thread IDs, workflow names, and metadata to Ollama calls with propagate_attributes.

from respan import propagate_attributes

with propagate_attributes(
    customer_identifier="user_123",
    thread_identifier="conversation_456",
    trace_group_identifier="ollama_support_chat.workflow",
    metadata={"plan": "pro", "workflow_name": "ollama_support_chat.workflow"},
):
    response = client.chat(
        model="llama3.2",
        messages=[{"role": "user", "content": "Summarize our support policy."}],
    )

| Attribute | Type | Description |
| --- | --- | --- |
| customer_identifier | str | Identifies the end user in Respan analytics. |
| thread_identifier | str | Groups related messages into a conversation. |
| trace_group_identifier | str | Groups spans by workflow name. |
| metadata | dict | Custom key-value pairs merged with default metadata. |
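When the same attributes are attached in many places, building them in one function keeps the keys consistent. The following is a small sketch under that assumption; build_attributes is a hypothetical helper, not a Respan API, and it only produces the keyword arguments shown in the example above.

```python
def build_attributes(user_id, conversation_id, workflow_name, extra_metadata=None):
    """Build keyword arguments for propagate_attributes from request data."""
    metadata = {"workflow_name": workflow_name}
    metadata.update(extra_metadata or {})
    return {
        "customer_identifier": str(user_id),
        "thread_identifier": str(conversation_id),
        "trace_group_identifier": workflow_name,
        "metadata": metadata,
    }
```

The result can be unpacked into the context manager, for example: with propagate_attributes(**build_attributes("user_123", "conversation_456", "ollama_support_chat.workflow", {"plan": "pro"})).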

Examples

Streaming generation

stream = client.generate(
    model="llama3.2",
    prompt="Write a short haiku about observability.",
    stream=True,
)

for chunk in stream:
    print(chunk["response"], end="", flush=True)
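The loop above prints each chunk and then discards it. If the full text is needed afterwards, the chunks can be accumulated while streaming. This is a minimal sketch; collect_stream is a hypothetical helper that assumes each chunk exposes its text under the "response" key, as in the loop above.

```python
def collect_stream(chunks, echo=True):
    """Optionally print each streamed chunk and return the concatenated text."""
    parts = []
    for chunk in chunks:
        text = chunk["response"]
        if echo:
            print(text, end="", flush=True)
        parts.append(text)
    return "".join(parts)
```

With the stream from the example, full_text = collect_stream(stream) would print the haiku as it arrives and keep the complete string.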

Embeddings

response = client.embed(
    model="nomic-embed-text",
    input=["Respan records traces for AI systems."],
)
print(len(response["embeddings"][0]))
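A common next step with embeddings is comparing two texts. The sketch below computes cosine similarity in plain Python; cosine_similarity is an illustrative helper, not part of the Ollama client or Respan, and returning 0.0 for a zero-length vector is a choice made for this example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity is undefined for a zero vector; 0.0 is a convention here
    return dot / (norm_a * norm_b)
```

Passing two strings in the input list to client.embed yields two vectors in response["embeddings"], which can then be compared with cosine_similarity(response["embeddings"][0], response["embeddings"][1]).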