OpenAI SDK

Overview

respan-instrumentation-openai is an instrumentation plugin that auto-traces all OpenAI SDK calls (ChatCompletion, Completion, Embedding) using the opentelemetry-instrumentation-openai library. It also applies a sync prompt-capture patch to ensure input messages are reliably captured.

```shell
pip install respan-instrumentation-openai
```

Version: 1.0.0 | Python: >=3.11, <3.14

Dependencies

| Package | Version |
| --- | --- |
| respan-tracing | >=2.3.0 |
| openai | >=1.0.0 |
| opentelemetry-instrumentation-openai | >=0.48.0 |

Quick start

```python
from respan import Respan
from respan_instrumentation_openai import OpenAIInstrumentor
from openai import OpenAI

respan = Respan(
    api_key="your-api-key",
    instrumentations=[OpenAIInstrumentor()],
)

client = OpenAI()

# All OpenAI calls are automatically traced
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Public API

OpenAIInstrumentor

The main instrumentor class. Implements the Instrumentation protocol.

```python
from respan_instrumentation_openai import OpenAIInstrumentor
```

| Attribute/Method | Type | Description |
| --- | --- | --- |
| name | str | "openai", the unique plugin identifier. |
| activate() | () -> None | Instruments the OpenAI SDK via OTEL and applies the sync prompt-capture patch. |
| deactivate() | () -> None | Uninstruments the OpenAI SDK. |
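The Instrumentation protocol is defined in respan-tracing and not reproduced in this document; the following is a minimal sketch of the shape implied by the table above (the member names are taken from the table, but the protocol definition itself is an assumption):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Instrumentation(Protocol):
    """Hypothetical sketch of the Instrumentation protocol; the real
    definition lives in respan-tracing and may differ."""

    name: str

    def activate(self) -> None:
        """Enable tracing for the target SDK."""
        ...

    def deactivate(self) -> None:
        """Disable tracing for the target SDK."""
        ...
```

Any object exposing `name`, `activate()`, and `deactivate()` would satisfy this shape, which is why instrumentors can be passed as a plain list to `Respan(instrumentations=[...])`.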

activate()

When called, activate():

  1. Imports opentelemetry.instrumentation.openai.OpenAIInstrumentor (the OTEL community instrumentor).
  2. Calls instrumentor.instrument() if not already instrumented.
  3. Applies a sync prompt-capture patch (see below).
  4. Sets self._instrumented = True.

Calling activate() multiple times is safe (idempotent).
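The steps above can be sketched as follows. This is a hypothetical reconstruction, not the plugin's actual code: `FakeOTELInstrumentor` stands in for the real OTEL instrumentor so the sketch runs without OpenTelemetry installed, and the internal names are assumptions.

```python
class FakeOTELInstrumentor:
    """Stand-in for opentelemetry.instrumentation.openai.OpenAIInstrumentor."""

    def __init__(self):
        self._active = False

    def is_instrumented_by_opentelemetry(self) -> bool:
        return self._active

    def instrument(self) -> None:
        self._active = True

    def uninstrument(self) -> None:
        self._active = False


class OpenAIInstrumentorSketch:
    """Hypothetical reconstruction of activate()/deactivate()."""

    name = "openai"

    def __init__(self):
        self._otel = FakeOTELInstrumentor()   # step 1: the OTEL instrumentor
        self._instrumented = False

    def activate(self) -> None:
        if self._instrumented:
            return  # idempotent: repeated calls are no-ops
        if not self._otel.is_instrumented_by_opentelemetry():
            self._otel.instrument()           # step 2
        # step 3: the sync prompt-capture patch would be applied here
        self._instrumented = True             # step 4

    def deactivate(self) -> None:
        self._otel.uninstrument()
        self._instrumented = False
```

The `_instrumented` guard at the top of `activate()` is what makes repeated activation safe.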

deactivate()

Calls OpenAIInstrumentor.uninstrument() on the OTEL instrumentor and resets the internal state.

What gets captured

The OTEL OpenAI instrumentor automatically creates spans for:

| Operation | Span Name Pattern | Attributes |
| --- | --- | --- |
| chat.completions.create() | openai.chat | gen_ai.system, gen_ai.request.model, prompt/completion tokens, input messages, output |
| completions.create() | openai.completion | Same as above |
| embeddings.create() | openai.embeddings | gen_ai.system, gen_ai.request.model, token counts |

Span attributes

All spans carry standard GenAI semantic conventions:

| Attribute | Description |
| --- | --- |
| gen_ai.system | "openai" |
| gen_ai.request.model | Model name (e.g. "gpt-4o-mini") |
| gen_ai.usage.prompt_tokens | Input token count |
| gen_ai.usage.completion_tokens | Output token count |
| gen_ai.prompt.{i}.role | Role of the i-th input message |
| gen_ai.prompt.{i}.content | Content of the i-th input message |
| gen_ai.completion.{i}.role | Role of the i-th output message |
| gen_ai.completion.{i}.content | Content of the i-th output message |

The EnrichedSpan proxy in the exporter automatically injects llm.request.type="chat" for GenAI spans that carry gen_ai.system but lack llm.request.type. This ensures the Respan backend correctly parses prompt/completion data.
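A minimal sketch of that injection rule (the real EnrichedSpan is a proxy around an OTEL span inside the exporter; this standalone function models only the attribute logic):

```python
def enrich_genai_attributes(attrs: dict[str, object]) -> dict[str, object]:
    """Inject llm.request.type="chat" for GenAI spans that carry
    gen_ai.system but lack llm.request.type, as described above.

    Operates on a plain attribute dict for illustration; the real
    EnrichedSpan wraps a ReadableSpan rather than copying a dict.
    """
    out = dict(attrs)
    if "gen_ai.system" in out and "llm.request.type" not in out:
        out["llm.request.type"] = "chat"
    return out
```

Spans that already declare an `llm.request.type`, or that carry no `gen_ai.system` at all, pass through unchanged.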

Sync prompt-capture patch

Problem

opentelemetry-instrumentation-openai v0.52+ has an async def _handle_request() for the chat wrapper. In synchronous contexts, this runs through asyncio.run() or spawns a thread, which can silently lose prompt attributes when:

  • _set_request_attributes raises on response_format handling, killing the entire _handle_request before _set_prompts runs.
  • asyncio.run() / thread path has environment-specific issues (Lambda, Docker, etc.).

Fix

The plugin replaces the async _handle_request with a synchronous version that has fault isolation between sections:

  1. Request attributes (model, temperature, etc.) — errors are caught and logged, don’t block prompts.
  2. Prompt capture — runs synchronously with inline content (no base64 upload).
  3. Trace propagation — reasoning effort and context propagation.

This ensures prompts always appear on the dashboard, even if other attribute sections fail.
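The fault isolation can be illustrated with a standalone sketch. `RecordingSpan` and `handle_request_sync` are hypothetical stand-ins, not the plugin's actual code; the non-serializable `response_format` simulates the failure mode described above.

```python
import json
import logging

logger = logging.getLogger(__name__)


class RecordingSpan:
    """Minimal stand-in for an OTEL span: records attributes in a dict."""

    def __init__(self):
        self.attributes = {}

    def set_attribute(self, key, value):
        self.attributes[key] = value


def handle_request_sync(span, kwargs):
    """Sketch of a fault-isolated, synchronous request handler."""
    # Section 1: request attributes. Errors are caught and logged,
    # so they cannot prevent prompt capture below.
    try:
        span.set_attribute("gen_ai.request.model", kwargs["model"])
        if "response_format" in kwargs:
            # json.dumps raises TypeError on non-serializable values
            span.set_attribute(
                "gen_ai.request.response_format",
                json.dumps(kwargs["response_format"]),
            )
    except Exception:
        logger.warning("request attribute capture failed", exc_info=True)

    # Section 2: prompt capture, inline content (no base64 upload)
    try:
        for i, msg in enumerate(kwargs.get("messages", [])):
            span.set_attribute(f"gen_ai.prompt.{i}.role", msg["role"])
            span.set_attribute(f"gen_ai.prompt.{i}.content", msg["content"])
    except Exception:
        logger.warning("prompt capture failed", exc_info=True)
```

Even when section 1 raises partway through, section 2 still records every `gen_ai.prompt.*` attribute, which is the guarantee the patch is meant to provide.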

Combined with decorators

```python
from respan import Respan, workflow, task
from respan_instrumentation_openai import OpenAIInstrumentor
from openai import OpenAI

respan = Respan(
    api_key="your-api-key",
    instrumentations=[OpenAIInstrumentor()],
)
client = OpenAI()

@task(name="generate_summary")
def generate_summary(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Summarize the following text."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

@workflow(name="summarization_pipeline")
def summarize_documents(docs: list[str]) -> list[str]:
    return [generate_summary(doc) for doc in docs]
```

With propagated attributes

```python
from respan import Respan, propagate_attributes
from respan_instrumentation_openai import OpenAIInstrumentor
from openai import OpenAI

respan = Respan(
    api_key="your-api-key",
    instrumentations=[OpenAIInstrumentor()],
)
client = OpenAI()

def handle_request(user_id: str, message: str):
    with propagate_attributes(
        customer_identifier=user_id,
        metadata={"source": "api"},
    ):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": message}],
        )
        return response.choices[0].message.content
```

vs auto-instrumentation

respan-tracing includes built-in auto-instrumentation for OpenAI (via the Instruments.OPENAI enum). The difference:

| Feature | Auto-instrumentation | This plugin |
| --- | --- | --- |
| Setup | Automatic with RespanTelemetry() | Explicit with Respan(instrumentations=[...]) |
| Prompt capture | May fail in sync contexts | Patched for reliability |
| Control | Enabled/disabled via instruments/block_instruments | Activate/deactivate per instance |
| Recommended for | Simple setups | Production applications needing reliable prompt capture |

Architecture

```
openai.ChatCompletion.create()
        │
        ▼
opentelemetry-instrumentation-openai (OTEL community lib)
        │  + sync prompt-capture patch (from this plugin)
        ▼
standard OTEL span with gen_ai.* attributes
        │
        ▼
RespanSpanProcessor.on_end()
        │  filters via is_processable_span()
        │  enriches: entity_path, propagated attrs
        ▼
RespanSpanExporter.export()
        │  EnrichedSpan: injects llm.request.type="chat"
        │  converts to OTLP JSON
        ▼
POST /v2/traces → Respan backend
```