Together AI (tracing)

The Together AI Python SDK is the official client for Together’s inference platform. The respan-instrumentation-together package patches the generated Together SDK resources and emits Respan spans for direct Together AI calls.

Set up Respan

1. Sign up: create an account at platform.respan.ai.
2. Create an API key: generate one on the API keys page.

Use Respan Gateway

See Together AI gateway setup to route Together AI model calls through the Respan gateway.

Example projects

Python examples

Setup

Install packages

$ pip install respan-ai respan-instrumentation-together together python-dotenv

Set environment variables

$ export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"
$ export TOGETHER_API_KEY="YOUR_TOGETHER_API_KEY"

Optional:

$ export TOGETHER_MODEL="meta-llama/Llama-3.2-3B-Instruct-Turbo"

TOGETHER_API_KEY is used by the Together SDK for direct provider calls. RESPAN_API_KEY is used by Respan for trace export.
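
Since both keys come from the environment, a short preflight check can report a missing variable before any client is created. The sketch below is only illustrative: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names, not part of either SDK.

```python
import os
from typing import Iterable, Mapping

# The two variables this guide asks you to export.
REQUIRED_VARS = ("RESPAN_API_KEY", "TOGETHER_API_KEY")


def missing_env_vars(
    required: Iterable[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]
```

A script could call `missing_env_vars()` at startup and exit with a clear message if the returned list is not empty.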

Initialize and run

import os

from dotenv import load_dotenv
from respan import Respan, workflow
from respan_instrumentation_together import TogetherInstrumentor
from together import Together

# Load RESPAN_API_KEY and TOGETHER_API_KEY from a local .env file, if present.
load_dotenv()

# Initialize Respan with the Together instrumentation enabled.
respan = Respan(
    api_key=os.environ["RESPAN_API_KEY"],
    instrumentations=[TogetherInstrumentor()],
)
client = Together(api_key=os.environ["TOGETHER_API_KEY"])


@workflow(name="together_chat_completion")
def run_chat() -> str:
    response = client.chat.completions.create(
        model=os.getenv("TOGETHER_MODEL", "meta-llama/Llama-3.2-3B-Instruct-Turbo"),
        messages=[
            {
                "role": "user",
                "content": "Reply with one concise sentence about tracing.",
            }
        ],
    )
    return response.choices[0].message.content or ""


try:
    print(run_chat())
finally:
    # Flush pending spans, then shut Respan down, even if the call failed.
    respan.flush()
    respan.shutdown()

View your trace

Open the Traces page and search for the workflow name together_chat_completion.

Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `api_key` | `str \| None` | `None` | Respan API key. Falls back to `RESPAN_API_KEY`. |
| `base_url` | `str \| None` | `None` | Respan API base URL. Falls back to `RESPAN_BASE_URL`. |
| `instrumentations` | `list` | `[]` | Plugin instrumentations to activate, such as `TogetherInstrumentor()`. |
| `customer_identifier` | `str \| None` | `None` | Default customer identifier for all exported spans. |
| `metadata` | `dict \| None` | `None` | Default metadata attached to all exported spans. |
| `environment` | `str \| None` | `None` | Environment tag, such as `"production"`. |
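
As a rough illustration of how these parameters might be assembled in one place, the sketch below collects them into a keyword-argument dict. `respan_options`, the `APP_ENV` variable, and the `"together-demo"` service name are assumptions made for this example. Only the parameter names come from the table.

```python
import os
from typing import Any, Mapping


def respan_options(env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Collect the documented Respan constructor parameters into one dict."""
    return {
        # Left as None when unset so Respan can apply its own fallbacks.
        "api_key": env.get("RESPAN_API_KEY"),
        "base_url": env.get("RESPAN_BASE_URL"),
        # APP_ENV is a hypothetical variable used only in this example.
        "environment": env.get("APP_ENV", "development"),
        # No default customer; set one per request instead if needed.
        "customer_identifier": None,
        # Example default metadata attached to every exported span.
        "metadata": {"service": "together-demo"},
    }
```

The result could then be passed along as `Respan(instrumentations=[TogetherInstrumentor()], **respan_options())`.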

Supported calls

| SDK call | What is traced |
| --- | --- |
| `client.chat.completions.create(...)` | Chat completion spans |
| `client.chat.completions.create(..., stream=True)` | Streaming chat completion spans after the stream is consumed or closed |
| `await async_client.chat.completions.create(...)` | Async chat completion spans |
| `client.completions.create(...)` | Text completion spans |
| `client.embeddings.create(...)` | Embedding spans with vector payloads summarized |
| `client.rerank.create(...)` | Rerank spans |
| `client.images.generate(...)` | Image generation spans |
| `tools=[...]` / `tool_calls` responses | Tool definitions and assistant tool calls |

Attributes

Attach customer identifiers, thread IDs, workflow names, and metadata to Together AI calls with propagate_attributes.

from respan import propagate_attributes

with propagate_attributes(
    customer_identifier="user_123",
    thread_identifier="conversation_456",
    trace_group_identifier="together_support_chat",
    metadata={"plan": "pro", "workflow_name": "together_support_chat"},
):
    response = client.chat.completions.create(
        model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
        messages=[{"role": "user", "content": "Summarize our support policy."}],
    )

| Attribute | Type | Description |
| --- | --- | --- |
| `customer_identifier` | `str` | Identifies the end user in Respan analytics. |
| `thread_identifier` | `str` | Groups related messages into a conversation. |
| `trace_group_identifier` | `str` | Groups spans by workflow name. |
| `metadata` | `dict` | Custom key-value pairs merged with default metadata. |

Examples

Streaming chat

stream = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Write a short haiku about traces."}],
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)

Async chat

import asyncio
import os

from together import AsyncTogether


async def main() -> None:
    client = AsyncTogether(api_key=os.environ["TOGETHER_API_KEY"])
    response = await client.chat.completions.create(
        model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
        messages=[{"role": "user", "content": "Explain async tracing briefly."}],
    )
    print(response.choices[0].message.content)
    await client.close()


asyncio.run(main())

Embeddings

response = client.embeddings.create(
    model="BAAI/bge-base-en-v1.5",
    input=[
        "Respan traces Together AI chat calls.",
        "Embeddings should not export vector payloads.",
    ],
)
print(len(response.data))
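
Embedding vectors are typically compared rather than printed. The sketch below computes cosine similarity with the standard library only. `cosine_similarity` is a hypothetical helper, and it assumes each entry in `response.data` exposes its vector as `.embedding`.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat its similarity as 0.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

For the two inputs above, this might be called as `cosine_similarity(response.data[0].embedding, response.data[1].embedding)`.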

Rerank

response = client.rerank.create(
    model="Salesforce/Llama-Rank-v1",
    query="Which document is about observability?",
    documents=[
        "Distributed tracing shows how requests move through services.",
        "Sourdough bread needs flour, water, salt, and patience.",
    ],
    top_n=1,
    return_documents=True,
)
print(response.results[0].index)

Image generation

response = client.images.generate(
    model="black-forest-labs/FLUX.1-schnell-Free",
    prompt="A small line-art observability dashboard icon",
    n=1,
    width=256,
    height=256,
    steps=4,
    response_format="url",
)
print(len(response.data))

Tool calling

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct-Turbo",
    messages=[{"role": "user", "content": "What is the weather in Tokyo?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
)
print(response.choices[0].message.tool_calls)