Amazon SageMaker

Trace Amazon SageMaker endpoint invocations with Respan.
  1. Sign up — Create an account at platform.respan.ai
  2. Create an API key — Generate one on the API keys page
  3. Add credits or a provider key — Add credits on the Credits page or connect your own provider key on the Integrations page

Add the Docs MCP to your AI coding tool to get help building with Respan. No API key needed.

{
  "mcpServers": {
    "respan-docs": {
      "url": "https://docs.respan.ai/mcp"
    }
  }
}

What is Amazon SageMaker?

Amazon SageMaker is AWS’s fully managed machine learning platform. It provides tools for building, training, and deploying ML models at scale, including real-time inference endpoints for LLMs and other models.

Setup

1. Install packages

$ pip install respan-ai opentelemetry-instrumentation-sagemaker boto3 python-dotenv
2. Set environment variables

$ export AWS_ACCESS_KEY_ID="YOUR_AWS_ACCESS_KEY_ID"
$ export AWS_SECRET_ACCESS_KEY="YOUR_AWS_SECRET_ACCESS_KEY"
$ export AWS_REGION="us-east-1"
$ export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"
$ export OTEL_EXPORTER_OTLP_ENDPOINT="https://api.respan.ai/api"
$ export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer $RESPAN_API_KEY"
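Before running the example, it can help to confirm all four variables are actually set; a missing `RESPAN_API_KEY`, for instance, silently produces unauthenticated export requests. A minimal sketch (the `missing_vars` helper is illustrative, not part of Respan):

```python
import os

REQUIRED = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "RESPAN_API_KEY",
]

def missing_vars(env=os.environ):
    # Return the names of required variables that are unset or empty
    return [name for name in REQUIRED if not env.get(name)]

if missing_vars():
    print("Missing environment variables:", ", ".join(missing_vars()))
```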
3. Initialize and run

import os
import json
from dotenv import load_dotenv

load_dotenv()

import boto3
from respan import Respan

# Auto-discover and activate all installed instrumentors
respan = Respan(is_auto_instrument=True)

# Invoke a SageMaker endpoint; the call is auto-traced by Respan
client = boto3.client("sagemaker-runtime", region_name=os.getenv("AWS_REGION"))

payload = json.dumps({"inputs": "Say hello in three languages."})

response = client.invoke_endpoint(
    EndpointName="my-llm-endpoint",
    ContentType="application/json",
    Body=payload,
)
result = json.loads(response["Body"].read().decode())
print(result)
respan.flush()
4. View your trace

Open the Traces page to see your auto-instrumented endpoint invocation spans.

What gets traced

  • Endpoint name and model
  • Input/output payloads
  • Invocation latency
  • Token usage (for language model endpoints)
  • Error responses

Traces appear in the Traces dashboard.
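For endpoints deployed with response streaming, boto3 also exposes `invoke_endpoint_with_response_stream`, which yields events carrying `PayloadPart` chunks; whether Respan's auto-instrumentation covers streaming calls is not stated here, so treat this as a sketch. The `collect_stream` helper is illustrative:

```python
def collect_stream(event_stream):
    # Each streamed event carries a PayloadPart dict with raw bytes; join them
    chunks = []
    for event in event_stream:
        part = event.get("PayloadPart")
        if part:
            chunks.append(part["Bytes"])
    return b"".join(chunks).decode("utf-8")

# Usage against a live endpoint:
# response = client.invoke_endpoint_with_response_stream(
#     EndpointName="my-llm-endpoint",
#     ContentType="application/json",
#     Body=payload,
# )
# print(collect_stream(response["Body"]))
```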

Learn more