
Replicate


Replicate is a platform for running machine learning models in the cloud. It hosts thousands of open-source models and provides a simple API for running predictions without managing infrastructure. Respan gives you full observability over every prediction, input, and output, plus gateway routing through the OpenAI-compatible Respan endpoint.

Set up Respan

Create an account at platform.respan.ai and grab an API key. To use the gateway, also add credits or a provider key.

Run npx @respan/cli setup to set up with your coding agent.

Example projects
  • Python examples
Tracing

Setup

1. Install packages

$ pip install respan-ai opentelemetry-instrumentation-replicate replicate

2. Set environment variables

$ export REPLICATE_API_TOKEN="YOUR_REPLICATE_API_TOKEN"
$ export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"

REPLICATE_API_TOKEN is used for Replicate predictions. RESPAN_API_KEY is used to export traces to Respan.
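
Because both variables are required before anything runs, it can help to fail early with a clear message when one is missing. The following is a minimal sketch using only the standard library; the helper names missing_env_vars and check_env are hypothetical and not part of Respan or Replicate.

```python
import os

# Variable names taken from the setup step above.
REQUIRED_VARS = ("REPLICATE_API_TOKEN", "RESPAN_API_KEY")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def check_env(env=None):
    """Raise a descriptive error if any required variable is missing."""
    missing = missing_env_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling check_env() at the top of a script, before creating the Respan client, surfaces a missing key as one readable error instead of a failure deeper in the SDKs.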

3. Initialize and run

import replicate
from respan import Respan
from opentelemetry.instrumentation.replicate import ReplicateInstrumentor

# Activate the Replicate instrumentation when creating the Respan client.
respan = Respan(instrumentations=[ReplicateInstrumentor()])

output = replicate.run(
    "meta/llama-2-70b-chat",
    input={"prompt": "Say hello in three languages."},
)
# Join the output chunks into a single string.
print("".join(output))

# Flush pending trace data before the script exits.
respan.flush()

4. View your trace

Open the Traces page to see your auto-instrumented prediction spans with model, inputs, outputs, and latency.
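
The setup example calls respan.flush() as its last line, so an exception raised by the prediction would skip it. The sketch below shows one way to guarantee the flush with try/finally. It assumes that trace data is buffered until flush() is called, which the example suggests but this page does not state, and run_with_flush is a hypothetical helper name.

```python
def run_with_flush(predict, flush):
    """Call predict(), then always call flush(), even if predict() raises."""
    try:
        return predict()
    finally:
        flush()


# Hypothetical usage with the objects from the setup example:
#
# text = run_with_flush(
#     lambda: "".join(
#         replicate.run("meta/llama-2-70b-chat", input={"prompt": "Hi"})
#     ),
#     respan.flush,
# )
```

Passing the two callables in keeps the helper independent of both SDKs, which also makes it easy to exercise with stand-in functions.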

Configuration

Parameter            Type         Default  Description
api_key              str | None   None     Falls back to RESPAN_API_KEY env var.
base_url             str | None   None     Falls back to RESPAN_BASE_URL env var.
instrumentations     list         []       Plugin instrumentations to activate (e.g. ReplicateInstrumentor()).
customer_identifier  str | None   None     Default customer identifier for all spans.
metadata             dict | None  None     Default metadata attached to all spans.
environment          str | None   None     Environment tag (e.g. "production").
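
As an illustration of how these parameters might be combined, the sketch below derives the environment tag from a deployment variable and leaves api_key and base_url to their environment variable fallbacks. APP_ENV, build_config, and make_respan are hypothetical names chosen for this example; only the Respan() parameters themselves come from the table above.

```python
import os


def build_config(env=None):
    """Return keyword arguments for Respan() based on the deployment environment."""
    env = os.environ if env is None else env
    return {
        # APP_ENV is a hypothetical variable name, not something Respan reads.
        "environment": env.get("APP_ENV", "development"),
        "metadata": {"service": "replicate-api"},
    }


def make_respan(env=None):
    """Create a Respan client; api_key and base_url use their env var fallbacks."""
    # Imported lazily so build_config() works without the packages installed.
    from respan import Respan
    from opentelemetry.instrumentation.replicate import ReplicateInstrumentor

    return Respan(instrumentations=[ReplicateInstrumentor()], **build_config(env))
```

Keeping the keyword arguments in a plain dictionary makes the configuration easy to inspect or log before the client is created.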

Attributes

In Respan()

from respan import Respan
from opentelemetry.instrumentation.replicate import ReplicateInstrumentor

# Defaults set here apply to all spans.
respan = Respan(
    instrumentations=[ReplicateInstrumentor()],
    customer_identifier="user_123",
    metadata={"service": "replicate-api", "version": "1.0.0"},
)

With propagate_attributes

import replicate
from respan import Respan, propagate_attributes
from opentelemetry.instrumentation.replicate import ReplicateInstrumentor

respan = Respan(instrumentations=[ReplicateInstrumentor()])

def handle_request(user_id: str, prompt: str):
    # Attach per-request attributes to the prediction made inside this block.
    with propagate_attributes(
        customer_identifier=user_id,
        thread_identifier="conv_abc_123",
        metadata={"plan": "pro"},
    ):
        output = replicate.run("meta/llama-2-70b-chat", input={"prompt": prompt})
        print("".join(output))

Attribute            Type  Description
customer_identifier  str   Identifies the end user in Respan analytics.
thread_identifier    str   Groups related messages into a conversation.
metadata             dict  Custom key-value pairs. Merged with default metadata.
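
To make "merged with default metadata" concrete, here is a small sketch of a shallow dictionary merge. It illustrates the described behavior and is not Respan's implementation. In particular, the page does not say which side wins when the same key appears in both dictionaries, so the precedence shown here is an assumption, and merge_metadata is a hypothetical name.

```python
def merge_metadata(default, scoped):
    """Shallow-merge two metadata dicts without mutating either one."""
    merged = dict(default or {})
    # Assumption: keys set via propagate_attributes win on conflict.
    merged.update(scoped or {})
    return merged


# Values taken from the two examples above.
default_metadata = {"service": "replicate-api", "version": "1.0.0"}
request_metadata = {"plan": "pro"}
combined = merge_metadata(default_metadata, request_metadata)
```

Under this assumption, combined holds all three keys (service, version, and plan) and both input dictionaries are left unchanged.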