For AI agents: a documentation index is available at the root level at /llms.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
Create a request-log span via the logging API. This is the standard create endpoint; `/api/request-logs/create/` remains supported as a legacy alias. For LLM request logs, send `prompt_messages`, `completion_message`, token counts, timing, metadata, tools, and trace fields directly in the body. `generation_time` is accepted and normalized to `latency`; `ttft` is accepted and normalized to `time_to_first_token`. The stored `environment` is derived from the API key environment, so use a key for the target environment rather than relying on a body override.
Authentication
AuthorizationBearer
Use your Respan API key for Respan API authentication. Enter only the Respan API key value; clients send Authorization: Bearer <RESPAN_API_KEY>. For /api/responses, OpenAI or Azure OpenAI provider credentials go in Settings -> Providers or the request body credential_override field, not in this auth field.
Request
This endpoint expects an object.
modelstringRequired
Model used for the span.
prompt_messageslist of objectsOptional
Chat input messages for the request log.
completion_messageobjectOptional
Assistant message returned by the model.
prompt_tokensintegerOptional
Prompt/input tokens for the request. For Anthropic logs this corresponds to input_tokens before cache-token normalization.
completion_tokensintegerOptional
Completion/output tokens for the request. For Anthropic logs this corresponds to output_tokens.
usageobjectOptional
Provider usage object. Cache token fields such as cache_creation_input_tokens and cache_read_input_tokens are accepted and normalized into Respan cache-token counters.
temperaturedoubleOptionalDefaults to 1
Sampling temperature (0-2). Higher = more random.
top_pdoubleOptionalDefaults to 1
Nucleus sampling parameter.
max_tokensintegerOptional
Maximum tokens to generate.
generation_timedoubleOptional
Accepted alias for total generation latency in seconds. Stored as latency in responses and query results.
ttftdoubleOptional
Accepted alias for time to first token in seconds. Stored as time_to_first_token in responses and query results.
customer_paramsobjectOptional
Extended customer information. customer_identifier inside this object is promoted to the log customer identifier.
metadataobjectOptional
Arbitrary key-value pairs for your reference.
streambooleanOptionalDefaults to false
Whether the response was streamed.
status_codeintegerOptionalDefaults to 200
HTTP status code of the request.
toolslist of objectsOptional
Tools available to the model (OpenAI function calling format).
tool_callslist of objectsOptional
Tool calls returned by the model.
timestampdatetimeOptional
ISO 8601 timestamp when the request completed.
trace_unique_idstringOptional
Trace ID to link spans into a trace tree.
span_namestringOptional
Name of this span within the workflow.
span_parent_idstringOptional
Parent span ID. Builds the trace hierarchy.
span_workflow_namestringOptional
Name of the parent workflow.
custom_identifierstringOptional
Indexed custom identifier for fast querying.
thread_identifierstringOptional
Conversation thread ID for multi-turn conversations.
group_identifierstringOptional
Groups related spans together.
latencydoubleOptional
Total request latency in seconds. generation_time is also accepted and normalizes to this field.
time_to_first_tokendoubleOptional
Time to first token in seconds. ttft is also accepted and normalizes to this field.
log_typeenumOptionalDefaults to chat
Type of span. Determines how input and output are parsed.
inputstring or map from strings to any or list of objectsOptional
Preferred universal input field. For chat spans, send an array of message objects or a JSON string. For non-chat spans, send any string/object/array structure that represents the span input.
outputstring or map from strings to any or list of objectsOptional
Preferred universal output field. For chat spans, send an assistant message object or a JSON string. For non-chat spans, send any string/object/array structure that represents the span output.
messageslist of objectsOptional
Legacy chat input field. Equivalent to prompt_messages; prefer input for new integrations.
costdoubleOptional
Cost in USD. Auto-calculated from model pricing if omitted.
tokens_per_seconddoubleOptional
Generation speed in tokens per second.
propertiesobjectOptional
Typed metadata that preserves native JSON types.
variablesobjectOptional
Variables used for prompt templates.
customer_identifierstringOptional<=254 characters
Identifier for the end user who made this request.
tool_choicestring or map from strings to anyOptional
Controls tool selection. "none", "auto", or a specific tool object.
response_formatobjectOptional
Response format configuration (e.g. JSON mode or structured output).
frequency_penaltydoubleOptional
Penalizes repeated tokens (-2 to 2).
presence_penaltydoubleOptional
Penalizes tokens already present (-2 to 2).
stopstring or list of stringsOptional
Stop sequence or sequences where generation halts.
error_messagestringOptional
Error message if the request failed.
warningsstring or map from strings to anyOptional
Warnings from the request.
statusenumOptional
Request status.
Allowed values:
prompt_idstringOptional
ID of the Respan prompt template used.
prompt_namestringOptional
Name of the prompt template.
is_custom_promptbooleanOptionalDefaults to false
Set true when using a custom prompt_id.
start_timedatetimeOptional
ISO 8601 timestamp when the request started.
full_requestobjectOptional
Full raw request object for reference.
full_responseobjectOptional
Full raw response object from the provider.
prompt_unit_pricedoubleOptional
Custom price per 1M prompt tokens (for self-hosted/fine-tuned models).
completion_unit_pricedoubleOptional
Custom price per 1M completion tokens (for self-hosted/fine-tuned models).
respan_paramsobjectOptional
Preferred namespace for Respan-specific controls such as customer tagging, metadata, prompt loading, cache settings, and logging flags.
keywordsai_paramsobjectOptional
Legacy alias for respan_params. Still accepted and merged into respan_params.
positive_feedbackbooleanOptional
User feedback. true = positive, false = negative.
Response
Span created successfully
idstring
Unique identifier for the span. Alias for unique_id.
unique_idstring
Full unique identifier for the created span.
organization_idstring
Organization identifier associated with the span.
customer_identifierstring
Customer identifier associated with the span.
statusenum
Request status.
costdouble
Computed or supplied request cost in USD.
timestampdatetime
Timestamp when the span was recorded.
environmentstring
Environment derived from the API key used for the log.
latencydouble
Stored total latency in seconds.
time_to_first_tokendouble
Stored time to first token in seconds.
prompt_cache_creation_tokensinteger
Cache creation tokens normalized from usage.
prompt_cache_hit_tokensinteger
Cache read/hit tokens normalized from usage.
Errors
400
Bad Request Error
401
Unauthorized Error
422
Unprocessable Entity Error
429
Too Many Requests Error
500
Internal Server Error
Create a request-log span via the logging API. This is the standard create endpoint; /api/request-logs/create/ remains supported as a legacy alias. For LLM request logs, send prompt_messages, completion_message, token counts, timing, metadata, tools, and trace fields directly in the body. generation_time is accepted and normalized to latency; ttft is accepted and normalized to time_to_first_token. The stored environment is derived from the API key environment, so use a key for the target environment rather than relying on a body override.