List latency / TTFT / TPS quantiles
Returns the p50, p90, p95, and p99 quantiles of latency, time to first token (TTFT), and tokens per second (TPS), bucketed by time_tick.
Authentication
Authorization: Bearer
Authenticate with your Respan API key. Enter only the key value; clients send it as Authorization: Bearer <RESPAN_API_KEY>. For /api/responses, OpenAI or Azure OpenAI provider credentials belong in Settings -> Providers or in the credential_override field of the request body, not in this auth field.
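As a minimal sketch of the header format described above, the snippet below builds the Authorization header from an environment variable. The helper name build_auth_headers, the RESPAN_API_KEY environment variable, and the "sk-example" fallback are illustrative assumptions, not part of the Respan API.

```python
import os


def build_auth_headers(api_key: str) -> dict:
    """Return HTTP headers carrying the Respan API key as a Bearer token."""
    # Only the raw key value is supplied; the "Bearer " prefix is added here.
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Assumption: the key is stored in an environment variable named RESPAN_API_KEY.
# "sk-example" is a placeholder so the sketch runs without a real key.
headers = build_auth_headers(os.environ.get("RESPAN_API_KEY", "sk-example"))
```

Keeping the prefix inside the helper avoids the common mistake of storing "Bearer <key>" as the key value and sending the prefix twice.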
Query parameters
summary_type
Preset time range. Use this or explicit start_time / end_time.
date
Base date used with summary_type presets.
start_time
Optional explicit start time, ISO 8601.
end_time
Optional explicit end time, ISO 8601.
time_tick
Bucket granularity for time-series responses.
Allowed values:
timezone_offset
Timezone offset, in hours, used when resolving preset ranges.
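The query parameters above can be assembled into a query string roughly as follows. This is a sketch only: build_query is a hypothetical helper, and the timestamps are made-up example values. Because the allowed time_tick values are not listed here, the example leaves time_tick unset rather than guessing one.

```python
from urllib.parse import urlencode


def build_query(**params) -> str:
    """URL-encode the given query parameters, skipping any that are None."""
    return urlencode({k: v for k, v in params.items() if v is not None})


# Explicit range instead of a summary_type preset (example timestamps).
query = build_query(
    start_time="2024-01-01T00:00:00Z",
    end_time="2024-01-02T00:00:00Z",
    time_tick=None,        # omitted: allowed values are not documented here
    timezone_offset=0,     # hours; 0 is kept because it is not None
)
```

The resulting string can be appended to the endpoint URL after a "?".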
Request
This endpoint expects an object.
filters
Filter criteria. See Filters API Reference for operator syntax.
metrics_to_aggregate
Metric subset for quantile aggregation.
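A request carrying this body might be prepared as sketched below. Several details are assumptions made for illustration: the endpoint URL is a placeholder, the POST method is inferred from the presence of a request body, metrics_to_aggregate is assumed to be a list of strings, and the metric names "latency" and "ttft" are guessed from the response field prefixes. The filters value is left out because its syntax is defined in the Filters API Reference. The request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical placeholder; substitute the real endpoint URL from the Respan docs.
ENDPOINT_URL = "https://api.example.com/quantiles"


def build_quantiles_request(api_key, filters=None, metrics=None):
    """Build (but do not send) a JSON request for the quantiles endpoint."""
    body = {}
    if filters is not None:
        body["filters"] = filters
    if metrics is not None:
        body["metrics_to_aggregate"] = metrics
    return urllib.request.Request(
        ENDPOINT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",  # assumption: the endpoint accepts POST
    )


# Assumed metric names; check the API reference for the accepted values.
request = build_quantiles_request("sk-example", metrics=["latency", "ttft"])
```

Passing the prepared request to urllib.request.urlopen would send it; that step is omitted here because the real URL is not given.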
Response
Successful response.
date_group
Bucket start time, ISO 8601.
latency_p_50
latency_p_90
latency_p_95
latency_p_99
ttft_p_50
ttft_p_90
ttft_p_95
ttft_p_99
tps_p_50
tps_p_90
tps_p_95
tps_p_99
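To plot or compare the quantiles over time, the per-bucket rows can be pivoted into one series per field. The sketch below assumes the response is a list of objects, each with date_group plus the quantile fields listed above; the exact envelope is not specified here. The sample rows contain made-up values for illustration only, and pivot_quantiles is a hypothetical helper.

```python
from collections import defaultdict

METRICS = ("latency", "ttft", "tps")
QUANTILES = (50, 90, 95, 99)


def pivot_quantiles(rows):
    """Turn per-bucket rows into {field_name: [(date_group, value), ...]}."""
    series = defaultdict(list)
    # Sorting ISO 8601 strings lexicographically is chronological only when
    # all timestamps share the same format and timezone.
    for row in sorted(rows, key=lambda r: r["date_group"]):
        for metric in METRICS:
            for q in QUANTILES:
                key = f"{metric}_p_{q}"
                # Skip fields that are absent or null in this bucket.
                if row.get(key) is not None:
                    series[key].append((row["date_group"], row[key]))
    return dict(series)


# Made-up sample data; real responses will differ.
sample_rows = [
    {"date_group": "2024-01-01T01:00:00Z", "latency_p_50": 0.9, "ttft_p_50": 0.3},
    {"date_group": "2024-01-01T00:00:00Z", "latency_p_50": 0.8, "tps_p_99": None},
]
series = pivot_quantiles(sample_rows)
```

Each entry in series is then a time-ordered list that can be handed directly to a charting library.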