Prompt caching
Use Anthropic prompt caching through the Respan gateway.
Prompt caching stores the model’s intermediate computation state. The model generates diverse responses while saving computational costs, as it doesn’t need to reprocess the entire prompt from scratch.
Only available for Anthropic models through the gateway.
For Respan-managed response caching (storing and reusing exact request/response pairs), see Respan caching.