Limits
Set cost, request, and token limits per API key to control spending and usage.
Limits let you cap spending, request volume, or token usage per API key. Set thresholds that warn you or block requests when exceeded.
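Limit types
A limit can track one of three quantities per API key: cost (spending), request count, or token usage.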
Threshold modes
Each limit policy supports two modes that can be stacked:
Warn — when usage hits the threshold, you get a notification but requests keep flowing. Use this for soft budgets where you want visibility without disrupting production.
Block — when usage hits the threshold, the gateway rejects further requests with a 429 status. Use this for hard spending caps.
You can combine both on a single policy. For example, warn at 100/hour and block at a higher threshold, so you get an alert before requests start being rejected.
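The interaction between the two modes can be sketched as a small decision function. This is a minimal illustration, not the gateway's implementation: the `Thresholds` class and `evaluate` function are hypothetical names, and it assumes that "hits the threshold" means usage greater than or equal to the configured value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thresholds:
    """Hypothetical container for one policy's thresholds (not the gateway's schema)."""
    warn: Optional[float] = None   # notify at or above this usage
    block: Optional[float] = None  # reject at or above this usage


def evaluate(usage: float, t: Thresholds) -> str:
    """Return "block", "warn", or "allow" for the usage accumulated in the current period."""
    # Check block first so a hard cap takes precedence over a warning.
    if t.block is not None and usage >= t.block:
        return "block"   # the gateway would reject the request with a 429
    if t.warn is not None and usage >= t.warn:
        return "warn"    # a notification is sent, the request still goes through
    return "allow"
```

With an illustrative policy of `Thresholds(warn=100, block=200)`, a usage of 120 evaluates to `"warn"` and a usage of 200 evaluates to `"block"`. The value 200 is made up for the example.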
Set up limits
Configure thresholds
Set the time period (hour, day, or month) and define your warn and/or block thresholds.
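If you keep policies in code before entering them, the settings above can be modeled as a small validated record. This is a sketch under assumptions: `LimitPolicy` and its field names are hypothetical, not the gateway's configuration schema, and the rule that the warn threshold should not exceed the block threshold is a design choice made for this example.

```python
from dataclasses import dataclass
from typing import Optional

PERIODS = ("hour", "day", "month")  # the three time periods a limit can use


@dataclass(frozen=True)
class LimitPolicy:
    """Hypothetical policy shape; field names are illustrative only."""
    period: str
    warn: Optional[float] = None
    block: Optional[float] = None

    def __post_init__(self) -> None:
        if self.period not in PERIODS:
            raise ValueError(f"period must be one of {PERIODS}, got {self.period!r}")
        # A policy needs at least one threshold to have any effect.
        if self.warn is None and self.block is None:
            raise ValueError("set a warn threshold, a block threshold, or both")
        for name in ("warn", "block"):
            value = getattr(self, name)
            if value is not None and value <= 0:
                raise ValueError(f"{name} threshold must be positive")
        # Assumption: a warning placed above the hard cap would never fire first.
        if self.warn is not None and self.block is not None and self.warn > self.block:
            raise ValueError("warn threshold should not exceed the block threshold")
```

For instance, `LimitPolicy(period="hour", warn=100, block=200)` is accepted, while `LimitPolicy(period="week", warn=100)` raises a `ValueError` because only hour, day, and month are supported.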
Notifications
Get alerted when limits are triggered:
- Slack — send alerts to a channel via webhook URL
- Email — notify team members directly
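To check a Slack webhook URL before relying on it, you can post a test message yourself. The sketch below builds the request with Python's standard library. Slack incoming webhooks accept a JSON body with a `text` field, but the wording of the message and the `build_slack_alert` helper are assumptions made for this example, and the alerts the gateway actually sends may look different.

```python
import json
import urllib.request


def build_slack_alert(webhook_url: str, key_name: str, mode: str,
                      usage: float, threshold: float, period: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a limit alert for Slack."""
    # Message format is illustrative; it is not the gateway's real alert text.
    text = f"[{mode}] API key '{key_name}' reached {usage:g} of {threshold:g} per {period}"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Pass the returned request to `urllib.request.urlopen(req, timeout=5)` to send it. Building and sending are kept separate so the payload can be inspected without making a network call.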