Rate Limits
Monthly limits · Per-minute limits · Quota rules
MAIG enforces two independent layers of limits: a monthly request quota that resets each billing cycle, and a per-minute rate limit that prevents burst traffic from overwhelming providers. Additional per-user and per-route quota rules are available on Pro and Business plans.
Monthly Request Limits
Each plan includes a monthly request allowance. The counter resets at the start of each billing cycle.
| Plan | Monthly Limit | What happens when exceeded |
|---|---|---|
| Free | 2,500 requests | Hard stop — all subsequent requests return 429 until the next billing cycle begins. |
| Starter | 15,000 requests | Hard stop — all subsequent requests return 429 until the next billing cycle begins. |
| Pro | 60,000 requests | Overage billing — requests beyond the included allowance are billed at the per-request overage rate for your plan. |
| Business | 300,000 requests | Overage billing — requests beyond the included allowance are billed at the per-request overage rate for your plan. |
Per-Minute Rate Limiting
All plans enforce a global rate limit of 100 requests per minute per API key. This limit applies across all routes in a project. Requests that exceed this limit receive an immediate 429 response and are not queued.
Per-route rate limiting (Pro+)
On Pro and Business plans, you can configure independent rate limits per route in the dashboard. Per-route limits support both minute and hour windows and can be set to any value up to your plan's global limit. This is useful when you want to allocate different throughput budgets to different features or route tiers.
Per-route limits are configured in Dashboard → Routes → [select route] → Rate Limiting.
Quota Rules (Pro+)
Quota rules let you cap request volume for individual end-users or for specific routes, independently from the global plan limits. Quota rules are available on Pro and Business plans.
Per-user quotas
Set a maximum number of requests per day, week, or month for each unique user value passed in the request body. Use case: limit each of your end-users to a fixed number of AI requests per day, preventing a single user from consuming a disproportionate share of your monthly allowance.
Per-route quotas
Apply a daily, weekly, or monthly request cap to a specific route, regardless of which user made the request. Use case: protect a high-cost route (e.g. one backed by an expensive model) from unbounded traffic without rate-limiting your entire project.
Quota rules are configured in Dashboard → Routes → [select route] → Quotas. When a quota is exceeded, the gateway returns a 429 response with a message indicating which quota was triggered.
Rate Limit Response
When any rate limit or quota is exceeded, the gateway responds with HTTP 429 and the following JSON body:
{
"error": {
"message": "Rate limit exceeded",
"type": "rate_limit_error",
"code": 429
}
}
Check the Retry-After response header when it is present — it indicates the number of seconds to wait before retrying.
Best Practices
- Use MAIG's built-in retry — per-route retry can be enabled in the dashboard. The gateway performs exponential backoff with up to 5 retries before returning a
502. This handles transient provider errors without any client-side logic. - Implement client-side backoff — when your client receives a
429, wait before retrying. A simple strategy: wait2^attemptseconds (1s, 2s, 4s, 8s) with a small random jitter to avoid thundering herd. - Monitor your usage — the dashboard's Usage tab shows request counts by day and by route. Set up email alerts in dashboard settings to be notified before you hit your monthly limit.
- Upgrade if limits are consistently hit — if your application regularly hits the monthly limit, upgrading to a higher plan or enabling overage billing (Pro+) will prevent hard stops from affecting your users.
- Use per-user quotas to protect your plan — if you expose AI features to end-users, configure per-user quotas so a single user cannot exhaust your monthly allowance.