AI API Rate Limit Calculator
Will your workload hit a 429? Pick a provider, tier, and model, enter your tokens per request, and get your real maximum requests per minute (the lower of the RPM and token caps), the limit that binds, and how long a batch will take. Covers OpenAI, Claude, and Gemini.
How it works
Every LLM provider enforces more than one rate limit at once, and a request is rejected with HTTP 429 the moment it would cross any of them. Your real ceiling is therefore not the headline requests-per-minute (RPM) number — it is the lower of the request cap and the token cap, once you account for how many tokens each request actually carries.
Let i be the average input tokens per request, o the average output tokens, and t = i + o the total. The calculator reads the published caps for your provider, tier, and model and applies the documented logic:
- OpenAI / Gemini: tokenRpm = floor(TPM / t)
- Anthropic: inputRpm = floor(ITPM / i)
- Anthropic: outputRpm = floor(OTPM / o)
OpenAI and Gemini meter a single combined tokens-per-minute (TPM) pool, so the token-bound rate is the TPM divided by total tokens per request. Anthropic is different: it meters input and output tokens in separate buckets, input-tokens-per-minute (ITPM) and output-tokens-per-minute (OTPM), and the output bucket is much smaller. That is why a high-output Claude job throttles long before a naive “TPM ÷ total tokens” guess predicts.
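The two token rules above can be sketched in a few lines of Python. This is an illustration only, not the calculator's actual source: the function names are made up for this example, the caps are hypothetical round numbers rather than any provider's published limits, and the sketch assumes every request carries at least one input and one output token.

```python
def token_rpm(tpm: int, input_tokens: int, output_tokens: int) -> int:
    """Combined-pool rule (OpenAI / Gemini): floor(TPM / t), with t = i + o."""
    total = input_tokens + output_tokens
    if total <= 0:
        raise ValueError("a request must carry at least one token")
    return tpm // total


def split_bucket_rpm(
    itpm: int, otpm: int, input_tokens: int, output_tokens: int
) -> tuple[int, int]:
    """Separate-bucket rule (Anthropic): (floor(ITPM / i), floor(OTPM / o))."""
    if input_tokens <= 0 or output_tokens <= 0:
        # Assumption of this sketch: both counts are positive.
        raise ValueError("input and output token counts must be positive")
    return itpm // input_tokens, otpm // output_tokens


# Hypothetical caps, chosen only to make the arithmetic easy to follow.
print(token_rpm(30_000, 1_000, 500))                 # 20
print(split_bucket_rpm(50_000, 10_000, 1_000, 500))  # (50, 20)
```

With these made-up numbers, a naive pooled guess for the split-bucket case would be (50,000 + 10,000) / 1,500 = 40 requests per minute, while the output bucket alone allows only 20.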
The effective ceiling is then the smallest applicable cap:
- OpenAI / Gemini: rpmEff = min(RPM, tokenRpm)
- Anthropic: rpmEff = min(RPM, inputRpm, outputRpm)
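Taking the minimum and remembering which cap produced it might look like the following Python sketch. The function name and the dictionary layout are assumptions made for illustration, and the numbers are hypothetical, not published limits.

```python
def effective_rpm(caps: dict[str, int]) -> tuple[int, str]:
    """Return (rpmEff, name of the binding limit) for a set of per-minute caps.

    `caps` maps a label such as "RPM" or "tokenRpm" to a requests-per-minute
    value. On a tie, the first label in insertion order is reported.
    """
    if not caps:
        raise ValueError("at least one cap is required")
    binding = min(caps, key=lambda name: caps[name])
    return caps[binding], binding


# Hypothetical values for a combined-pool provider (OpenAI / Gemini style).
print(effective_rpm({"RPM": 500, "tokenRpm": 20}))
# (20, 'tokenRpm')

# Hypothetical values for a split-bucket provider (Anthropic style).
print(effective_rpm({"RPM": 50, "inputRpm": 50, "outputRpm": 20}))
# (20, 'outputRpm')
```

Passing the caps as a labelled mapping keeps the binding-limit name attached to its value, so the same function serves both the two-term and the three-term minimum.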
The term that produced the minimum is the binding limit reported back to you. From there, a batch of N requests needs ceil(N / rpmEff) minutes of paced sending, rendered as days, hours, and minutes. A throughput target simply checks whether your desired requests per minute sits under the ceiling and reports the headroom or overage. Where a provider also publishes a requests-per-day (RPD) cap, which is common on Gemini free tiers, the tool surfaces it, because RPD binds regardless of how slowly you pace. Each effective ceiling is cross-checked by an independent feasibility test: the workload must be achievable at that rate and impossible at one request per minute higher.
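The batch estimate, the throughput check, and the feasibility cross-check can be sketched in Python as follows. Everything here is an assumption made for illustration: the function names, the "Xd Yh Zm" output format, and the sample numbers are hypothetical, and the feasibility test is shown only for the combined-pool (RPM plus TPM) case.

```python
def batch_minutes(n_requests: int, rpm_eff: int) -> int:
    """ceil(N / rpmEff), using integer arithmetic to avoid float rounding."""
    if rpm_eff <= 0:
        raise ValueError("rpm_eff must be positive")
    return -(-n_requests // rpm_eff)


def format_duration(total_minutes: int) -> str:
    """Render minutes as days, hours, and minutes."""
    days, rem = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(rem, 60)
    return f"{days}d {hours}h {minutes}m"


def headroom(target_rpm: int, rpm_eff: int) -> int:
    """Positive result is headroom; negative result is overage."""
    return rpm_eff - target_rpm


def feasible(rate: int, rpm: int, tpm: int, tokens_per_request: int) -> bool:
    """A rate is feasible if it breaks neither the request cap nor the token cap."""
    return rate <= rpm and rate * tokens_per_request <= tpm


def is_true_ceiling(rate: int, rpm: int, tpm: int, tokens_per_request: int) -> bool:
    """Achievable at `rate`, impossible at one request per minute higher."""
    return feasible(rate, rpm, tpm, tokens_per_request) and not feasible(
        rate + 1, rpm, tpm, tokens_per_request
    )


# Hypothetical workload: 10,000 requests at an effective ceiling of 20 per minute.
minutes = batch_minutes(10_000, 20)
print(minutes, format_duration(minutes))        # 500 0d 8h 20m
print(headroom(15, 20))                         # 5
print(is_true_ceiling(20, 500, 30_000, 1_500))  # True
```

The feasibility check deliberately recomputes from the raw caps instead of reusing the floor-and-min result, which is what makes it an independent test of the ceiling.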
Sources & references
- OpenAI — Rate limits guide (RPM / TPM / RPD per usage tier)
- OpenAI — Usage tiers and qualification thresholds
- Anthropic — Rate limits (RPM plus separate ITPM and OTPM buckets)
- Google — Gemini API rate limits (RPM, TPM, and RPD per tier)
The limit tables were transcribed from the official documentation above and last verified on 2026-06-05. They are reviewed each quarter and whenever a provider announces a change. Providers grant per-account overrides on request, so your dashboard is always the final word. The tool runs entirely in your browser — no inputs leave your device.
Found a stale limit, a bug, or want another provider added?
Email me at [email protected] — most fixes ship within 24 hours.