How much does the OpenAI API cost per 1,000 tokens?

OpenAI quotes prices per 1,000,000 tokens, so divide by 1,000 for a per-1,000 figure. GPT-4o is $2.50 input / $10.00 output per 1M, i.e. $0.0025 input / $0.010 output per 1,000 tokens. GPT-4o mini is far cheaper at $0.00015 / $0.0006 per 1,000. Input and output are billed separately — output is the dominant cost.

How do I calculate the cost of a GPT-4 API call?

Multiply input tokens by the input price and output tokens by the output price, each divided by 1,000,000. For a GPT-4o call with 1,000 input + 500 output tokens: (1,000 ÷ 1e6 × $2.50) + (500 ÷ 1e6 × $10.00) = $0.0025 + $0.005 = $0.0075 per call. This tool does that for every model and multiplies by your monthly request volume.
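The per-call arithmetic above can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's actual source: the function name `call_cost` and its parameter names are hypothetical, and the prices are the GPT-4o list prices quoted in this answer.

```python
def call_cost(input_tokens: int, output_tokens: int,
              price_in_per_m: float, price_out_per_m: float) -> float:
    """Cost in USD of one API call, given prices in USD per 1,000,000 tokens."""
    input_cost = input_tokens / 1_000_000 * price_in_per_m
    output_cost = output_tokens / 1_000_000 * price_out_per_m
    return input_cost + output_cost


# GPT-4o list prices quoted above: $2.50 input / $10.00 output per 1M tokens.
cost = call_cost(1_000, 500, 2.50, 10.00)
print(f"${cost:.4f} per call")  # $0.0075 per call
```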

Is the Claude API cheaper than GPT?

It depends on the tier. Claude Haiku 4.5 ($1 / $5 per 1M) is cheaper than GPT-4o ($2.50 / $10) but still several times pricier than GPT-4o mini ($0.15 / $0.60). Claude Sonnet 4.6 ($3 / $15) is pricier than GPT-4o on both input and output, though it may need fewer retries. The comparison table here sorts every model cheapest-first for your exact token mix, so you can see which wins for your workload.
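A cheapest-first ranking for a given token mix could look roughly like the sketch below. It assumes only the four list prices quoted in this FAQ (a snapshot, not live data); `PRICES` and `rank_models` are hypothetical names, not the tool's own.

```python
# USD per 1M tokens as (input, output), taken from the figures quoted in this FAQ.
PRICES = {
    "GPT-4o": (2.50, 10.00),
    "GPT-4o mini": (0.15, 0.60),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def rank_models(input_tokens, output_tokens, prices=PRICES):
    """Return (model, cost_per_request_usd) pairs sorted cheapest first."""
    costs = {
        name: (input_tokens * p_in + output_tokens * p_out) / 1_000_000
        for name, (p_in, p_out) in prices.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# 1,000 input + 500 output tokens per request.
for name, cost in rank_models(1_000, 500):
    print(f"{name:<18} ${cost:.5f}")
```

With this mix the order is GPT-4o mini, Claude Haiku 4.5, GPT-4o, Claude Sonnet 4.6. A more output-heavy or input-heavy mix can change the gaps between models, which is why the ranking is recomputed per workload.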

How much does it cost to run an LLM app per month?

Multiply per-request cost by requests per month. A support bot doing 10,000 requests/month at 1,000 input + 500 output tokens costs about $105/month (≈ Rs 31,500) on Claude Sonnet 4.6, or a fraction of that on Haiku 4.5 or Gemini Flash. Enter your real numbers above to see the monthly figure in USD and LKR for every model.

How much can I save with the batch API or prompt caching?

The batch (asynchronous) API on OpenAI, Anthropic, and Google discounts both input and output by 50% — best for non-real-time jobs. Prompt caching bills the reused prompt prefix at a fraction of input price: Anthropic cache-read is 0.1×, OpenAI cached input is 0.25–0.5×, Google context cache ≈ 0.25×. Toggle both above to see the effect on your bill.

Why are output tokens more expensive than input?

The model generates output tokens one at a time, each requiring a full forward pass, while input tokens are read in a single batched pass. Every model in this table prices output 3–5× higher than input. That is why output length, not prompt length, usually drives your bill — and why trimming verbose completions saves more than trimming prompts.

Are these prices live or up to date?

No — this is a calculator over a static price table, not a live feed. Prices are vendor list prices snapshotted on 2026-06-30. Always confirm against the official pricing page before committing a budget; vendors change tiers without notice. Links to every source are listed below.

How do I convert the cost to Sri Lankan rupees?

The tool multiplies the USD total by a USD→LKR rate you can edit (pre-filled at Rs 300, a CBSL indicative value). For an accurate LKR figure, set the rate to today's CBSL indicative rate or your card provider's rate, which usually adds a 1–3% margin on top.
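As a rough sketch of that conversion, the snippet below multiplies a USD total by an editable rate and optionally applies a card margin. The function name `usd_to_lkr` is hypothetical, and applying the margin as a multiplier on top of the rate is an assumption about how a card provider's markup works, not something the tool is confirmed to do.

```python
def usd_to_lkr(usd_total: float, rate: float = 300.0, card_margin: float = 0.0) -> float:
    """Convert a USD total to LKR.

    rate is the USD->LKR rate (Rs 300 is the tool's pre-filled value);
    card_margin is a fraction, so 0.02 means a 2% markup (assumed multiplicative).
    """
    return usd_total * rate * (1 + card_margin)


print(usd_to_lkr(105.00))                               # 31500.0
print(round(usd_to_lkr(105.00, card_margin=0.02), 2))   # 32130.0
```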

Does this include fine-tuning, image, or audio pricing?

No. This v1 covers standard text input/output chat-completion pricing only. Fine-tuning, image generation, audio, embeddings, and realtime have separate per-unit prices and their own dedicated calculators. For exact byte-pair token counts rather than estimates, use a dedicated token counter.

AI API Cost Calculator — OpenAI, Claude & Gemini

Work out the monthly bill before you ship an LLM feature. Pick a model, enter your tokens and request volume, and see the cost per request and per month in USD and Sri Lankan rupees — with a cheapest-first table of every model and optional batch and cache discounts. No signup, sources cited below.

By Induwara Ashinsana, Executive Director, Ryzera Technologies · Updated Jun 30, 2026

Estimate your monthly API bill · USD & LKR

List prices · verified Jun 2026

Model

Prices shown are input/output USD per 1M tokens.

Requests per month

req

How many API calls your app makes in a month.

Input tokens per request

tok

Average prompt size (system + user + context).

Output tokens per request

tok

Average completion length the model returns.

USD → LKR rate

CBSL indicative rate — edit to today's value.

Monthly cost (USD)

$105.00

Claude Sonnet 4.6

Monthly cost (LKR)

Rs 31,500

Per 1,000 requests

$10.50

$0.0105 per request

Blended rate

$7.00/1M

Across this input/output mix
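The blended rate shown here appears to be total cost divided by total tokens, scaled back to a per-1M figure. That definition is inferred from the displayed numbers ($105.00 over 15M tokens gives $7.00/1M), not taken from the tool's source, and `blended_rate` is a hypothetical name.

```python
def blended_rate(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """Average USD per 1M tokens across a given input/output mix."""
    total_cost = (input_tokens * price_in_per_m
                  + output_tokens * price_out_per_m) / 1_000_000
    total_tokens = input_tokens + output_tokens
    return total_cost / total_tokens * 1_000_000


# Claude Sonnet 4.6 at $3 / $15 per 1M, with 1,000 input + 500 output tokens.
print(f"${blended_rate(1_000, 500, 3.00, 15.00):.2f}/1M")  # $7.00/1M
```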

Where the money goes

Input tokens: $30.00 (28.57%)

Output tokens: $75.00 (71.43%)

Monthly total: $105.00

Output tokens usually dominate the bill — they are priced 3–5× higher than input on every model here. Cache discounts only touch the input term; the one-time cache-write surcharge is excluded.

Compare every model (cheapest first)

Switching to the cheapest saves $102.50/mo (Rs 30,750)

Model	Monthly USD	Monthly LKR	vs selected
OpenAI	$2.50	Rs 750	-97.62%
Google	$3.00	Rs 900	-97.14%
OpenAI	$4.50	Rs 1,350	-95.71%
Meta (Together AI)	$4.75	Rs 1,425	-95.48%
Mistral	$5.00	Rs 1,500	-95.24%
Meta (Together AI)	$6.95	Rs 2,085	-93.38%
DeepSeek	$8.20	Rs 2,460	-92.19%
OpenAI	$12.50	Rs 3,750	-88.10%
Google	$15.50	Rs 4,650	-85.24%
DeepSeek	$16.45	Rs 4,935	-84.33%
OpenAI	$33.00	Rs 9,900	-68.57%
Anthropic	$35.00	Rs 10,500	-66.67%
Mistral	$50.00	Rs 15,000	-52.38%
OpenAI	$62.50	Rs 18,750	-40.48%
Google	$62.50	Rs 18,750	-40.48%
OpenAI	$75.00	Rs 22,500	-28.57%
Anthropic	$105.00	Rs 31,500	—
Anthropic	$175.00	Rs 52,500	+66.67%
OpenAI	$450.00	Rs 135,000	+328.57%

Tap any row to recompute the headline for that model. Batch and cache toggles apply to every row that supports them.
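The "vs selected" column is consistent with a plain percentage difference against the selected model's monthly cost. The sketch below reproduces a few rows of the table under that assumption; `pct_vs_selected` is a hypothetical helper name.

```python
def pct_vs_selected(candidate_usd: float, selected_usd: float) -> float:
    """Percent difference of a candidate's monthly cost relative to the selected model."""
    return (candidate_usd - selected_usd) / selected_usd * 100


selected = 105.00  # Claude Sonnet 4.6 in the example above
for monthly in (2.50, 35.00, 175.00, 450.00):
    print(f"${monthly:>7.2f}  {pct_vs_selected(monthly, selected):+.2f}%")

# Savings from switching to the cheapest row.
print(f"${selected - 2.50:.2f}/mo")  # $102.50/mo
```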

Prices are vendor list prices snapshotted on 2026-06-30 — not a live feed.

How it works

Large-language-model APIs price input (prompt) tokens and output (completion) tokens separately, quoted in US dollars per 1,000,000 tokens. Output is always more expensive — 3–5× on every model in this table — because the model runs a full forward pass to generate each output token, whereas input tokens are read in one batched pass. That single fact is why your bill is usually driven by how much the model writes, not how much you send.

For a chosen model with input price Pin and output price Pout (USD per 1M tokens), the calculator runs:

Monthly token volumes: inTokens = inputPerReq × requests and outTokens = outputPerReq × requests.
Base cost: inputCost = inTokens ÷ 1e6 × Pin and outputCost = outTokens ÷ 1e6 × Pout.
Cached input (if enabled): the cached fraction c of input tokens is billed at the provider's cache-read multiplier m instead of full price, so inputCost = inTokens ÷ 1e6 × Pin × ((1 − c) + c × m). Anthropic cache-read is 0.1×; OpenAI cached input is 0.25–0.5× by family; Google context-cache read ≈ 0.25×.
Batch (if enabled): batch-eligible providers discount both input and output by 50% (× 0.5), applied after the cache adjustment.
Monthly total = inputCost + outputCost; per request = total ÷ requests; LKR = USD × your editable CBSL rate.
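The steps above can be sketched as a single Python function. This is an illustrative reconstruction from the formulas in this section, not the tool's actual source: the name `monthly_cost`, its parameter names, and the keys of the returned dictionary are all assumptions.

```python
def monthly_cost(requests, input_per_req, output_per_req,
                 price_in, price_out,
                 cached_fraction=0.0, cache_read_multiplier=1.0,
                 batch=False, lkr_rate=300.0):
    """Monthly cost breakdown; prices are USD per 1M tokens."""
    # Monthly token volumes.
    in_tokens = input_per_req * requests
    out_tokens = output_per_req * requests

    # Base cost.
    input_cost = in_tokens / 1e6 * price_in
    output_cost = out_tokens / 1e6 * price_out

    # Cached input: fraction c is billed at multiplier m, the rest at full price.
    input_cost *= (1 - cached_fraction) + cached_fraction * cache_read_multiplier

    # Batch: 50% off both terms, applied after the cache adjustment.
    if batch:
        input_cost *= 0.5
        output_cost *= 0.5

    total = input_cost + output_cost
    return {
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total": total,
        "per_request": total / requests if requests else 0.0,
        "lkr": total * lkr_rate,
    }


sonnet = monthly_cost(10_000, 1_000, 500, price_in=3.00, price_out=15.00)
print(f"${sonnet['total']:.2f}")  # $105.00

cached = monthly_cost(10_000, 1_000, 500, 3.00, 15.00,
                      cached_fraction=0.8, cache_read_multiplier=0.1)
print(f"${cached['total']:.2f}")  # $83.40
```

As in the tool, the cache adjustment only touches the input term, and the one-time cache-write surcharge is left out.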

Every per-1M figure in the underlying data module carries an inline source URL, and the math is reconciled against a second independent per-1,000-token formula so the arithmetic can't drift. The comparison table reruns this for all models on every input change and sorts cheapest-first, so a lower-cost substitute is always one glance away.
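One way such a reconciliation might work is to compute the same cost twice, once per 1M tokens and once per 1,000 tokens, and fail loudly if the two disagree. The sketch below is a guess at the idea, not the tool's actual check; all three function names and the tolerance values are assumptions.

```python
import math


def cost_per_million(tokens: float, price_per_m: float) -> float:
    """Primary formula: tokens / 1e6 x price per 1M."""
    return tokens / 1_000_000 * price_per_m


def cost_per_thousand(tokens: float, price_per_m: float) -> float:
    """Independent formula: convert the price to per-1,000 first, then scale."""
    price_per_k = price_per_m / 1_000
    return tokens / 1_000 * price_per_k


def reconciled_cost(tokens: float, price_per_m: float) -> float:
    """Return the cost only if both formulas agree within floating-point tolerance."""
    a = cost_per_million(tokens, price_per_m)
    b = cost_per_thousand(tokens, price_per_m)
    if not math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12):
        raise ValueError(f"formulas disagree: {a} vs {b}")
    return a


print(f"${reconciled_cost(10_000_000, 3.00):.2f}")  # $30.00
```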

Worked examples

Support bot on Claude Sonnet 4.6

1,000 in · 500 out · 10,000 req/mo · no discounts

inTokens = 1,000 × 10,000 = 10,000,000 → 10 × $3 = $30.00
outTokens = 500 × 10,000 = 5,000,000 → 5 × $15 = $75.00
Monthly = $30.00 + $75.00 = $105.00
Per request = $0.0105 ; per 1,000 = $10.50
At Rs 300/USD → Rs 31,500.00 / month

High-volume batch job on Claude Haiku 4.5

2,000 in · 800 out · 50,000 req/mo · batch on

inTokens = 2,000 × 50,000 = 100,000,000 → 100 × $1 = $100.00
outTokens = 800 × 50,000 = 40,000,000 → 40 × $5 = $200.00
Subtotal = $300.00 ; batch −50% → $150.00 / month
Per request = $0.003
At Rs 300/USD → Rs 45,000.00 / month

Cached-prompt RAG app on Claude Sonnet 4.6

1,000 in (80% cached) · 500 out · 10,000 req/mo

cacheFactor = (1 − 0.8) + 0.8 × 0.1 = 0.28
inputCost = 10 × $3 × 0.28 = $8.40 (down from $30.00)
outputCost unchanged = $75.00
Monthly = $8.40 + $75.00 = $83.40
Caching cuts the input term about 3.6× ($30.00 → $8.40); output still dominates.

Frequently asked questions

Sources & references

Prices were last cross-checked against these sources on 2026-06-30. This tool is a calculator over a snapshotted price table, not a live feed — confirm the current rate on the official page before committing a budget.

Related tools

Reasoning Token Cost Calc

Estimate the true cost of reasoning-model API calls by accounting for the hidden reasoning/thinking tokens that o-series, Claude, and Gemini bill at output rates. See per-call and monthly cost in USD and LKR, plus how much more it is than a naive estimate.

AI Vision Token Calculator

Calculate how many tokens an image costs on GPT-4o, GPT-4o mini, Claude, and Gemini from its pixel dimensions — plus the per-image and total cost in USD and LKR, side by side. Runs entirely in your browser; the image is never uploaded.

Prompt Caching Calculator

Calculate how much prompt caching saves on your LLM API bill. Compare cost with vs without caching, the dollar savings, and the break-even point for Claude, OpenAI, and Gemini using each provider's official cache-write and cache-read multipliers.

Spotted a stale price, edge case, or want another model added?

Email me at [email protected] — most fixes ship within 24 hours.