Are reasoning tokens billed as input or output tokens?

As output tokens. OpenAI (o-series), Anthropic (Claude extended/adaptive thinking), and Google (Gemini thinking) all bill the hidden reasoning tokens at their model's output rate — the higher of the two rates. That is why a reasoning call costs far more than the prompt and visible answer alone suggest.

Why is my o1 or o3 API bill so much higher than expected?

Because each call silently generates reasoning tokens you never see, and they are charged at the output rate. A request with a 1,500-token prompt and a 500-token answer can burn 25,000 reasoning tokens at high effort — turning a ~$0.05 naive estimate into over $1.50 per call. Set effort and reasoning tokens in this calculator to see the real figure.

Do you get charged for hidden reasoning tokens you can't see?

Yes. You rarely see the full reasoning trace in the response (OpenAI returns at most a summary; Anthropic returns thinking blocks), but the tokens are still counted and billed. OpenAI reports them in usage.completion_tokens_details.reasoning_tokens; with Anthropic they are part of output_tokens.

How many reasoning tokens does high effort use versus low effort?

It varies by task, but as a planning rule this tool seeds Low ≈ 1,000, Medium ≈ 4,000, High ≈ 12,000, and Max ≈ 25,000 reasoning tokens. Hard maths, multi-step agents, and long code refactors push toward the high end. These are estimates — always confirm against your real usage value.

Are Claude extended thinking tokens billed at the output rate?

Yes. Anthropic's extended and adaptive thinking tokens are part of the output token count and billed at the model's output rate (for Claude Opus 4.8 that is $25 per million). To find the reasoning portion, subtract your visible answer tokens from the reported output_tokens.

How do I find my real reasoning token count?

With OpenAI, read usage.completion_tokens_details.reasoning_tokens from the API response. With Anthropic, the thinking tokens are inside output_tokens, so subtract the token count of the visible answer. With Gemini, the thinking token count is reported as thoughtsTokenCount in the response's usageMetadata. Paste that number into the Reasoning tokens field for an exact cost.
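As a rough sketch of the lookup described above, the helpers below pull the reasoning count out of a usage payload that has already been parsed into a plain dict. The field names for OpenAI and Anthropic come from the answer above; the function names and the sample payloads are hypothetical, and the visible-answer token count for Anthropic is assumed to come from your own token count of the returned text.

```python
def openai_reasoning_tokens(usage: dict) -> int:
    """Read usage.completion_tokens_details.reasoning_tokens, defaulting to 0."""
    details = usage.get("completion_tokens_details") or {}
    return int(details.get("reasoning_tokens", 0))


def anthropic_reasoning_tokens(usage: dict, visible_answer_tokens: int) -> int:
    """Thinking tokens sit inside output_tokens, so subtract the visible part.

    visible_answer_tokens is an assumption: you have to count the tokens of the
    visible answer yourself, so the result is an estimate.
    """
    output_tokens = int(usage.get("output_tokens", 0))
    return max(output_tokens - visible_answer_tokens, 0)


# Hypothetical payloads shaped like the usage fields named above.
sample_openai_usage = {
    "prompt_tokens": 1_500,
    "completion_tokens": 25_500,
    "completion_tokens_details": {"reasoning_tokens": 25_000},
}
sample_anthropic_usage = {"input_tokens": 2_000, "output_tokens": 9_000}

openai_count = openai_reasoning_tokens(sample_openai_usage)
anthropic_count = anthropic_reasoning_tokens(sample_anthropic_usage, visible_answer_tokens=1_000)
print(openai_count, anthropic_count)
```

Either number can then be entered in the Reasoning tokens field in place of the effort-based estimate.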

How is the cost in Sri Lanka Rupees calculated?

The tool prices everything in USD, then multiplies by an editable USD→LKR rate (default 305) for the rupee figures. Set the rate to today's mid-market rate from the Central Bank of Sri Lanka for an exact conversion.
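The conversion itself is a single multiplication. The snippet below is only an illustration of that step, with a hypothetical function name and the default rate of 305 taken from the answer above.

```python
DEFAULT_USD_TO_LKR = 305.0  # the tool's editable default, not a live rate


def usd_to_lkr(amount_usd: float, rate: float = DEFAULT_USD_TO_LKR) -> float:
    """Convert a USD amount to Sri Lankan Rupees at the given rate."""
    if rate <= 0:
        raise ValueError("exchange rate must be positive")
    return amount_usd * rate


monthly_usd = 235.0  # monthly figure from the Claude coding-agent example
monthly_lkr = usd_to_lkr(monthly_usd)
print(f"Rs {monthly_lkr:,.0f}")
```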

AI Reasoning Token Cost Calculator

Find the real cost of a reasoning-model API call. This tool adds the hidden reasoning (“thinking”) tokens that o-series, Claude, and Gemini bill at the output rate — so you see the true per-call and monthly bill, and how much more it is than a naive token estimate. No signup, sources cited below.

By Induwara Ashinsana, Executive Director, Ryzera Technologies · Updated Jun 6, 2026

Reasoning token cost

Anthropic-verified rate

Provider & model

Currency

Reasoning effort

Input (prompt) tokens

Your prompt, system message, tools, and context.

Visible output tokens

The answer you actually see in the response.

Reasoning tokens

Estimate from "Medium" effort — edit to your real value.

Calls per month

How many of these calls you make in a month.

Example workloads

True cost / call

$0.135

incl. 4,000 reasoning tokens

Naive estimate / call

$0.035

ignores reasoning tokens

Vs naive estimate

3.9×

more than a plain token counter

Monthly cost

$135.00

1,000 calls/mo

What you pay for per call

Input: $0.01 (7%)
Reasoning: $0.10 (74%)
Visible output: $0.025 (19%)

Per-call cost breakdown

Component	Tokens	Rate	Cost
Input tokens	2,000	$5/1M	$0.01
Reasoning tokens (billed as output)	4,000	$25/1M	$0.10
Visible output tokens	1,000	$25/1M	$0.025
True cost per call			$0.135

Reasoning tokens add $100.00 per month at 1,000 calls — money a naive estimate never shows. Excludes prompt caching, batch discounts, and per-tenant contract pricing.
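The percentage split and the monthly reasoning surcharge can be reproduced with a few lines. This is a sketch under the table's own figures ($5 and $25 per million, 2,000 / 4,000 / 1,000 tokens, 1,000 calls); the function and variable names are illustrative, not the tool's actual code.

```python
def cost_shares(input_cost: float, reason_cost: float, output_cost: float) -> dict:
    """Return each component's share of the per-call cost as a whole percent."""
    total = input_cost + reason_cost + output_cost
    parts = {"input": input_cost, "reasoning": reason_cost, "visible": output_cost}
    if total == 0:
        return {name: 0 for name in parts}
    return {name: round(100 * cost / total) for name, cost in parts.items()}


# Figures from the per-call breakdown table.
input_cost = 2_000 * 5.0 / 1_000_000      # $0.01
reason_cost = 4_000 * 25.0 / 1_000_000    # $0.10
output_cost = 1_000 * 25.0 / 1_000_000    # $0.025

shares = cost_shares(input_cost, reason_cost, output_cost)
monthly_reasoning_usd = reason_cost * 1_000  # extra spend a naive estimate misses
print(shares, f"${monthly_reasoning_usd:.2f}")
```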

Cost curve of thinking harder

Effort	Reasoning tokens	Cost / call	Monthly
Low	1,000	$0.06	$60.00
Medium (selected)	4,000	$0.135	$135.00
High	12,000	$0.335	$335.00
Max	25,000	$0.66	$660.00

Same prompt and visible output; only the reasoning-token estimate changes. These are planning estimates — read your real reasoning_tokens usage value for exact figures.
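A minimal sketch of how such a curve could be generated: hold the prompt and visible output fixed and vary only the seeded reasoning count. The seed values are the planning estimates quoted on this page; the names `EFFORT_SEEDS` and `cost_curve` are hypothetical.

```python
# Planning estimates from this page, not measured values.
EFFORT_SEEDS = {"Low": 1_000, "Medium": 4_000, "High": 12_000, "Max": 25_000}


def cost_curve(input_tokens, visible_tokens, input_per_million,
               output_per_million, calls_per_month):
    """Return (effort, reasoning_tokens, cost_per_call, monthly_cost) rows."""
    rows = []
    for effort, reasoning_tokens in EFFORT_SEEDS.items():
        # Reasoning and visible tokens are both billed at the output rate.
        per_call = (
            input_tokens * input_per_million
            + (reasoning_tokens + visible_tokens) * output_per_million
        ) / 1_000_000
        rows.append((effort, reasoning_tokens, per_call, per_call * calls_per_month))
    return rows


curve = cost_curve(2_000, 1_000, 5.0, 25.0, 1_000)
for effort, reasoning_tokens, per_call, monthly in curve:
    print(f"{effort:<6} {reasoning_tokens:>6,} ${per_call:.3f} ${monthly:,.2f}")
```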

Reasoning (“thinking”) tokens are billed at the model's output rate by Anthropic, OpenAI, and Google. Anthropic rates are cross-checked against the Anthropic pricing page; OpenAI and Gemini rows are published list prices that change without notice. Rates last verified 2026-06-06. Full sources are listed below the calculator.

How it works

Reasoning models think before they answer. That thinking is a stream of tokens the model generates internally: reasoning tokens in OpenAI's terminology, thinking tokens in Anthropic's and Google's. You usually never see them, because the API discards or summarises the trace, but every provider counts and bills them, and crucially they are billed at the model's output rate, not the cheaper input rate.

A naive cost estimate prices only what you can point at: the prompt you sent and the answer you received. The real bill adds a third, invisible line. The calculator above uses this model:

Convert the published per-million rates to per-token prices: P_in = input$/1M ÷ 1,000,000 and P_out = output$/1M ÷ 1,000,000.
inputCost = inputTokens × P_in
reasonCost = reasoningTokens × P_out (the hidden cost)
outputCost = visibleTokens × P_out
costPerCall = inputCost + reasonCost + outputCost
naivePerCall = inputCost + outputCost
multiplier = costPerCall ÷ naivePerCall
monthly = costPerCall × callsPerMonth

Because reasoning tokens and visible tokens are both billed at P_out, the per-call cost can also be written inputCost + (reasoningTokens + visibleTokens) × P_out. The tool computes the cost both ways and they agree to the cent, a built-in cross-check on the math. Effort levels (Low, Medium, High, Max) seed a typical reasoning-token estimate; reasoning counts are non-deterministic, so the field is editable and the methodology is a planning aid, not a guarantee. For an exact figure, read your real reasoning_tokens value from the API response.
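The cost model described above can be sketched in a few lines of Python. This is an illustration of the stated formulas rather than the calculator's own source; `CallCost` and `price_call` are hypothetical names, and the sample figures are the Claude coding-agent inputs used in the worked examples.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CallCost:
    input_cost: float
    reason_cost: float   # the hidden line a naive estimate omits
    output_cost: float

    @property
    def cost_per_call(self) -> float:
        return self.input_cost + self.reason_cost + self.output_cost

    @property
    def naive_per_call(self) -> float:
        return self.input_cost + self.output_cost


def price_call(input_tokens, reasoning_tokens, visible_tokens,
               input_per_million, output_per_million) -> CallCost:
    """Price one call; reasoning tokens are charged at the output rate."""
    p_in = input_per_million / 1_000_000
    p_out = output_per_million / 1_000_000
    return CallCost(
        input_cost=input_tokens * p_in,
        reason_cost=reasoning_tokens * p_out,
        output_cost=visible_tokens * p_out,
    )


cost = price_call(2_000, 8_000, 1_000, input_per_million=5.0, output_per_million=25.0)

# Cross-check: the combined form inputCost + (reasoning + visible) × P_out.
combined = 2_000 * 5.0 / 1_000_000 + (8_000 + 1_000) * 25.0 / 1_000_000
assert math.isclose(cost.cost_per_call, combined)

monthly = cost.cost_per_call * 1_000  # 1,000 calls per month
print(f"per call ${cost.cost_per_call:.3f}, naive ${cost.naive_per_call:.3f}, "
      f"monthly ${monthly:,.2f}")
```

Comparing with `math.isclose` rather than `==` is a deliberate choice here, since the two forms add floating-point terms in a different order.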

Worked examples

Claude Opus 4.8 — coding agent, adaptive thinking

Input $5/1M, output $25/1M · I=2,000 R=8,000 V=1,000 · 1,000 calls/mo

inputCost = 2,000 × $0.000005 = $0.010
reasonCost = 8,000 × $0.000025 = $0.200
outputCost = 1,000 × $0.000025 = $0.025
costPerCall = $0.235 vs naive = $0.035
multiplier = 0.235 ÷ 0.035 = 6.7×
monthly = $0.235 × 1,000 = $235 (≈ Rs 71,675 at 305)

OpenAI o1 — high reasoning effort

Input $15/1M, output $60/1M · I=1,500 R=25,000 V=500 · 1,000 calls/mo

inputCost = 1,500 × $0.000015 = $0.0225
reasonCost = 25,000 × $0.000060 = $1.500
outputCost = 500 × $0.000060 = $0.030
costPerCall = $1.5525 vs naive = $0.0525
multiplier = 1.5525 ÷ 0.0525 = 29.6×
monthly = $1.5525 × 1,000 = $1,552.50 — the '30× my estimate' case

Edge case — reasoning with no visible output

Claude Sonnet 4.6, output $15/1M · I=0 R=5,000 V=0 · 1 call

inputCost = 0
reasonCost = 5,000 × $0.000015 = $0.075
outputCost = 0
costPerCall = $0.075 vs naive = $0.000
A naive counter shows $0, but you are billed $0.075.
The multiplier is infinite, so the tool shows 'n/a'.
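One way to handle this edge case, shown as a hedged sketch: return no multiplier when the naive estimate is zero and let the display layer print 'n/a'. The function names are hypothetical; the figures are taken from the worked examples above.

```python
from typing import Optional


def multiplier(cost_per_call: float, naive_per_call: float) -> Optional[float]:
    """Ratio of true cost to naive cost, or None when the naive cost is zero."""
    if naive_per_call <= 0:
        return None  # dividing by zero would be infinite or undefined
    return cost_per_call / naive_per_call


def format_multiplier(value: Optional[float]) -> str:
    """Render the ratio to one decimal place, or 'n/a' when it is undefined."""
    return "n/a" if value is None else f"{value:.1f}×"


edge_case = format_multiplier(multiplier(0.075, 0.0))      # reasoning-only call
o1_case = format_multiplier(multiplier(1.5525, 0.0525))    # OpenAI o1 example
claude_case = format_multiplier(multiplier(0.235, 0.035))  # Claude agent example
print(edge_case, o1_case, claude_case)
```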

Frequently asked questions

Sources & references

Anthropic rates were cross-checked against the Anthropic pricing page on 2026-06-06. OpenAI and Gemini rows are published list prices that change without notice — re-check them against the linked source before relying on a figure. No live API calls are made; pricing is a static, dated table.

Related tools

AI Chatbot Cost Calculator

Estimate the monthly API cost of a multi-turn AI chatbot across Claude, GPT, and Gemini. Models the quadratic context re-sending that single-call calculators miss, with and without prompt caching, in USD and LKR.

AI API Cost Calculator

Estimate the monthly and per-request USD and LKR bill for any major LLM API. Pick a model, enter input/output tokens and requests per month, and compare every model cheapest-first — with optional 50% batch and cached-input discounts.

AI Agent Cost Calculator

Estimate the real per-run, daily, and monthly cost of a multi-step LLM agent across Claude, GPT, and Gemini. Models the context accumulation single-call calculators miss — each tool result is re-sent every step — with caching, in USD and LKR.
