AI Chatbot Conversation Cost Calculator
Estimate what a multi-turn AI chatbot really costs to run. Because every turn re-sends the whole conversation, cost grows with the square of conversation length — this tool models that correctly and prices the same workload across Claude, GPT, and Gemini, in dollars and rupees.
How it works
A single API call is easy to price: tokens in × input rate, plus tokens out × output rate. A conversation is the trap. Chat models are stateless, so to remember what was said your app re-sends the system prompt and the entire transcript on every single turn. The cost of one conversation is therefore far higher than the cost of one message, and most online “API cost” calculators ignore this entirely.
Let s be the system-prompt tokens, u the average user-message tokens, a the average AI-reply tokens, and N the number of turns. At turn t the request carries the system prompt, the whole transcript so far, and the new user message:
inputTokens(t) = s + (t − 1)·(u + a) + u
Summing every turn of one conversation gives the closed form the calculator uses:
- total input = N·s + N·u + (u + a)·N·(N − 1)/2
- total output = N·a
The N·(N − 1)/2 term is the quadratic blow-up — the reason a longer conversation costs disproportionately more. The calculator cross-checks this closed form against an explicit turn-by-turn summation so the two methods always agree to the token.
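The cross-check described above can be sketched in a few lines of Python. This is an illustration of the two formulas, not the calculator's actual source; the function names are made up for the example.

```python
def input_tokens_at_turn(s: int, u: int, a: int, t: int) -> int:
    """Input tokens sent on turn t (1-indexed):
    system prompt + transcript so far + the new user message."""
    return s + (t - 1) * (u + a) + u


def total_input_closed_form(s: int, u: int, a: int, n: int) -> int:
    """Closed form: N*s + N*u + (u + a)*N*(N - 1)/2."""
    # N*(N - 1) is always even, so the integer division is exact.
    return n * s + n * u + (u + a) * n * (n - 1) // 2


def total_input_by_summation(s: int, u: int, a: int, n: int) -> int:
    """Explicit turn-by-turn sum, used only to verify the closed form."""
    return sum(input_tokens_at_turn(s, u, a, t) for t in range(1, n + 1))


def total_output(a: int, n: int) -> int:
    """Output tokens grow linearly: one reply per turn."""
    return n * a


# The two methods should agree to the token for any conversation length.
for n in range(0, 50):
    assert total_input_closed_form(500, 80, 200, n) == total_input_by_summation(500, 80, 200, n)
```

Integer arithmetic is used throughout so that the comparison is exact rather than subject to floating-point rounding.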
Cost per conversation is then (totalInput/1e6)·inputPrice + (totalOutput/1e6)·outputPrice, and the monthly bill is that figure times your conversation volume. Rupee figures multiply the dollar cost by an editable CBSL indicative exchange rate.
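As a minimal sketch of that pricing step, the Python below turns token totals into a per-conversation cost, a monthly bill, and a rupee figure. The function names are illustrative, and the two prices are hypothetical placeholders chosen for round numbers, not any provider's published rate. The token totals correspond to an 8-turn conversation with a 500-token system prompt, 80-token user messages, and 200-token replies.

```python
def conversation_cost_usd(total_input: int, total_output: int,
                          input_price: float, output_price: float) -> float:
    """Prices are in US dollars per million tokens."""
    return (total_input / 1e6) * input_price + (total_output / 1e6) * output_price


def monthly_cost_usd(cost_per_conversation: float, conversations_per_month: int) -> float:
    return cost_per_conversation * conversations_per_month


def to_rupees(usd: float, lkr_per_usd: float) -> float:
    """Convert with an editable indicative exchange rate."""
    return usd * lkr_per_usd


# HYPOTHETICAL rates for illustration only (USD per million tokens).
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00

per_conversation = conversation_cost_usd(12_480, 1_600, INPUT_PRICE, OUTPUT_PRICE)
monthly = monthly_cost_usd(per_conversation, 3_000)
monthly_lkr = to_rupees(monthly, 305.0)

print(f"per conversation: ${per_conversation:.5f}")
print(f"per month:        ${monthly:.2f}")
print(f"per month:        Rs {monthly_lkr:,.2f}")
```

With these placeholder rates the sketch comes to about $0.061 per conversation and about $184 a month; substituting a provider's real rates changes the figures but not the method.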
Prompt caching (the toggle) re-prices the stable prefix. The system prompt and prior transcript are served as cache reads at roughly 10% of the input price; only each turn's new user message is billed at full price, plus a one-time cache write at 1.25× input for newly added content. That is Anthropic's published model; OpenAI and Gemini cache automatically and bill reads at their own published cached-input rate, with no separate write fee. The result is an estimate: real savings depend on the 5-minute cache window and how quickly users reply. Claude rates are authoritative; GPT and Gemini rates are transcribed from the official pricing pages and dated below.
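One way to read the caching model is sketched below in Python. It is an interpretation, not the calculator's code. In particular, the accounting of write tokens (each piece of the prefix is written once, and the final turn's content is never written because it is never re-read) is an assumption made for this example, as are the function names and the placeholder input price.

```python
def uncached_input_cost_usd(s: int, u: int, a: int, n: int, input_price: float) -> float:
    """Every input token billed at the full input price (USD per million tokens)."""
    total_input = n * s + n * u + (u + a) * n * (n - 1) // 2
    return total_input * input_price / 1e6


def cached_input_cost_usd(s: int, u: int, a: int, n: int, input_price: float,
                          read_multiplier: float = 0.10,
                          write_multiplier: float = 1.25) -> float:
    """Input cost with prompt caching, under the assumptions stated above."""
    # Stable prefix re-sent each turn: system prompt + prior transcript (cache reads).
    prefix_reads = n * s + (u + a) * n * (n - 1) // 2
    # Each turn's new user message is billed at the full input price.
    fresh = n * u
    # ASSUMPTION: the system prompt and every completed turn except the last
    # are written to the cache exactly once.
    writes = s + (n - 1) * (u + a) if n > 0 else 0
    per_token = input_price / 1e6
    return per_token * (prefix_reads * read_multiplier + fresh + writes * write_multiplier)


# HYPOTHETICAL input price, USD per million tokens.
INPUT_PRICE = 3.00
plain = uncached_input_cost_usd(500, 80, 200, 8, INPUT_PRICE)
cached = cached_input_cost_usd(500, 80, 200, 8, INPUT_PRICE)
print(f"uncached input: ${plain:.5f}   cached input: ${cached:.5f}")
```

For a provider with no separate write fee, the same function can be called with `write_multiplier=0` and a `read_multiplier` equal to that provider's cached-input rate divided by its input rate. Output tokens are unaffected by caching and are priced as before.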
Worked examples
Shared workload: a 500-token system prompt, 80-token user messages, 200-token replies, 8 turns per conversation, 3,000 conversations a month, at Rs 305 per US dollar.
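The token side of this shared workload can be worked through directly. The short Python sketch below applies the per-turn formula to these inputs; the variable names are illustrative. Dollar and rupee figures additionally depend on each provider's rates, which are not reproduced in this sketch.

```python
S, U, A, N = 500, 80, 200, 8          # system prompt, user message, reply, turns
CONVERSATIONS_PER_MONTH = 3_000

# Input tokens sent on each turn: s + (t - 1)*(u + a) + u
per_turn_input = [S + (t - 1) * (U + A) + U for t in range(1, N + 1)]
total_input = sum(per_turn_input)
total_output = N * A

print(per_turn_input)                          # [580, 860, 1140, 1420, 1700, 1980, 2260, 2540]
print(total_input, total_output)               # 12480 1600
print(total_input * CONVERSATIONS_PER_MONTH)   # 37440000 input tokens a month
print(total_output * CONVERSATIONS_PER_MONTH)  # 4800000 output tokens a month
```

The eighth turn alone sends more than four times the input of the first, which is why the per-conversation total is far above eight times the cost of a single message.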
Sources & references
- Anthropic — Claude API pricing (authoritative for Claude rates)
- Anthropic — Prompt caching (cache read 0.1× / write 1.25× of input)
- OpenAI — API pricing (GPT models)
- Google — Gemini API pricing
- Central Bank of Sri Lanka — indicative exchange rates (USD→LKR)
Claude rates are authoritative. GPT and Gemini rates were transcribed from the official pricing pages above and last verified on 2026-06-05; they are reviewed each quarter and whenever a provider announces a price change. The tool runs entirely in your browser — no inputs leave your device.