How much does the OpenAI Realtime API cost per minute?

On gpt-realtime, audio input costs $32 and audio output $64 per 1M tokens. OpenAI bills 1 token per 100 ms of input audio (600 tokens/minute) and 1 token per 50 ms of output audio (1,200 tokens/minute), so one minute of input audio costs about $0.0192 and one minute of output audio about $0.0768. One minute of user speech plus one minute of assistant speech therefore comes to about $0.096, before text tokens and caching. gpt-4o-mini-realtime costs roughly a third of that.
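The per-minute figures can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses only the rates and tokenisation constants quoted in this answer; the constant and function names are illustrative and not part of any OpenAI SDK.

```python
# Rates quoted above for gpt-realtime (USD per 1M audio tokens).
AUDIO_IN_PER_M = 32.00
AUDIO_OUT_PER_M = 64.00

# Tokenisation constants quoted above: 1 token per 100 ms in, 1 per 50 ms out.
TOKENS_PER_MIN_IN = 60_000 // 100   # 600 tokens per minute
TOKENS_PER_MIN_OUT = 60_000 // 50   # 1,200 tokens per minute


def per_minute_usd(tokens_per_min: int, usd_per_million: float) -> float:
    """Cost of one minute of audio at the given token density and rate."""
    return tokens_per_min / 1_000_000 * usd_per_million


in_min = per_minute_usd(TOKENS_PER_MIN_IN, AUDIO_IN_PER_M)
out_min = per_minute_usd(TOKENS_PER_MIN_OUT, AUDIO_OUT_PER_M)

print(f"input  ${in_min:.4f}/min")
print(f"output ${out_min:.4f}/min")
print(f"one minute each way ${in_min + out_min:.4f}")
```

Real conversations rarely split evenly between the two directions, so treat the combined figure as a rough yardstick rather than a quote.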

Why is audio output more expensive than audio input?

Generating speech is more compute-intensive than transcribing it, and providers price it accordingly. On gpt-realtime, audio output is $64/1M versus $32/1M for input, double the rate. Output audio is also tokenised at twice the density (1 token/50 ms vs 1 token/100 ms), so a minute of the model's speech costs about four times a minute of the user's ($0.0768 vs $0.0192) and usually dominates the bill.

What are cached audio tokens and how do they cut cost?

When a session re-sends context the provider has already processed, that audio can be served from cache. OpenAI bills cached audio input at about $0.40/1M instead of $32/1M, roughly a 99% discount on the cached portion. Move the cached-share slider to model it. Gemini's native-audio Live API publishes no cached-audio rate, so this calculator bills that share at the full input rate and flags it.
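To see how the cached share moves the audio-input line, the blended cost can be sketched as below. It assumes the two gpt-realtime rates quoted in this answer; `audio_input_cost` and the 50,000-token figure are hypothetical, chosen only for illustration.

```python
FULL_INPUT_PER_M = 32.00    # gpt-realtime audio input, USD per 1M tokens
CACHED_INPUT_PER_M = 0.40   # gpt-realtime cached audio input, USD per 1M tokens


def audio_input_cost(total_tokens: int, cached_share: float) -> float:
    """Price audio input when `cached_share` (0.0 to 1.0) is served from cache."""
    if not 0.0 <= cached_share <= 1.0:
        raise ValueError("cached_share must be between 0 and 1")
    cached = total_tokens * cached_share
    uncached = total_tokens - cached
    return (uncached * FULL_INPUT_PER_M + cached * CACHED_INPUT_PER_M) / 1_000_000


# Discount that applies to each cached token relative to the full rate.
discount = 1 - CACHED_INPUT_PER_M / FULL_INPUT_PER_M

print(f"discount on cached tokens: {discount:.2%}")
print(f"no cache:    ${audio_input_cost(50_000, 0.0):.4f}")
print(f"half cached: ${audio_input_cost(50_000, 0.5):.4f}")
```

Caching only touches the input side, so even a high cached share leaves the audio-output line unchanged.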

Is gpt-4o-mini-realtime cheaper than gpt-realtime for voice agents?

Yes — substantially. gpt-4o-mini-realtime prices audio input at $10/1M and output at $20/1M, against $32 and $64 for gpt-realtime. For the same workload it costs roughly a third. The trade-off is reasoning quality and voice fidelity, so it suits short, scripted queries more than open-ended conversation. Use the model-comparison strip to price your exact workload on all three.
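The "roughly a third" figure can be checked against a concrete workload. The sketch below assumes the calculator's default session (1,800 audio-input, 2,400 audio-output and 1,000 text tokens) and the per-1M rates quoted on this page; the `Rates` class and `session_cost` helper are illustrative names only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens for each billed class (no cached audio in this sketch)."""
    audio_in: float
    audio_out: float
    text_in: float


# Rates as quoted on this page; re-check the official pricing pages before relying on them.
GPT_REALTIME = Rates(audio_in=32.00, audio_out=64.00, text_in=4.00)
GPT_4O_MINI_REALTIME = Rates(audio_in=10.00, audio_out=20.00, text_in=0.60)


def session_cost(r: Rates, audio_in: int, audio_out: int, text_in: int) -> float:
    """Sum the three line costs for one session, in USD."""
    return (audio_in * r.audio_in + audio_out * r.audio_out + text_in * r.text_in) / 1_000_000


workload = {"audio_in": 1_800, "audio_out": 2_400, "text_in": 1_000}
full = session_cost(GPT_REALTIME, **workload)
mini = session_cost(GPT_4O_MINI_REALTIME, **workload)

print(f"gpt-realtime         ${full:.4f}")
print(f"gpt-4o-mini-realtime ${mini:.4f}")
print(f"mini / full ratio    {mini / full:.2f}")
```

The ratio shifts slightly with the mix of audio and text, because the text rates differ by a larger factor than the audio rates.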

How do I estimate the monthly cost of a voice assistant?

Estimate the audio minutes a typical session uses in each direction, the number of sessions per month, and how much of the input is cached. The calculator converts minutes to tokens with each provider's documented rate, prices the four token classes, multiplies by sessions, and projects the year. For an exact figure, switch to token mode and paste the counts from your provider's usage dashboard.

Does this tool make a live call to OpenAI or Google?

No. Rates are pinned constants stored in the page's source with a last-verified date, and every calculation runs in your browser. Nothing — no usage figures, no API key — leaves the page. Because AI prices change, the rates are re-checked against the official pricing pages each quarter; the last pass was on 2026-06-07.

Is the minutes-to-tokens conversion exact?

It uses each provider's documented tokenisation rate (OpenAI: 1 token/100 ms input, 1 token/50 ms output; Gemini: 25 tokens/second), so it is a close estimate, not a per-account guarantee. Real sessions vary with silence, barge-in and context re-sending. If you know your exact token counts, the token-input mode gives a cent-exact figure.

How is this different from a normal LLM API cost calculator?

A text LLM bill has input and output tokens. A speech-to-speech Realtime bill meters four separate classes — audio input, cached audio input, audio output and text input — each on its own price tier, with audio output by far the dearest. Generic calculators that only model text-in and text-out understate a voice-agent bill, often by 5–10×. This tool prices all four classes.

Realtime Voice API Cost Calculator — OpenAI & Gemini Live

Price a speech-to-speech voice agent on the OpenAI Realtime API or Gemini Live API. Models the four billed token classes — audio in, cached audio, audio out, text — and projects per-session, monthly and annual cost in USD and LKR. No signup, rates cited.

By Induwara Ashinsana, Executive Director, Ryzera Technologies · Updated Jun 7, 2026

Realtime voice API cost

Provider & model

Audio input / session (minutes)

How long the user speaks per session.

Audio output / session (minutes)

How long the assistant speaks back per session.

Cached audio input share: 0%

Text input / session (tokens)

System prompt & instruction tokens.

Sessions per month

Calls or conversations per month.

USD → LKR rate

CBSL indicative. Edit to match your bank.

Workload presets

Per session

$0.2152

Rs 65

Monthly (1,000 sessions)

$215.20

Rs 64,560

Annual projection

$2,582.40

Rs 774,720

Per-session cost breakdown

Token class	Tokens	$/1M	Line cost
Audio input	1,800	$32.00	$0.0576
Cached audio input	0	$0.40	$0.0000
Audio output	2,400	$64.00	$0.1536
Text input	1,000	$4.00	$0.0040
Per-session total			$0.2152

Same workload, every model — monthly USD

Gemini 2.5 Flash (Live API, native audio) Cheapest

$50.00/mo · Rs 15,000

OpenAI gpt-4o-mini-realtime

$66.60/mo · Rs 19,980

OpenAI gpt-realtime

$215.20/mo · Rs 64,560

All math runs in your browser. No usage data or API key leaves the page.

Sources cited

OpenAI gpt-realtime: https://openai.com/api/pricing/
OpenAI gpt-4o-mini-realtime: https://openai.com/api/pricing/
Gemini 2.5 Flash (Live API, native audio): https://ai.google.dev/gemini-api/docs/pricing

Audio tokenisation: OpenAI bills 1 token per 100 ms of input audio and 1 token per 50 ms of output audio; Gemini bills 25 tokens per second. Minutes-mode figures use these documented constants and are estimates — switch to token mode for cent-exact costs from your usage dashboard.

How it works

A normal LLM API bill has two lines: input tokens and output tokens. A speech-to-speech agent on the Realtime API is different — the provider meters four token classes, each on its own price tier:

Audio input — the user's speech, tokenised.
Cached audio input — re-sent context served from cache at a steep discount.
Audio output — the model's spoken reply, the most expensive class.
Text input — system prompt and instruction tokens.

The total per session is the sum of each class priced per 1,000,000 tokens:

cost = audioIn/1e6·rateIn + cachedIn/1e6·rateCached + audioOut/1e6·rateOut + textIn/1e6·rateText
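The formula translates directly into code. The following is a minimal sketch, not the page's actual source: it prices each token class separately so the line items can be inspected, using the gpt-realtime rates and the support-call token counts that appear elsewhere on this page. The dictionary keys and the `line_costs` helper are assumed names.

```python
from typing import Dict

# USD per 1M tokens for gpt-realtime, as quoted on this page.
RATES = {"audio_in": 32.00, "cached_in": 0.40, "audio_out": 64.00, "text_in": 4.00}


def line_costs(tokens: Dict[str, int], rates: Dict[str, float]) -> Dict[str, float]:
    """Return the USD line cost of each token class: tokens / 1e6 * rate."""
    return {cls: tokens.get(cls, 0) / 1_000_000 * rate for cls, rate in rates.items()}


# Token counts from the support-call example on this page.
call = {"audio_in": 50_000, "cached_in": 10_000, "audio_out": 40_000, "text_in": 2_000}
lines = line_costs(call, RATES)
total = sum(lines.values())

for cls, usd in lines.items():
    print(f"{cls:<10} ${usd:.4f}")
print(f"{'total':<10} ${total:.4f}")
```

Keeping the line items separate, rather than returning only the sum, makes it easy to see which class drives the bill.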

When you enter minutes, audio tokens are derived first using each provider's documented tokenisation rate. OpenAI bills 1 token per 100 ms of input audio (600 tokens/minute) and 1 token per 50 ms of output audio (1,200 tokens/minute); Gemini bills 25 tokens per second (1,500 tokens/minute). Cached audio is carved out of the total audio input by the cached-share slider: cachedIn = totalAudioIn × share, and only the remainder, audioIn = totalAudioIn × (1 − share), is billed at the full input rate.
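The minutes-to-tokens step, including the cached carve-out, might look like the sketch below. It assumes Gemini's 25 tokens/second applies in both directions, which the text implies but does not state outright; the provider keys and the `minutes_to_tokens` function are hypothetical names.

```python
from typing import Dict

# Tokens per minute of audio, from the tokenisation rates described above.
# Assumption: Gemini's 25 tokens/second is used for both input and output.
TOKENS_PER_MINUTE = {
    "openai": {"in": 600, "out": 1_200},
    "gemini": {"in": 1_500, "out": 1_500},
}


def minutes_to_tokens(provider: str, minutes_in: float, minutes_out: float,
                      cached_share: float = 0.0) -> Dict[str, int]:
    """Convert audio minutes to billed token counts, splitting out the cached share."""
    if not 0.0 <= cached_share <= 1.0:
        raise ValueError("cached_share must be between 0 and 1")
    tpm = TOKENS_PER_MINUTE[provider]
    total_in = round(minutes_in * tpm["in"])
    cached_in = round(total_in * cached_share)
    return {
        "audio_in": total_in - cached_in,   # billed at the full input rate
        "cached_in": cached_in,             # billed at the cached rate
        "audio_out": round(minutes_out * tpm["out"]),
    }


print(minutes_to_tokens("openai", 3, 2, cached_share=0.25))
print(minutes_to_tokens("gemini", 3, 2))
```

Rounding to whole tokens is a choice made for this sketch; the page does not say how fractional tokens are handled.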

Monthly and annual figures are linear: monthly = perSession × sessions and annual = monthly × 12. LKR amounts multiply the USD result by your editable USD→LKR rate. Every per-token rate is pinned from the official OpenAI and Google pricing pages and carries a last-verified date (2026-06-07); nothing is fetched at runtime. To check the math, the page derives audio cost a second way — straight from the per-minute unit rate — and confirms it matches the token pipeline to the cent.
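The projection and the two-way check described above can be sketched as follows. This is an illustration under stated assumptions rather than the page's actual code: it uses gpt-realtime audio only (no cache, no text), 1,000 sessions per month and Rs 300/USD, and the function names are invented for the example.

```python
from typing import Dict

AUDIO_IN_PER_MIN = 0.0192    # USD, gpt-realtime: 600 tokens/min at $32/1M
AUDIO_OUT_PER_MIN = 0.0768   # USD, gpt-realtime: 1,200 tokens/min at $64/1M


def via_tokens(min_in: float, min_out: float) -> float:
    """Audio cost through the token pipeline: minutes -> tokens -> USD."""
    tokens_in = min_in * 600
    tokens_out = min_out * 1_200
    return tokens_in / 1_000_000 * 32.00 + tokens_out / 1_000_000 * 64.00


def via_unit_rate(min_in: float, min_out: float) -> float:
    """Audio cost straight from the per-minute unit rates."""
    return min_in * AUDIO_IN_PER_MIN + min_out * AUDIO_OUT_PER_MIN


def project(per_session_usd: float, sessions_per_month: int, usd_to_lkr: float) -> Dict[str, float]:
    """Linear projection: monthly = per session x sessions, annual = monthly x 12."""
    monthly = per_session_usd * sessions_per_month
    annual = monthly * 12
    return {
        "monthly_usd": monthly,
        "annual_usd": annual,
        "monthly_lkr": monthly * usd_to_lkr,
        "annual_lkr": annual * usd_to_lkr,
    }


per_session = via_tokens(3, 2)
# The two derivations should agree to the cent.
matches = abs(per_session - via_unit_rate(3, 2)) < 0.005
totals = project(per_session, sessions_per_month=1_000, usd_to_lkr=300.0)

print(f"per session ${per_session:.4f} (cross-check ok: {matches})")
print(f"monthly ${totals['monthly_usd']:,.2f} / Rs {totals['monthly_lkr']:,.0f}")
print(f"annual  ${totals['annual_usd']:,.2f} / Rs {totals['annual_lkr']:,.0f}")
```

Because the projection is linear, it does not capture volume discounts or month-to-month variation in session length.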

Worked examples

gpt-realtime · one support call (token mode)

Audio in 50,000 · cached 10,000 · audio out 40,000 · text 2,000

Audio in: 50,000 / 1e6 × $32.00 = $1.6000
Cached in: 10,000 / 1e6 × $0.40 = $0.0040
Audio out: 40,000 / 1e6 × $64.00 = $2.5600
Text in: 2,000 / 1e6 × $4.00 = $0.0080
Per session = $4.1720 → at Rs 300/USD = Rs 1,251.60

gpt-4o-mini-realtime · 1,000 short queries / month

Per query: audio in 1,500 · audio out 1,200 · text 500 · no cache

Audio in: 1,500 / 1e6 × $10.00 = $0.01500
Audio out: 1,200 / 1e6 × $20.00 = $0.02400
Text in: 500 / 1e6 × $0.60 = $0.00030
Per query = $0.03930 → × 1,000 = $39.30/month
At Rs 300/USD: Rs 11,790/month · annual $471.60 / Rs 141,480

gpt-realtime · minutes mode (cross-check)

3 min audio in + 2 min audio out, no cache, no text

Audio in: 3 × 600 = 1,800 tok → 1,800 / 1e6 × $32 = $0.0576
Audio out: 2 × 1,200 = 2,400 tok → 2,400 / 1e6 × $64 = $0.1536
Per session = $0.2112
Unit-rate check: $0.0192/min × 3 + $0.0768/min × 2 = $0.2112 ✓

Frequently asked questions

Sources & references

Per-token rates were last cross-checked against the official OpenAI and Google pricing pages on 2026-06-07. AI prices change often; the rates are reviewed quarterly and after any provider pricing update.
