induwara.lk

AI Text-to-Speech (TTS) API Comparison

Compare 18 hosted text-to-speech APIs — ElevenLabs, OpenAI, Google Cloud, Azure, Amazon Polly, PlayHT, Murf, Cartesia and Deepgram Aura — by price per character, voice cloning, streaming latency, language coverage and published naturalness. Enter your monthly volume and rank them by cost. Every figure cites the vendor source.

By Induwara Ashinsana · Updated Jun 21, 2026
Compare text-to-speech APIs: 18 models · 9 vendors

Providers to compare: 4 of 6 selected (minimum 2).

Monthly text volume: 500,000 characters per month.

  • Cheapest: Neural2, $0.00/mo
  • Best quality (Elo): Multilingual v2, Elo 1290
  • Lowest latency: none published
  • Most languages: Neural2, 50 languages

Projected monthly cost (cheapest first)

  1. Neural2 (Google): $0.00/mo at $16.00/1M chars, with 1,000,000 free chars/mo. 50 languages. Badges: Cheapest, Best free tier.
  2. tts-1 (OpenAI): $7.50/mo at $15.00/1M chars. 50 languages.
  3. Neural (Amazon): $8.00/mo at $16.00/1M chars.
  4. Multilingual v2 (ElevenLabs): $45.00/mo at $90.00/1M chars (eff). Elo 1290. Professional cloning.

Feature matrix

| Provider | $/1M chars | Cloning | Elo |
|---|---|---|---|
| Neural2 (Google) | $16.00 | No | 1120 |
| tts-1 (OpenAI) | $15.00 | No | 1150 |
| Neural (Amazon) | $16.00 | No | 1100 |
| Multilingual v2 (ElevenLabs) | $90.00 (eff) | Professional | 1290 |

“eff” marks an effective rate derived from credit or per-minute pricing — confirm against your own plan. Latency is the vendor-marketed streaming time-to-first-byte where published; “—” means not published. Elo is a community-published naturalness standing, indicative and benchmark-dependent — not a guarantee for your script or language.

Per-provider notes

  • Google Neural2: Current-generation Google neural voices with 1M free chars/mo. Strong multilingual coverage and SSML. (pricing)
  • OpenAI tts-1: The standard OpenAI voice model: low latency, cheap, no SSML or cloning. Good default for app voiceovers. (pricing)
  • Amazon Neural: Polly's neural voices: natural, full SSML, tight S3/Lambda integration. No standing free tier here. (pricing)
  • ElevenLabs Multilingual v2: Top of most naturalness leaderboards, with instant and professional voice cloning. Priciest per character. (pricing)
Static comparison — no text sent anywhere, no API key, no logging.

Picking a provider here sends nothing to any vendor and generates no audio. Rates are dated constants reviewed manually; confirm the current price on the linked pricing page before you commit. One word ≈ 6 characters and one spoken minute ≈ 900 characters are documented assumptions for the Words/Minutes units. LKR figures use a single indicative rate of Rs 300 per USD — not a live exchange rate.

How it works

Choosing a text-to-speech (TTS) provider is a multi-axis decision: price per character, voice naturalness, whether you need voice cloning, whether you need low-latency streaming for a live voice agent or just batch rendering for narration, language coverage, SSML support, and the commercial-use licence. This page lays all of those out for the 18 models that developers and creators most often shortlist, drawn from 9 vendors, and ranks them by what they would actually cost at your volume.

1. The cost formula

Almost every TTS provider bills per character of input text. The tool normalises your volume to characters and applies any standing free tier:

monthly_cost = max(0, characters − free_tier) ÷ 1,000,000 × usd_per_million

If you enter words or minutes instead of characters, they are converted first: one word ≈ 6 characters (≈5 letters plus a space), and one spoken minute ≈ 900 characters (≈150 words a minute × 6). The data module cross-checks every figure a second way — via the per-1,000-character rate — so the two routes must agree to the millionth of a dollar before the page will build.
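The formula and the unit conversion above can be sketched in a few lines of Python. This is a minimal illustration, not the page's actual data module: the function names (`to_characters`, `monthly_cost`, `checked_cost`) are hypothetical, and the cross-check is one plausible reading of "the two routes must agree to the millionth of a dollar".

```python
# Documented assumptions from the page: 1 word ~ 6 chars, 1 minute ~ 900 chars.
CHARS_PER_WORD = 6
CHARS_PER_MINUTE = 900


def to_characters(amount: float, unit: str = "characters") -> float:
    """Normalise a volume in characters, words or minutes to characters."""
    factors = {"characters": 1, "words": CHARS_PER_WORD, "minutes": CHARS_PER_MINUTE}
    return max(0.0, amount) * factors[unit]


def monthly_cost(characters: float, usd_per_million: float, free_tier: float = 0) -> float:
    """monthly_cost = max(0, characters - free_tier) / 1,000,000 * usd_per_million"""
    billable = max(0.0, characters - free_tier)
    return billable / 1_000_000 * usd_per_million


def monthly_cost_via_thousands(characters: float, usd_per_million: float, free_tier: float = 0) -> float:
    """Second route: bill per 1,000 characters at the per-1,000 rate."""
    billable = max(0.0, characters - free_tier)
    return billable / 1_000 * (usd_per_million / 1_000)


def checked_cost(characters: float, usd_per_million: float, free_tier: float = 0) -> float:
    """Return the cost only if both routes agree to a millionth of a dollar."""
    a = monthly_cost(characters, usd_per_million, free_tier)
    b = monthly_cost_via_thousands(characters, usd_per_million, free_tier)
    if abs(a - b) > 1e-6:
        raise ValueError(f"cost routes disagree: {a} vs {b}")
    return a


if __name__ == "__main__":
    # 500,000 characters on a $15.00/1M rate with no free tier.
    print(round(checked_cost(500_000, 15.0), 2))
    # 1,000 spoken minutes (~900,000 chars) inside a 1,000,000-char free tier.
    print(round(checked_cost(to_characters(1_000, "minutes"), 16.0, 1_000_000), 2))
```

Keeping the conversion separate from the billing step means the same cost function serves all three input units.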

2. Per-character, credit and per-minute pricing

The big clouds (OpenAI, Google, Azure, Amazon) publish a clean per-character rate. ElevenLabs, PlayHT, Murf and Cartesia sell credits instead, where the cost per character depends on your plan; those rows show an effective per-million-character rate (marked “eff”) for a representative tier. OpenAI's gpt-4o-mini-tts is priced per audio minute, converted here using the 900-characters-per-minute assumption. Every conversion is documented in the data file and labelled in the table so nothing is hidden.
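As a rough sketch of how an "eff" rate could be derived, the two helpers below convert per-minute and credit-plan pricing to dollars per million characters. The function names are hypothetical, and the sample inputs ($0.018 per minute, a $9 plan covering 100,000 characters) are made-up illustrations, not vendor prices; only the 900-characters-per-minute figure comes from this page.

```python
CHARS_PER_MINUTE = 900  # the page's documented assumption


def per_minute_to_per_million(usd_per_minute: float) -> float:
    """Convert a per-audio-minute price to USD per 1,000,000 characters."""
    return usd_per_minute / CHARS_PER_MINUTE * 1_000_000


def plan_to_per_million(plan_usd: float, characters_included: int) -> float:
    """Effective USD per 1,000,000 characters for a credit or subscription plan."""
    if characters_included <= 0:
        raise ValueError("plan must include at least one character")
    return plan_usd / characters_included * 1_000_000


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(round(per_minute_to_per_million(0.018), 2))
    print(round(plan_to_per_million(9.0, 100_000), 2))
```

An effective rate is only as good as the tier it was derived from, so a different plan on the same vendor gives a different number.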

3. Free tiers

Three providers here have a standing monthly free allowance, subtracted before billing: Google Cloud gives 4,000,000 free Standard characters and 1,000,000 free WaveNet/Neural2 characters every month; Azure's F0 tier gives 500,000 free characters a month. Amazon Polly's free tier only lasts 12 months on new accounts, so it is treated as zero here. At small volumes the free tier can make a pricier per-character rate the cheapest overall — the ranking accounts for that automatically.
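To see where a free tier stops paying off, one can solve for the volume at which a pricier rate with a free allowance costs the same as a cheaper rate without one. The sketch below is illustrative; `break_even_volume` is a hypothetical helper, and it uses the Google Neural2 ($16.00/1M, 1,000,000 free) and OpenAI tts-1 ($15.00/1M, no free tier) figures shown on this page.

```python
from typing import Optional


def break_even_volume(rate_with_free: float, free_tier: float, rate_no_free: float) -> Optional[float]:
    """Characters/month at which both providers cost the same.

    Solves rate_with_free * (v - free_tier) = rate_no_free * v for v.
    Returns None when the free-tier provider is never the pricier one,
    i.e. its per-character rate is not higher than the other's.
    """
    if rate_with_free <= rate_no_free:
        return None
    return rate_with_free * free_tier / (rate_with_free - rate_no_free)


if __name__ == "__main__":
    # Neural2 stays cheaper than tts-1 below this monthly volume.
    print(break_even_volume(16.0, 1_000_000, 15.0))
```

Below the break-even volume the free-tier provider wins; above it, the lower per-character rate does.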

4. Quality (Elo) is benchmark-dependent

The Elo column is each model's community-published naturalness standing (TTS-Arena and Artificial Analysis), rounded and indicative — never our own measurement. Higher is better. Listener preference, the target language, and the kind of script (conversational versus formal narration) move naturalness far more than the small gaps between leading models. Use the column to shortlist, then generate a sample of your own text on your finalists.

5. Best-for badges

The “Cheapest”, “Best quality (Elo)”, “Lowest latency” and “Most languages” callouts are derived deterministically from your current selection and volume — cheapest is the lowest projected monthly cost, best-quality is the highest published Elo, lowest-latency is the minimum published streaming time-to-first-byte among streaming-capable providers, and most-languages is the highest documented locale count. Requiring a feature greys out providers that lack it without deleting them, so the comparison stays honest.
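The badge rules can be expressed as plain min/max selections. The sketch below is an assumption about how such a derivation might look, not the page's code: the `Provider` fields and `badges` function are hypothetical, ties go to the first provider in list order, and `None` stands for a value that is not published or not shown in this comparison.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    usd_per_million: float
    free_tier: int
    elo: Optional[int]
    ttfb_ms: Optional[int]      # streaming time-to-first-byte; None = not published
    languages: Optional[int]    # None = not shown in this comparison


# Figures taken from the feature matrix on this page.
PROVIDERS = [
    Provider("Google Neural2", 16.0, 1_000_000, 1120, None, 50),
    Provider("OpenAI tts-1", 15.0, 0, 1150, None, 50),
    Provider("Amazon Neural", 16.0, 0, 1100, None, None),
    Provider("ElevenLabs Multilingual v2", 90.0, 0, 1290, None, None),
]


def cost(p: Provider, characters: float) -> float:
    return max(0.0, characters - p.free_tier) / 1_000_000 * p.usd_per_million


def badges(providers: List[Provider], characters: float) -> Dict[str, str]:
    """Pick one badge holder per category; ties go to the earliest provider."""
    with_elo = [p for p in providers if p.elo is not None]
    with_ttfb = [p for p in providers if p.ttfb_ms is not None]
    with_langs = [p for p in providers if p.languages is not None]
    return {
        "Cheapest": min(providers, key=lambda p: cost(p, characters)).name,
        "Best quality (Elo)": max(with_elo, key=lambda p: p.elo).name if with_elo else "None published",
        "Lowest latency": min(with_ttfb, key=lambda p: p.ttfb_ms).name if with_ttfb else "None published",
        "Most languages": max(with_langs, key=lambda p: p.languages).name if with_langs else "None published",
    }


if __name__ == "__main__":
    for badge, holder in badges(PROVIDERS, 500_000).items():
        print(f"{badge}: {holder}")
```

Because every badge is a pure function of the selection and the volume, the same inputs always produce the same callouts.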

Worked examples

Explainer-video freelancer — 500,000 characters/month

A Colombo freelancer narrating explainer videos generates about 500,000 characters of speech a month and wants the cheapest natural-sounding voice. Figures are in USD; of the four providers compared, only Google Neural2 has a free tier that applies at this volume.

  1. Volume: 500,000 characters/month.
  2. OpenAI tts-1: 500,000 ÷ 1,000,000 × $15 = $7.50/mo.
  3. Amazon Polly Neural: 0.5 × $16 = $8.00/mo.
  4. Google Neural2: first 1M chars free → 500K is fully covered → $0.00/mo.
  5. ElevenLabs Multilingual v2: 0.5 × $90 = $45.00/mo.
  6. Cheapest is Google Neural2 at $0.00 (inside its 1M free tier). If the freelancer needs ElevenLabs' professional cloning or its higher Elo, that choice costs $45.00/mo.

Audiobook pipeline — 5,000,000 characters/month, free tiers matter

A publisher renders about 5M characters a month (a few full-length books). This example shows how Google's free tiers and the Standard rate change the ranking. Figures are in USD.

  1. Volume: 5,000,000 characters/month.
  2. Google Standard: first 4M free → bill 1M × $4 ÷ 1M = $4.00/mo.
  3. OpenAI tts-1: 5 × $15 = $75.00/mo.
  4. Google Neural2: first 1M free → bill 4M × $16 ÷ 1M = $64.00/mo.
  5. Amazon Polly Neural: no standing free tier → 5 × $16 = $80.00/mo.
  6. Google Standard wins at $4.00 — but its voices are the most robotic. For an audiobook you'd likely accept Neural2 at $64 for far better naturalness.
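The audiobook figures can be reproduced by ranking the four rates at 5,000,000 characters. This is a small sketch using only the rates and free tiers quoted in the steps above; the `RATES` table and `rank` function are hypothetical names.

```python
from typing import List, Tuple

# (provider, USD per 1M characters, standing free characters per month)
RATES = [
    ("Google Standard", 4.0, 4_000_000),
    ("OpenAI tts-1", 15.0, 0),
    ("Google Neural2", 16.0, 1_000_000),
    ("Amazon Polly Neural", 16.0, 0),
]


def rank(characters: int) -> List[Tuple[str, float]]:
    """Return (provider, monthly USD) pairs, cheapest first."""
    costs = [
        (name, max(0, characters - free) / 1_000_000 * rate)
        for name, rate, free in RATES
    ]
    return sorted(costs, key=lambda row: row[1])


if __name__ == "__main__":
    for name, usd in rank(5_000_000):
        print(f"{name}: ${usd:.2f}/mo")
```

Sorting by cost alone ignores naturalness, which is why the example still favours Neural2 for audiobook work.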

Edge case — free-tier boundary and zero volume

Testing the arithmetic exactly at Azure's 500,000-character free boundary, and at zero, so the math never produces a negative or NaN.

  1. At exactly 500,000 characters on Azure Neural: max(0, 500,000 − 500,000) = 0 billable → $0.00.
  2. At 500,001 characters: 1 billable × $16 ÷ 1,000,000 = $0.000016/mo.
  3. At 0 characters (or a blank/negative input): clamps to 0 → $0.00 for every provider, no NaN.
  4. Above 100,000,000 characters the tool flags the input as out of range rather than computing a misleading number.
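These edge cases translate naturally into an input guard plus a few checks. The sketch below is one way the clamping and range flag might be written; `clean_volume`, `OutOfRangeError` and `azure_neural_cost` are hypothetical names, and the Azure figures (500,000 free characters, $16 per 1M) are the ones used in the steps above.

```python
import math

MAX_CHARACTERS = 100_000_000
AZURE_FREE_TIER = 500_000
AZURE_USD_PER_MILLION = 16.0


class OutOfRangeError(ValueError):
    """Raised when the volume exceeds the supported range."""


def clean_volume(raw) -> float:
    """Turn raw user input into a non-negative character count.

    Blank, non-numeric, NaN and negative inputs clamp to 0.
    Anything above MAX_CHARACTERS is flagged instead of computed.
    """
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return 0.0
    if math.isnan(value) or value <= 0:
        return 0.0
    if value > MAX_CHARACTERS:
        raise OutOfRangeError(f"{raw!r} exceeds {MAX_CHARACTERS:,} characters")
    return value


def azure_neural_cost(raw) -> float:
    billable = max(0.0, clean_volume(raw) - AZURE_FREE_TIER)
    return billable / 1_000_000 * AZURE_USD_PER_MILLION


if __name__ == "__main__":
    for sample in ["500000", "500001", "", "-5", "nan"]:
        print(repr(sample), f"${azure_neural_cost(sample):.6f}")
```

Clamping before the subtraction keeps the result non-negative, and raising on oversized input avoids presenting a number the tool does not stand behind.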

Sources & references

Every rate and capability flag was last cross-checked against these sources on 2026-06-21. Text-to-speech pricing, voices and models change frequently; this page is reviewed manually and whenever a provider announces a substantive pricing or model update. Quality Elo figures are community-published and benchmark-dependent.
