induwara.lk

Word Error Rate (WER) & CER Calculator

Paste a reference transcript and a model hypothesis to get Word Error Rate, Character Error Rate, word accuracy, and a colour-coded alignment of every error — substitution, deletion, and insertion. Uses the NIST SCTK formula. No signup, no upload, runs entirely in your browser.

By Induwara Ashinsana · Updated Jun 9, 2026

Runs entirely in your browser — transcripts are never uploaded, logged, or stored. Method: minimum-edit-distance (Levenshtein) alignment, WER = (S + D + I) / N, per the NIST SCTK / sclite definition and the HuggingFace evaluate metric. Up to 100,000 characters per box.

How it works

Word Error Rate is the standard accuracy metric for automatic speech recognition (ASR). It compares a system's output (the hypothesis) against a correct, human-verified transcript (the reference) and reports the number of word-level errors as a fraction of the reference length. Because insertions count as errors, WER can exceed 100%. The formula, defined by the NIST Speech Recognition Scoring Toolkit, is:

WER = (S + D + I) / N

Here S is substitutions (a word recognised as a different word), D is deletions (a reference word the system missed), I is insertions (an extra word the system added), and N is the total number of words in the reference. The three error types are not counted by eye — they come from an optimal alignment of the two transcripts:

  1. Normalise. Both transcripts are optionally lowercased, stripped of Unicode punctuation and symbols, and have their whitespace collapsed, then split into word tokens. Benchmarks normalise this way because a speech model should not be penalised for capitalisation or commas.
  2. Align. The reference and hypothesis word sequences are aligned with the minimum-edit-distance (Levenshtein) algorithm, which fills a dynamic-programming table where each cell holds the fewest edits needed to turn one prefix into the other. Each substitution, deletion, and insertion costs one; a match costs zero.
  3. Backtrace. Walking the table back from the bottom-right corner reconstructs the cheapest sequence of edits, which yields the exact S, D, and I counts and the word-by-word alignment shown in the diff view.
  4. Score. WER is the total errors divided by N. If the reference is empty, WER is undefined (division by zero), so the tool shows a guard message instead of a number.
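The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source: the names `normalise`, `align`, and `wer` are hypothetical, punctuation is assumed to be deleted rather than replaced by a space, and ties between equally cheap alignments are broken in an arbitrary order, so the labelled alignment may differ from the one the page displays even when the total error count is the same.

```python
from __future__ import annotations

import unicodedata
from collections import Counter


def normalise(text: str) -> list[str]:
    """Step 1: lowercase, drop Unicode punctuation and symbols, split into words."""
    kept = "".join(
        ch for ch in text.lower()
        if unicodedata.category(ch)[0] not in ("P", "S")  # assumption: delete, don't space out
    )
    return kept.split()  # split() with no argument also collapses whitespace


def align(ref: list[str], hyp: list[str]) -> list[tuple[str, str | None, str | None]]:
    """Steps 2 and 3: fill the edit-distance table, then backtrace one cheapest path."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # i deletions turn ref[:i] into an empty hypothesis
    for j in range(cols):
        dist[0][j] = j  # j insertions turn an empty reference into hyp[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
            )

    ops: list[tuple[str, str | None, str | None]] = []
    i, j = len(ref), len(hyp)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            ops.append(("match", ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("sub", ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, hyp[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def wer(reference: str, hypothesis: str) -> dict:
    """Step 4: count S, D, I from the alignment and divide by N."""
    ref, hyp = normalise(reference), normalise(hypothesis)
    if not ref:
        raise ValueError("reference is empty, so WER is undefined")
    ops = align(ref, hyp)
    counts = Counter(op for op, _, _ in ops)
    s, d, i = counts["sub"], counts["del"], counts["ins"]
    return {"S": s, "D": d, "I": i, "N": len(ref), "wer": (s + d + i) / len(ref), "ops": ops}
```

Calling `wer("the quick brown fox", "the quick brown box")` returns a dictionary with the three counts, N, the score, and the labelled alignment that a diff view could render.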

Character Error Rate (CER) repeats the same alignment at the character level, with CER = (S + D + I) / N counted over characters. It is gentler on near-miss spellings and suits scripts without clear word boundaries. Word accuracy is max(0, 1 − WER); the clamp keeps it at zero when insertions push WER above 1. As a cross-check, the tool computes the word-level edit distance a second time with an independent space-efficient pass; when that total matches the count from the alignment backtrace, the result is marked cross-checked. The WER definition is the same one used by the Python jiwer library and the HuggingFace evaluate WER metric, so numbers are directly comparable when normalisation matches.
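The character-level score and the space-efficient cross-check can be illustrated with a two-row edit distance, which keeps only the previous and current rows of the table instead of the full grid. This is a sketch under assumptions: `edit_distance`, `cer`, and `is_cross_checked` are hypothetical names, and the CER here counts every character of the raw strings, spaces included, which may not match how the tool tokenises characters.

```python
from __future__ import annotations

from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance using two rows, so memory grows with len(hyp) only."""
    previous = list(range(len(hyp) + 1))  # row for the empty reference prefix
    for i, ref_item in enumerate(ref, start=1):
        current = [i]  # i deletions against an empty hypothesis prefix
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j - 1] + cost,  # match or substitution
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
            ))
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: the same distance, computed over characters."""
    if not reference:
        raise ValueError("reference is empty, so CER is undefined")
    return edit_distance(reference, hypothesis) / len(reference)


def is_cross_checked(ref_words: list[str], hyp_words: list[str],
                     s: int, d: int, i: int) -> bool:
    """True when the independent distance equals the S + D + I from the backtrace."""
    return edit_distance(ref_words, hyp_words) == s + d + i
```

Because strings and word lists are both sequences, the same `edit_distance` serves the word-level check and the character-level score. The two-row pass cannot recover which edits were made, only how many, which is why it works as an independent check on the backtrace rather than a replacement for it.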

Worked examples

One substitution → 25% WER

Reference
the quick brown fox
Hypothesis
the quick brown box
  1. Reference words (N): the, quick, brown, fox → N = 4
  2. Align: the=the, quick=quick, brown=brown, fox→box
  3. fox→box is 1 substitution. S=1, D=0, I=0
  4. WER = (1 + 0 + 0) / 4 = 0.25 → 25%
  5. Word accuracy = max(0, 1 − 0.25) = 75%

Two deletions → 28.57% WER

Reference
I am going to the market today
Hypothesis
I going to market today
  1. Reference words (N): I, am, going, to, the, market, today → N = 7
  2. Align: 'am' and 'the' have no hypothesis match → 2 deletions
  3. S=0, D=2, I=0
  4. WER = (0 + 2 + 0) / 7 = 0.2857… → 28.57%
  5. Word accuracy = max(0, 1 − 0.2857…) = 71.43%

Substitution + insertion → 50% WER

Reference
she sells sea shells
Hypothesis
she sell the sea shells
  1. Reference words (N): she, sells, sea, shells → N = 4
  2. Align: 'sells' becomes 'sell' (substitution), 'the' is extra (insertion)
  3. S=1, D=0, I=1
  4. WER = (1 + 0 + 1) / 4 = 0.50 → 50%
  5. CER spot-check (a separate pair): reference 'cat' vs hypothesis 'car' → one substituted character (t→r) out of 3 → CER = 1/3 = 33.33%
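The arithmetic in the three examples above depends only on the counts, so it can be reproduced without running any alignment. The helper names below (`score_from_counts`, `as_percent`) are hypothetical and exist only for this sketch; returning `None` for an empty reference stands in for the guard message the page describes.

```python
from __future__ import annotations


def score_from_counts(s: int, d: int, i: int, n: int) -> tuple[float, float] | None:
    """Return (WER, word accuracy) from error counts, or None when N is zero."""
    if n == 0:
        return None  # WER is undefined for an empty reference
    wer = (s + d + i) / n
    accuracy = max(0.0, 1.0 - wer)  # clamp: insertions can push WER above 1
    return wer, accuracy


def as_percent(value: float) -> str:
    """Format a ratio as a percentage with two decimals."""
    return f"{value:.2%}"


# (S, D, I, N) for the three worked examples on this page
for counts in [(1, 0, 0, 4), (0, 2, 0, 7), (1, 0, 1, 4)]:
    wer, accuracy = score_from_counts(*counts)
    print(counts, as_percent(wer), as_percent(accuracy))
```

The clamp in `accuracy` only matters for hypotheses with many extra words: counts of S=0, D=0, I=6 against N=4 give a WER of 150% and a word accuracy of 0% rather than a negative number.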

Sources & references

The WER and CER formulas and the worked examples on this page were last reconciled against the SCTK definition and the HuggingFace metric docs on 2026-06-09. The calculation module ships with a built-in assertion that re-runs every worked example, so a regression in the alignment math fails fast.
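A self-test of the kind described above might look like the following. This is only a guess at the shape of such a check, not the module's real code: `WORKED_EXAMPLES`, `simple_wer`, and `self_test` are hypothetical names, and `simple_wer` is a deliberately small stand-in (lowercase, whitespace split, non-empty reference assumed) for whatever scoring function is under test.

```python
from __future__ import annotations

from typing import Callable

# Reference, hypothesis, and expected WER for the worked examples on this page.
WORKED_EXAMPLES = [
    ("the quick brown fox", "the quick brown box", 1 / 4),
    ("I am going to the market today", "I going to market today", 2 / 7),
    ("she sells sea shells", "she sell the sea shells", 2 / 4),
]


def simple_wer(reference: str, hypothesis: str) -> float:
    """Stand-in scorer: word-level edit distance divided by reference length."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j - 1] + cost, previous[j] + 1, current[j - 1] + 1))
        previous = current
    return previous[-1] / len(ref)  # assumes the reference is not empty


def self_test(wer_fn: Callable[[str, str], float], tolerance: float = 1e-9) -> int:
    """Re-run every worked example and fail loudly on the first mismatch."""
    for reference, hypothesis, expected in WORKED_EXAMPLES:
        actual = wer_fn(reference, hypothesis)
        assert abs(actual - expected) <= tolerance, (
            f"WER regression for {reference!r}: expected {expected}, got {actual}"
        )
    return len(WORKED_EXAMPLES)


self_test(simple_wer)  # raises AssertionError if any example drifts
```

Running the check at import time, as the last line does, is one way to make a broken build fail before any user input is scored.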
