Perplexity Calculator
Compute language-model perplexity in your browser — from a list of token probabilities, a cross-entropy / NLL loss, or a total log-likelihood. See the cross-entropy in nats and bits-per-token, the average token probability, and the exact formula behind every result.
How it works
Perplexity measures how well a probability model predicts a sample of text: it is the model's average uncertainty per token, read as the number of equally likely options it is effectively choosing between. Lower is better. The definition comes from Jurafsky & Martin's Speech and Language Processing, Chapter 3.
For a test sequence of N tokens, where the model assigns probability pᵢ to the i-th observed token in context, perplexity is the inverse geometric mean of those probabilities:
PP = ( ∏ pᵢ )^(−1/N) = exp( −(1/N) · Σ ln pᵢ )
The exponent −(1/N)·Σ ln pᵢ is the average cross-entropy (equivalently, the mean negative log-likelihood) H, in nats. So perplexity is simply the exponential of the cross-entropy, and the two are interchangeable:
- From probabilities. Sum the natural logs of the per-token probabilities, average and negate to get H = −(1/N)·Σ ln pᵢ, then PP = exp(H). If you enter log-probabilities directly, the logs are already taken.
- From loss. An average cross-entropy / NLL loss already is H. In nats (PyTorch CrossEntropyLoss, TensorFlow), PP = exp(loss). In bits, PP = 2^loss.
- From log-likelihood. Given a total Σ log P and token count N, the per-token cross-entropy is H = −(Σ log P)/N in the chosen unit, and PP is its exponential (base e for nats, base 2 for bits).
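The three input modes can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's actual code: the function names and the `unit` parameter are hypothetical, and the inputs are assumed to be valid (a non-empty list of probabilities in (0, 1], a positive token count).

```python
import math

def perplexity_from_probs(probs):
    """Mode 1: per-token probabilities -> (H in nats, PP)."""
    # H = -(1/N) * sum(ln p_i), the mean negative log-likelihood in nats
    h_nats = -sum(math.log(p) for p in probs) / len(probs)
    return h_nats, math.exp(h_nats)

def perplexity_from_loss(loss, unit="nats"):
    """Mode 2: an average cross-entropy / NLL loss is already H."""
    if unit == "nats":
        return math.exp(loss)   # PP = exp(loss)
    if unit == "bits":
        return 2.0 ** loss      # PP = 2^loss
    raise ValueError("unit must be 'nats' or 'bits'")

def perplexity_from_log_likelihood(total_log_likelihood, n_tokens, unit="nats"):
    """Mode 3: total log-likelihood and token count -> PP."""
    # H = -(sum log P) / N, in the same unit as the log-likelihood
    h = -total_log_likelihood / n_tokens
    return perplexity_from_loss(h, unit)

# A model that gives every observed token probability 0.25
h, pp_probs = perplexity_from_probs([0.25, 0.25, 0.25, 0.25])
pp_loss = perplexity_from_loss(h)                   # same H, entered as a loss in nats
pp_ll = perplexity_from_log_likelihood(-4 * h, 4)   # total log-likelihood = -N * H
pp_bits = perplexity_from_loss(2.0, unit="bits")    # 2 bits per token
print(pp_probs, pp_loss, pp_ll, pp_bits)            # each is approximately 4
```

All four calls describe the same model, one that is effectively choosing among four equally likely tokens, so they land on the same perplexity.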
Units convert with H_bits = H_nats / ln 2, so log₂ PP is the bits-per-token figure and 1/PP = exp(−H_nats) is the average (geometric-mean) per-token probability. All three input modes converge on the same (PP, nats, bits) triple, which is why the tool can cross-check a probabilities-mode result against the independent product form (∏ pᵢ)^(−1/N) and have them agree to floating-point precision. Probabilities of 0 or below are rejected: ln 0 = −∞ would send perplexity to infinity, and the logarithm of a negative number is undefined. Everything is plain double-precision arithmetic in your browser.
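The unit conversion, the product-form cross-check and the input validation described above might look like the following Python sketch. The function name `perplexity_report` and the returned dictionary keys are assumptions made for illustration, not the tool's real interface.

```python
import math

def perplexity_report(probs):
    """Return PP, cross-entropy in nats and bits, and the mean token probability."""
    # Reject empty input and probabilities of 0 or below, as the tool does
    if not probs or any(p <= 0 for p in probs):
        raise ValueError("expected a non-empty list of probabilities > 0")
    n = len(probs)
    h_nats = -sum(math.log(p) for p in probs) / n   # log-sum form
    h_bits = h_nats / math.log(2)                   # H_bits = H_nats / ln 2
    pp = math.exp(h_nats)
    pp_product = math.prod(probs) ** (-1.0 / n)     # independent product form
    return {
        "pp": pp,
        "pp_product": pp_product,
        "nats": h_nats,
        "bits": h_bits,
        "mean_prob": math.exp(-h_nats),             # 1/PP, the geometric mean
    }

report = perplexity_report([0.5, 0.25, 0.125])
print(report)  # pp is about 4, bits about 2, mean_prob about 0.25

# The two forms should agree to floating-point precision
assert math.isclose(report["pp"], report["pp_product"])
```

For long sequences the product ∏ pᵢ can underflow to zero in double precision, which is one reason the log-sum form is the usual primary computation and the product form serves only as a check on short inputs.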
Sources & references
- Jurafsky & Martin — Speech and Language Processing (3rd ed. draft), Ch. 3: N-gram Language Models
- Hugging Face Transformers — Perplexity of fixed-length models
- PyTorch documentation — torch.nn.CrossEntropyLoss (mean NLL in nats)
The formulas on this page were last cross-checked against these sources on 2026-06-10. Perplexity is a stable mathematical definition, so this tool needs no rate or schedule updates — only the worked examples are periodically re-reconciled.