Expected Calibration Error (ECE) Calculator
Paste your model's prediction confidences and their 0/1 correctness to get the Expected Calibration Error and Maximum Calibration Error, a per-bin reliability table and diagram, and whether the model is over- or under-confident. Uses the equal-width binning of Guo et al. (2017), runs entirely in your browser, and needs no signup.
How it works
Calibration asks a simple question: when a classifier says it is 80% sure, is it right about 80% of the time? The Expected Calibration Error turns that into a single number by comparing stated confidence with observed accuracy across confidence bands. The binned estimator used here is the one popularised by Guo et al. (2017) and introduced by Naeini et al. (2015).
Each prediction contributes a confidence cᵢ ∈ [0, 1] — the probability of the predicted class — and a correctness yᵢ ∈ {0, 1}. The interval [0, 1] is split into M equal-width bins of width 1/M, and a confidence c lands in bin min(floor(c·M), M−1) so that c = 1.0 falls in the last bin.
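The bin-assignment rule above can be sketched in a few lines of Python. This is an illustration only, not the tool's own code (which runs in the browser); the function name `bin_index` is a hypothetical choice made for this example.

```python
import math

# Hypothetical helper for illustration; not taken from the tool's source.
def bin_index(c: float, m: int) -> int:
    """Return the 0-based equal-width bin for confidence c with m bins."""
    if not 0.0 <= c <= 1.0:
        raise ValueError(f"confidence {c} is outside [0, 1]")
    # floor(c * m) equals m when c == 1.0, so clamp to the last bin (m - 1).
    return min(math.floor(c * m), m - 1)

# With M = 10, both 0.95 and 1.0 land in the last bin (index 9).
print([bin_index(c, 10) for c in (0.0, 0.05, 0.55, 0.95, 1.0)])  # [0, 0, 5, 9, 9]
```

Without the `min(..., m - 1)` clamp, a confidence of exactly 1.0 would map to a non-existent bin M.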
ECE = Σₘ (|Bₘ| / N) · |acc(Bₘ) − conf(Bₘ)|
- Validate. Every confidence must lie in [0, 1] and every label must be 0 or 1. Rows that fail are listed with the line number and reason, never a silent NaN or a dropped sample.
- Bin. For each bin Bₘ compute the mean confidence conf(Bₘ) and the accuracy acc(Bₘ) (fraction correct).
- Weight and sum. ECE is the sample-weighted average of the absolute bin gaps; empty bins contribute 0. The worst single gap is the Maximum Calibration Error:
MCE = maxₘ |acc(Bₘ) − conf(Bₘ)|
- Verdict. Overall mean confidence and accuracy are computed directly from every sample (not from the bins), so the over- vs under-confidence verdict is exact and independent of M. Mean confidence above accuracy means overconfident; below means underconfident.
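The four steps above (validate, bin, weight and sum, verdict) could be put together roughly as follows. This is a minimal Python sketch under stated assumptions, not the tool's implementation: the function name `calibration_report`, the error-message wording, and the "neither" verdict for an exact tie are all choices made for this example.

```python
import math

# Hypothetical function for illustration; names and messages are assumptions.
def calibration_report(confidences, labels, m=10):
    """Return (ece, mce, mean_conf, accuracy, verdict) using m equal-width bins."""
    if len(confidences) != len(labels) or not confidences:
        raise ValueError("need two non-empty sequences of equal length")

    # Validate: collect every bad row instead of silently dropping it.
    errors = []
    for row, (c, y) in enumerate(zip(confidences, labels), start=1):
        if not 0.0 <= c <= 1.0:
            errors.append(f"row {row}: confidence {c} is outside [0, 1]")
        if y not in (0, 1):
            errors.append(f"row {row}: label {y} is not 0 or 1")
    if errors:
        raise ValueError("; ".join(errors))

    # Bin: accumulate per-bin confidence sums, correct counts, and sizes.
    n = len(confidences)
    conf_sum, hit_sum, count = [0.0] * m, [0] * m, [0] * m
    for c, y in zip(confidences, labels):
        b = min(math.floor(c * m), m - 1)
        conf_sum[b] += c
        hit_sum[b] += y
        count[b] += 1

    # Weight and sum: empty bins are skipped, so they contribute 0.
    ece, mce = 0.0, 0.0
    for b in range(m):
        if count[b] == 0:
            continue
        gap = abs(hit_sum[b] / count[b] - conf_sum[b] / count[b])
        ece += count[b] / n * gap
        mce = max(mce, gap)

    # Verdict: computed from every sample directly, so it does not depend on m.
    mean_conf = sum(confidences) / n
    accuracy = sum(labels) / n
    if mean_conf > accuracy:
        verdict = "overconfident"
    elif mean_conf < accuracy:
        verdict = "underconfident"
    else:
        verdict = "neither"  # assumption: the page does not define the tie case
    return ece, mce, mean_conf, accuracy, verdict

ece, mce, mean_conf, accuracy, verdict = calibration_report(
    [0.9, 0.9, 0.9, 0.9, 0.6, 0.6], [1, 1, 1, 0, 1, 0], m=10
)
print(round(ece, 3), round(mce, 3), verdict)
```

In this example the four predictions at 0.9 are 75% correct (gap 0.15) and the two at 0.6 are 50% correct (gap 0.10), so ECE is (4/6)·0.15 + (2/6)·0.10 ≈ 0.133, MCE is 0.15, and mean confidence 0.8 exceeds accuracy ≈ 0.667, giving an overconfident verdict.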
In binary-probability mode the tool reduces a single positive-class probability p to a confidence the standard way: predicted class = (p ≥ 0.5), confidence = max(p, 1 − p), and correct = (predicted == true label). As an internal correctness gate, ECE is also recomputed by the algebraic identity (1/N) Σₘ |Σyᵢ − Σcᵢ| over each bin's raw sums, and the two values are asserted equal to floating-point precision before you see a result. Because ECE depends on the bin count, the chosen M is shown beside every number.
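The binary-mode reduction and the identity check can be sketched as below. Again this is an illustrative Python sketch rather than the tool's code: the names `to_confidence` and `ece_two_ways` and the tolerances passed to `math.isclose` are assumptions made for this example.

```python
import math

# Hypothetical helpers for illustration; names and tolerances are assumptions.
def to_confidence(p, true_label):
    """Reduce a positive-class probability to (confidence, correct)."""
    predicted = 1 if p >= 0.5 else 0
    confidence = max(p, 1.0 - p)
    correct = 1 if predicted == true_label else 0
    return confidence, correct

def ece_two_ways(confidences, correct, m=10):
    """Compute ECE by the weighted-gap form and by the raw-sum identity."""
    n = len(confidences)
    conf_sum, hit_sum, count = [0.0] * m, [0] * m, [0] * m
    for c, y in zip(confidences, correct):
        b = min(math.floor(c * m), m - 1)
        conf_sum[b] += c
        hit_sum[b] += y
        count[b] += 1

    # Form 1: sum of (|B| / N) * |acc(B) - conf(B)| over non-empty bins.
    weighted = sum(
        count[b] / n * abs(hit_sum[b] / count[b] - conf_sum[b] / count[b])
        for b in range(m)
        if count[b]
    )
    # Form 2: (1 / N) * sum of |sum(y) - sum(c)| per bin; the |B| factors cancel.
    identity = sum(abs(hit_sum[b] - conf_sum[b]) for b in range(m)) / n

    # Gate: the two forms must agree up to floating-point rounding.
    assert math.isclose(weighted, identity, rel_tol=1e-9, abs_tol=1e-12)
    return weighted

probs = [0.9, 0.2, 0.6, 0.4]   # positive-class probabilities
truth = [1, 0, 0, 1]           # true labels
pairs = [to_confidence(p, t) for p, t in zip(probs, truth)]
confs = [c for c, _ in pairs]
hits = [y for _, y in pairs]
print(round(ece_two_ways(confs, hits, m=10), 3))
```

Here the reduced confidences are 0.9, 0.8, 0.6 and 0.6 with correctness 1, 1, 0 and 0, so ECE is (0.1 + 0.2 + 0.6 + 0.6) / 4 = 0.375. Note that confidence in this mode is never below 0.5, so the lower half of the bins stays empty.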
Sources & references
- Guo, Pleiss, Sun & Weinberger (2017) — On Calibration of Modern Neural Networks, ICML (ECE/MCE via equal-width binning)
- Naeini, Cooper & Hauskrecht (2015) — Obtaining Well Calibrated Probabilities Using Bayesian Binning, AAAI
- Niculescu-Mizil & Caruana (2005) — Predicting Good Probabilities With Supervised Learning, ICML (reliability diagrams)
The formulas on this page were last cross-checked against these sources on 2026-06-19. ECE and MCE are stable mathematical definitions, so this tool needs no rate or schedule updates — only the worked examples are periodically re-reconciled against the binned estimator.