Brier Score Calculator
Paste your forecast probabilities and the actual 0/1 outcomes to get the Brier score (the mean squared error of your probabilities), plus the Brier Skill Score versus a baseline, the formula, and a per-prediction breakdown. It matches scikit-learn's brier_score_loss, runs entirely in your browser, and needs no signup.
How it works
The Brier score grades probabilistic forecasts. Instead of asking whether a hard label was right, it measures how far each predicted probability sat from what actually happened. It was defined by Glenn Brier in 1950 for weather verification and is identical to the mean squared error of the probabilities — the same quantity scikit-learn returns from brier_score_loss.
With N forecasts, each a probability fᵢ ∈ [0, 1] of a binary event whose actual outcome is oᵢ ∈ {0, 1}:
BS = (1/N) Σ (fᵢ − oᵢ)²
- Validate. Every probability must lie in [0, 1], every outcome must be 0 or 1, and the two lists must be the same length. Bad input gets a specific message, never a silent NaN.
- Score. Square each gap (fᵢ − oᵢ)² and average them. For binary outcomes this lands in [0, 1]; 0 is a perfect, fully-confident-and-correct forecaster.
- Baseline. Compute the reference Brier score for a constant forecast r:
BS_ref = (1/N) Σ (r − oᵢ)²
With the base rate r = ō (the mean outcome) this simplifies to the outcome variance ō(1 − ō), the score of a climatology forecaster that always predicts the long-run frequency.
- Skill. The Brier Skill Score rescales the Brier score against that baseline:
BSS = 1 − BS / BS_ref
Above 0 the model beats the baseline; 0 ties it; below 0 it is worse than just predicting the reference. When the baseline is itself perfect (BS_ref = 0, every outcome identical) the skill score is undefined and the tool shows “—” rather than dividing by zero.
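The four steps above (validate, score, baseline, skill) can be sketched in Python. This is a minimal illustration, not the tool's actual source: the function name brier_with_skill, its reference parameter, and the error messages are hypothetical, and returning None for an undefined skill score is an assumption standing in for the “—” the page displays.

```python
from typing import Optional, Sequence, Tuple


def brier_with_skill(
    probs: Sequence[float],
    outcomes: Sequence[int],
    reference: Optional[float] = None,
) -> Tuple[float, float, Optional[float]]:
    """Return (BS, BS_ref, BSS). BSS is None when BS_ref is 0.

    Hypothetical helper: `reference` is the constant baseline forecast r;
    when omitted, the base rate (mean outcome) is used.
    """
    # Validate: equal lengths, non-empty, probabilities in [0, 1], outcomes 0/1.
    if len(probs) != len(outcomes):
        raise ValueError(
            f"got {len(probs)} probabilities but {len(outcomes)} outcomes"
        )
    if len(probs) == 0:
        raise ValueError("need at least one forecast")
    for i, (f, o) in enumerate(zip(probs, outcomes)):
        # NaN fails this comparison too, so it is rejected rather than propagated.
        if not (0.0 <= f <= 1.0):
            raise ValueError(f"probability at position {i} is outside [0, 1]: {f!r}")
        if o not in (0, 1):
            raise ValueError(f"outcome at position {i} must be 0 or 1: {o!r}")

    n = len(probs)

    # Score: mean of the squared gaps (f_i - o_i)^2.
    bs = sum((f - o) ** 2 for f, o in zip(probs, outcomes)) / n

    # Baseline: Brier score of a constant forecast r (base rate by default).
    r = sum(outcomes) / n if reference is None else reference
    bs_ref = sum((r - o) ** 2 for o in outcomes) / n

    # Skill: undefined when the baseline is itself perfect.
    bss = None if bs_ref == 0 else 1 - bs / bs_ref
    return bs, bs_ref, bss


if __name__ == "__main__":
    print(brier_with_skill([0.9, 0.1, 0.8, 0.3], [1, 0, 1, 0]))
```

For the four forecasts in the usage line, the squared gaps are about 0.01, 0.01, 0.04 and 0.09, so BS is about 0.0375; the base rate is 0.5, giving BS_ref = 0.25 and a skill score of about 0.85.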
As an internal correctness gate the tool also recomputes the Brier score a second way, splitting the sum by class into Σ(1 − fᵢ)² over the positive cases plus Σfᵢ² over the negatives, and asserts the two agree to floating-point precision. The two forms are algebraically identical because oᵢ² = oᵢ for binary outcomes, so any disagreement would signal a bug.
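A cross-check of this kind might look like the sketch below. The function names and the 1e-12 tolerance are assumptions made for illustration; the page does not state the tolerance the tool actually uses.

```python
import math
from typing import Sequence


def brier_direct(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean of (f_i - o_i)^2 over all forecasts."""
    return sum((f - o) ** 2 for f, o in zip(probs, outcomes)) / len(probs)


def brier_by_class(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Same score, with the sum split into positive and negative cases."""
    positives = sum((1 - f) ** 2 for f, o in zip(probs, outcomes) if o == 1)
    negatives = sum(f ** 2 for f, o in zip(probs, outcomes) if o == 0)
    return (positives + negatives) / len(probs)


def checked_brier(
    probs: Sequence[float], outcomes: Sequence[int], tol: float = 1e-12
) -> float:
    """Return the Brier score, raising if the two computations disagree.

    `tol` is an assumed absolute tolerance, chosen only for this sketch.
    """
    direct = brier_direct(probs, outcomes)
    split = brier_by_class(probs, outcomes)
    if not math.isclose(direct, split, rel_tol=0.0, abs_tol=tol):
        raise AssertionError(f"Brier mismatch: {direct!r} vs {split!r}")
    return direct
```

An absolute tolerance is used here because the score is bounded in [0, 1] and may be exactly 0, where a purely relative comparison would be too strict.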
Worked examples
Frequently asked questions
Sources & references
- Brier, G. W. (1950) — Verification of Forecasts Expressed in Terms of Probability, Monthly Weather Review 78(1)
- scikit-learn — sklearn.metrics.brier_score_loss (the binary implementation matched here)
- NOAA / US National Weather Service — Forecast Verification Glossary (Brier Skill Score)
The formulas on this page were last cross-checked against these sources on 2026-06-11. The Brier score is a stable mathematical definition, so this tool needs no rate or schedule updates — only the worked examples are periodically re-reconciled against scikit-learn.