induwara.lk

Brier Score Calculator

Paste your forecast probabilities and the actual 0/1 outcomes to get the Brier score (the mean squared error of your probabilities), plus the Brier Skill Score versus a baseline, the formula, and a per-prediction breakdown. It matches scikit-learn's brier_score_loss, runs entirely in your browser, and needs no signup.

By Induwara Ashinsana · Updated Jun 11, 2026

Each forecast as a probability in [0, 1]. Separate with commas, spaces, or new lines.

The realised outcome for each forecast: 1 if the event happened, 0 if not.

Skill-score baseline

The reference forecast the Brier Skill Score is measured against.

Brier score: 0.0750 (mean squared error; 0 best, 1 worst)
Skill score (BSS): 0.6000 (1 − BS / BS_ref)
Base rate: 0.7500 (mean outcome = 75% positive)
Pairs (N): 4
Reference Brier: 0.1875

Brier score 0.0750 on a 0 (best) to 1 (worst) scale — better than the base-rate baseline of 0.1875. Skill score 0.6000: 60.0% better than the baseline (reference Brier 0.1875).

Formulas

  • BS = (1/N) Σ (fᵢ − oᵢ)²
  • ō = (1/N) Σ oᵢ  (base rate)
  • BS_ref = (1/N) Σ (r − oᵢ)²
  • BSS = 1 − BS / BS_ref

Cross-check. The direct mean squared error gives BS = 0.0750; the independent per-class split, (Σ(1−f)² over positives + Σf² over negatives) / N, also gives 0.0750. The two agree, as they must, so the result is verified.

Per-prediction breakdown

#   Probability f   Outcome o   Squared error (f − o)²
1   0.9000          1           0.0100
2   0.8000          1           0.0400
3   0.3000          0           0.0900
4   0.6000          1           0.1600
Brier score = mean squared error: 0.0750

Method: BS = (1/N) Σ (fᵢ − oᵢ)² (Brier 1950, matching scikit-learn brier_score_loss), with BSS = 1 − BS / BS_ref (US National Weather Service). Sources cited below the calculator. No data leaves this page.

How it works

The Brier score grades probabilistic forecasts. Instead of asking whether a hard label was right, it measures how far each predicted probability sat from what actually happened. It was defined by Glenn Brier in 1950 for weather verification and is identical to the mean squared error of the probabilities — the same quantity scikit-learn returns from brier_score_loss.

With N forecasts, each a probability fᵢ ∈ [0, 1] of a binary event whose actual outcome is oᵢ ∈ {0, 1}:

BS = (1/N) Σ (fᵢ − oᵢ)²

  1. Validate. Every probability must lie in [0, 1], every outcome must be 0 or 1, and the two lists must be the same length. Bad input gets a specific message, never a silent NaN.
  2. Score. Square each gap (fᵢ − oᵢ)² and average them. For binary outcomes this lands in [0, 1]; 0 is a perfect, fully-confident-and-correct forecaster.
  3. Baseline. Compute the reference Brier score for a constant forecast r:

    BS_ref = (1/N) Σ (r − oᵢ)²

    With the base rate r = ō (the mean outcome) this simplifies to the outcome variance ō(1 − ō) — the score of a climatology forecaster that always predicts the long-run frequency.
  4. Skill. The Brier Skill Score rescales the Brier score against that baseline:

    BSS = 1 − BS / BS_ref

    Above 0 the model beats the baseline; 0 ties it; below 0 it is worse than just predicting the reference. When the baseline is itself perfect (BS_ref = 0, every outcome identical) the skill score is undefined and the tool shows “—” rather than dividing by zero.
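
The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the page's actual source: the function names (`validate`, `brier_score`, `brier_skill_score`) and the choice to return `None` for an undefined skill score are assumptions made for the example.

```python
from typing import Optional, Sequence


def validate(forecasts: Sequence[float], outcomes: Sequence[int]) -> None:
    """Step 1: raise a specific ValueError instead of letting a NaN through."""
    if len(forecasts) != len(outcomes):
        raise ValueError(
            f"length mismatch: {len(forecasts)} forecasts vs {len(outcomes)} outcomes"
        )
    if len(forecasts) == 0:
        raise ValueError("at least one forecast/outcome pair is required")
    for i, (f, o) in enumerate(zip(forecasts, outcomes), start=1):
        # NaN fails this comparison too, so it is rejected here.
        if not 0.0 <= f <= 1.0:
            raise ValueError(f"forecast #{i} = {f} is outside [0, 1]")
        if o not in (0, 1):
            raise ValueError(f"outcome #{i} = {o} is not 0 or 1")


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Step 2: mean of the squared gaps (f_i - o_i)^2."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)


def brier_skill_score(
    forecasts: Sequence[float],
    outcomes: Sequence[int],
    reference: Optional[float] = None,
) -> Optional[float]:
    """Steps 3 and 4: score a constant reference forecast r, then rescale.

    With reference=None the base rate (mean outcome) is used as r.
    Returns None when BS_ref is 0, where the skill score is undefined.
    """
    validate(forecasts, outcomes)
    n = len(outcomes)
    r = sum(outcomes) / n if reference is None else reference
    bs = brier_score(forecasts, outcomes)
    bs_ref = brier_score([r] * n, outcomes)
    if bs_ref == 0:
        return None
    return 1 - bs / bs_ref
```

Calling `brier_skill_score([0.9, 0.8, 0.3, 0.6], [1, 1, 0, 1])` reproduces the Demo preset's skill score of 0.6000 up to floating-point rounding.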

As an internal correctness gate the tool also recomputes the Brier score a second way: it splits the sum by class into Σ(1 − fᵢ)² over the positive cases plus Σfᵢ² over the negatives, divides by N, and asserts that the two results agree to floating-point precision. The two forms are algebraically identical because oᵢ² = oᵢ for binary outcomes, so any disagreement would signal a bug.
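
A hedged sketch of such a cross-check follows. The function names and the tolerances passed to `math.isclose` are illustrative assumptions; the page does not state what tolerance the tool uses.

```python
import math
from typing import Sequence


def brier_direct(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean of (f_i - o_i)^2 over all pairs."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)


def brier_by_class(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Same quantity, summed separately over positives and negatives."""
    positives = sum((1 - f) ** 2 for f, o in zip(forecasts, outcomes) if o == 1)
    negatives = sum(f ** 2 for f, o in zip(forecasts, outcomes) if o == 0)
    return (positives + negatives) / len(outcomes)


def checked_brier(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Return the Brier score only if both computations agree."""
    direct = brier_direct(forecasts, outcomes)
    split = brier_by_class(forecasts, outcomes)
    # Tolerances are assumptions chosen for this sketch.
    if not math.isclose(direct, split, rel_tol=1e-9, abs_tol=1e-12):
        raise AssertionError(f"cross-check failed: {direct} != {split}")
    return direct
```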

Worked examples

Four forecasts — Brier 0.0750, skill 0.6000 (the Demo preset)

  1. Probabilities f = [0.9, 0.8, 0.3, 0.6], outcomes o = [1, 1, 0, 1]. N = 4
  2. Squared errors: (0.9−1)²=0.01, (0.8−1)²=0.04, (0.3−0)²=0.09, (0.6−1)²=0.16
  3. Sum = 0.30 → Brier score = 0.30 / 4 = 0.0750
  4. Base rate ō = 3/4 = 0.75; BS_ref = ō(1−ō) = 0.75·0.25 = 0.1875
  5. Brier Skill Score = 1 − 0.0750 / 0.1875 = 1 − 0.40 = 0.6000
  6. Read-out: 60% better than always predicting the base rate
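
Step 4 uses the shortcut BS_ref = ō(1 − ō). It follows from the definition of BS_ref with r = ō, expanding the square and using oᵢ² = oᵢ for binary outcomes:

```latex
\begin{aligned}
\mathrm{BS}_{\mathrm{ref}}
  &= \frac{1}{N}\sum_{i=1}^{N} (\bar{o} - o_i)^2 \\
  &= \bar{o}^2 - 2\bar{o}\cdot\frac{1}{N}\sum_{i=1}^{N} o_i
     + \frac{1}{N}\sum_{i=1}^{N} o_i^2 \\
  &= \bar{o}^2 - 2\bar{o}^2 + \bar{o} \\
  &= \bar{o}\,(1 - \bar{o})
\end{aligned}
```

With ō = 0.75 this gives 0.75 · 0.25 = 0.1875, the reference Brier in the example.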

Custom baseline 0.5 on the same data — skill 0.7000

  1. Same forecasts, but the reference is a fixed 0.5 (a coin flip), not the base rate
  2. BS_ref = mean((0.5 − o)²) = (0.25·3 + 0.25·1) / 4 = 0.25
  3. Brier Skill Score = 1 − 0.0750 / 0.25 = 1 − 0.30 = 0.7000
  4. Against an uninformed 0.5 forecaster the model looks even stronger
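
The skill score rises here because the base rate is the constant forecast with the lowest possible Brier score (the mean minimises squared error), so any other fixed reference is easier to beat. The sketch below tabulates BS_ref and BSS for a few constant references on the Demo outcomes; the helper names and the grid of reference values are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def reference_brier(outcomes: Sequence[int], r: float) -> float:
    """Brier score of a forecaster that always predicts the constant r."""
    return sum((r - o) ** 2 for o in outcomes) / len(outcomes)


def skill_table(
    bs: float, outcomes: Sequence[int], references: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Return (r, BS_ref, BSS) for each constant reference r."""
    rows = []
    for r in references:
        bs_ref = reference_brier(outcomes, r)
        rows.append((r, bs_ref, 1 - bs / bs_ref))
    return rows


if __name__ == "__main__":
    outcomes = [1, 1, 0, 1]
    bs = 0.075  # Brier score of the Demo forecasts from the first example
    for r, bs_ref, bss in skill_table(bs, outcomes, [0.25, 0.5, 0.75, 0.9]):
        print(f"r = {r:.2f}  BS_ref = {bs_ref:.4f}  BSS = {bss:.4f}")
```

Of the references tried, r = 0.75 (the base rate) yields the smallest BS_ref and therefore the most demanding skill score.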

The bounds — perfect 0.0000 and worst 1.0000

  1. Perfect: f = [1, 0, 1] against o = [1, 0, 1] → every squared error 0 → Brier = 0.0000
  2. Worst (binary): f = [0, 1] against o = [1, 0] → (0−1)² + (1−0)² = 2
  3. Brier = 2 / 2 = 1.0000 — the maximum a binary forecaster can score
  4. Edge case: outcomes all identical (e.g. all 1) make BS_ref = 0, so the skill score is shown as '—' rather than dividing by zero
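
These boundary cases are easy to confirm in code. The snippet below is a sketch under stated assumptions: `format_skill` is a hypothetical helper that mimics the placeholder described above, not the tool's real function.

```python
from typing import Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared gap between forecast and outcome."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)


def format_skill(bs: float, bs_ref: float, decimals: int = 4) -> str:
    """Render the skill score, or a dash placeholder when BS_ref is 0."""
    if bs_ref == 0:
        return "\u2014"  # the dash the page shows instead of dividing by zero
    return f"{1 - bs / bs_ref:.{decimals}f}"


if __name__ == "__main__":
    print(brier_score([1, 0, 1], [1, 0, 1]))  # perfect forecaster
    print(brier_score([0, 1], [1, 0]))        # confidently wrong every time
    # All outcomes are 1, so the base-rate reference is itself perfect.
    outcomes = [1, 1, 1]
    bs_ref = brier_score([1.0] * len(outcomes), outcomes)
    print(format_skill(brier_score([0.9, 0.8, 0.7], outcomes), bs_ref))
```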

Sources & references

The formulas on this page were last cross-checked against these sources on 2026-06-11. The Brier score is a stable mathematical definition, so this tool needs no rate or schedule updates — only the worked examples are periodically re-reconciled against scikit-learn.
