induwara.lk

Confusion Matrix Calculator — Precision, Recall, F1 & MCC

Paste the four cells of a binary confusion matrix and get every standard classification metric at once — accuracy, precision, recall, specificity, F1, F-beta, balanced accuracy and the Matthews correlation coefficient — each shown with the exact formula it came from. Free, no signup, runs in your browser.

By Induwara Ashinsana · Updated Jun 7, 2026
Confusion matrix metrics

Enter the four counts

True positives (TP): predicted positive, actually positive.

False positives (FP): predicted positive, actually negative (Type I error).

False negatives (FN): predicted negative, actually positive (Type II error).

True negatives (TN): predicted negative, actually negative.

F-beta weight β: β = 1 gives F1; β > 1 weights recall more heavily; β < 1 weights precision more heavily.

Positive class label (cosmetic): names the positive class in the rendered matrix below.


Confusion matrix (rows: predicted class, columns: actual class)

                  Actual Positive   Actual Negative   Total
Pred. Positive                 90                10     100
Pred. Negative                  5                95     100
Total                          95               105     200

  • Accuracy: 0.9250 (92.5%) · (TP + TN) / N
  • Precision: 0.9000 (90%) · TP / (TP + FP)
  • Recall: 0.9474 (94.74%) · TP / (TP + FN)
  • F1 score: 0.9231 (92.31%) · 2·TP / (2·TP + FP + FN)
  • MCC: 0.8511 · (TP·TN − FP·FN) / √(…)
  • Balanced acc.: 0.9261 (92.61%) · (TPR + TNR) / 2

F1 = 0.9231

Excellent balance of precision and recall.

MCC = 0.8511

Strong positive correlation with the true labels.

All metrics

Metric                          Value
Accuracy                        0.9250
Precision (PPV)                 0.9000
Recall / Sensitivity (TPR)      0.9474
Specificity (TNR)               0.9048
F1 score                        0.9231
Balanced accuracy               0.9261
Matthews corr. coef. (MCC)      0.8511
Informedness (Youden's J)       0.8521
Neg. predictive value (NPV)     0.9500
False positive rate (FPR)       0.0952
False negative rate (FNR)       0.0526
False discovery rate (FDR)      0.1000
Prevalence                      0.4750

“undefined” means the metric's denominator is zero (e.g. precision when TP + FP = 0) — reported honestly rather than shown as 0.
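The secondary rates in the table are listed without formulas on this page. Under their standard definitions they can be sketched as below. The function names `safe_div` and `secondary_rates` are hypothetical, and this is an illustration rather than the calculator's actual code; `None` stands in for "undefined".

```python
def safe_div(num, den):
    """Return num / den, or None when the denominator is zero (undefined)."""
    return num / den if den else None


def secondary_rates(tp, fp, fn, tn):
    """Secondary rates from the four confusion-matrix counts (standard definitions)."""
    n = tp + fp + fn + tn
    return {
        "npv": safe_div(tn, tn + fn),         # negative predictive value
        "fpr": safe_div(fp, fp + tn),         # false positive rate = 1 - specificity
        "fnr": safe_div(fn, fn + tp),         # false negative rate = 1 - recall
        "fdr": safe_div(fp, fp + tp),         # false discovery rate = 1 - precision
        "prevalence": safe_div(tp + fn, n),   # share of actual positives
    }


rates = secondary_rates(tp=90, fp=10, fn=5, tn=95)
for name, value in rates.items():
    print(f"{name}: {value:.4f}")
```

With the counts from the matrix above, the printed values match the table: 0.9500, 0.0952, 0.0526, 0.1000 and 0.4750.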

Computed entirely in your browser — nothing is uploaded. Formulas per scikit-learn and Wikipedia; last verified 2026-06-07.

How it works

A confusion matrix is the 2×2 table you get when you compare a binary classifier's predictions against the true labels. It has four cells: true positives (TP) and true negatives (TN), where the model agreed with reality, and false positives (FP, a Type I error) and false negatives (FN, a Type II error), where it did not. Every metric on this page is derived from those four counts, with the sample size N = TP + FP + FN + TN.
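If you are starting from raw labels rather than the four counts, the cells can be tallied in one pass. The sketch below assumes labels are given as two equal-length lists and that `1` marks the positive class. The function name `confusion_counts` and the toy data are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Tally (TP, FP, FN, TN) from paired true and predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1   # predicted positive, actually positive
            else:
                fp += 1   # predicted positive, actually negative (Type I)
        else:
            if truth == positive:
                fn += 1   # predicted negative, actually positive (Type II)
            else:
                tn += 1   # predicted negative, actually negative
    return tp, fp, fn, tn


# Toy data, made up for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn)  # prints: 3 1 1 3
```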

The headline metrics follow the standard definitions used by scikit-learn and the Wikipedia confusion-matrix table:

  • Accuracy = (TP + TN) / N
  • Precision (PPV) = TP / (TP + FP)
  • Recall / Sensitivity (TPR) = TP / (TP + FN)
  • Specificity (TNR) = TN / (TN + FP)
  • F1 = 2·TP / (2·TP + FP + FN)
  • F-beta = (1 + β²)·P·R / (β²·P + R)
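The count-based formulas in the list above translate directly into code. This is a minimal sketch, not the calculator's own implementation; `safe_div` and `headline_metrics` are hypothetical names, and `None` is used here to represent an undefined metric.

```python
def safe_div(num, den):
    """Return num / den, or None when the denominator is zero (undefined)."""
    return num / den if den else None


def headline_metrics(tp, fp, fn, tn):
    """Headline metrics computed straight from the four integer counts."""
    n = tp + fp + fn + tn
    return {
        "accuracy": safe_div(tp + tn, n),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }


m = headline_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

For TP=90, FP=10, FN=5, TN=95 this prints 0.9250, 0.9000, 0.9474, 0.9048 and 0.9231, in that order.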

F1 is the harmonic mean of precision and recall, which is why it drops hard when either one is weak. The F-beta form, from van Rijsbergen's 1979 information-retrieval text, lets you weight recall β times as heavily as precision — β > 1 favours recall, β < 1 favours precision. This calculator computes F-beta from the raw counts and cross-checks it against the precision/recall form so the two always agree to the last decimal.
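Substituting P = TP / (TP + FP) and R = TP / (TP + FN) into the F-beta formula gives a count-only form, (1 + β²)·TP / ((1 + β²)·TP + β²·FN + FP). The sketch below compares the two forms. The function names are hypothetical, and the way the real calculator performs its cross-check is not shown on this page, so treat this as one possible approach.

```python
import math


def fbeta_from_counts(tp, fp, fn, beta=1.0):
    """F-beta from raw counts; None when the denominator is zero."""
    b2 = beta * beta
    den = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / den if den else None


def fbeta_from_pr(tp, fp, fn, beta=1.0):
    """F-beta via precision and recall; None when either is undefined."""
    if tp + fp == 0 or tp + fn == 0:
        return None
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    b2 = beta * beta
    den = b2 * p + r
    return (1 + b2) * p * r / den if den else None


for beta in (0.5, 1.0, 2.0):
    a = fbeta_from_counts(90, 10, 5, beta)
    b = fbeta_from_pr(90, 10, 5, beta)
    print(f"beta={beta}: counts={a:.4f} pr={b:.4f} agree={math.isclose(a, b, rel_tol=1e-9)}")
```

The two forms differ only at the edges: the precision/recall form is undefined whenever precision or recall is, while the count form is defined as long as its own denominator is non-zero.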

The Matthews correlation coefficient takes the whole table into account:

MCC = (TP·TN − FP·FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN))

Because MCC uses all four cells, including the true negatives that precision, recall and F1 ignore, it remains informative when one class vastly outnumbers the other. It ranges from −1 to +1: +1 is perfect agreement with the true labels, 0 means the predictions are no better than chance, and −1 is total disagreement. Balanced accuracy, (TPR + TNR) / 2, and informedness (Youden's J), TPR + TNR − 1, are two other imbalance-aware summaries shown in the full table.

Each metric is computed independently from the integer counts, never from rounded intermediates, so no rounding error compounds. When a denominator is zero the metric is genuinely undefined (for example, precision when nothing is predicted positive), and the tool labels it “undefined” rather than printing a misleading 0.
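The three imbalance-aware summaries can be sketched as follows, each computed from the integer counts and returning `None` when a denominator is zero. The function names are hypothetical and this is an illustration, not the tool's source.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; None when any marginal total is zero."""
    den_sq = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if den_sq == 0:
        return None
    return (tp * tn - fp * fn) / math.sqrt(den_sq)


def balanced_accuracy(tp, fp, fn, tn):
    """Mean of TPR and TNR; None when either class has no actual samples."""
    if tp + fn == 0 or tn + fp == 0:
        return None
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def informedness(tp, fp, fn, tn):
    """Youden's J = TPR + TNR - 1; None when either class has no actual samples."""
    if tp + fn == 0 or tn + fp == 0:
        return None
    return tp / (tp + fn) + tn / (tn + fp) - 1


counts = dict(tp=90, fp=10, fn=5, tn=95)
print(f"MCC: {mcc(**counts):.4f}")
print(f"Balanced accuracy: {balanced_accuracy(**counts):.4f}")
print(f"Informedness: {informedness(**counts):.4f}")
```

For these counts the output is 0.8511, 0.9261 and 0.8521, matching the full table.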

Worked examples

Example 1 — balanced classifier (TP=90, FP=10, FN=5, TN=95)

  1. N = 90 + 10 + 5 + 95 = 200
  2. Accuracy = (90 + 95) / 200 = 0.9250
  3. Precision = 90 / (90 + 10) = 0.9000
  4. Recall = 90 / (90 + 5) = 0.9474
  5. F1 = 2·90 / (2·90 + 10 + 5) = 180 / 195 = 0.9231
  6. MCC = (90·95 − 10·5) / √(100·95·105·100) = 8500 / 9987.49 = 0.8511

Example 2 — imbalanced data, the accuracy paradox (TP=5, FP=5, FN=15, TN=975)

  1. N = 5 + 5 + 15 + 975 = 1,000 (only 20 actual positives)
  2. Accuracy = (5 + 975) / 1000 = 0.9800 ← looks excellent
  3. Precision = 5 / (5 + 5) = 0.5000
  4. Recall = 5 / (5 + 15) = 0.2500
  5. F1 = 2·5 / (2·5 + 5 + 15) = 10 / 30 = 0.3333
  6. MCC = (5·975 − 5·15) / √(10·20·980·990) = 4800 / 13929.8 = 0.3446
  7. Verdict: 98% accuracy, but F1 and MCC reveal a weak classifier.
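Example 2 can be reproduced in a few lines. The extra `tn_share` figure is an addition for illustration: it shows how much of the "correct" total comes from true negatives, which is why accuracy looks so high here.

```python
import math

# Counts from Example 2 (imbalanced data).
tp, fp, fn, tn = 5, 5, 15, 975
n = tp + fp + fn + tn

accuracy = (tp + tn) / n
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * tp / (2 * tp + fp + fn)
mcc = (tp * tn - fp * fn) / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

# Share of correct predictions that are true negatives (illustrative extra).
tn_share = tn / (tp + tn)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
print(f"f1={f1:.4f} mcc={mcc:.4f} tn_share={tn_share:.4f}")
```

Roughly 99.5% of the correct predictions are true negatives, so accuracy says little about how well the 20 actual positives are found.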

Example 3 — edge case, no predicted positives (TP=0, FP=0, FN=10, TN=90)

  1. N = 0 + 0 + 10 + 90 = 100
  2. Precision = 0 / (0 + 0) = 0/0 → undefined (nothing was predicted positive)
  3. Recall = 0 / (0 + 10) = 0.0000
  4. Accuracy = (0 + 90) / 100 = 0.9000 (high, but it never finds a positive)
  5. F1 = 2·0 / (2·0 + 0 + 10) = 0 / 10 = 0.0000
  6. MCC = (0·90 − 0·10) / √(0·…) = 0/0 → undefined
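One way to keep "undefined" distinct from 0 in code is to carry `None` through the calculation and only convert it to text when displaying. The sketch below applies that idea to Example 3. The helper names `safe_div` and `show` are hypothetical, and the calculator's real handling may differ.

```python
import math


def safe_div(num, den):
    """Return num / den, or None when the denominator is zero (undefined)."""
    return num / den if den else None


def show(value):
    """Format a metric for display, keeping undefined distinct from 0."""
    return "undefined" if value is None else f"{value:.4f}"


# Counts from Example 3: the model never predicts positive.
tp, fp, fn, tn = 0, 0, 10, 90

precision = safe_div(tp, tp + fp)            # 0/0
recall = safe_div(tp, tp + fn)               # 0/10
f1 = safe_div(2 * tp, 2 * tp + fp + fn)      # 0/10
mcc_den_sq = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
mcc = None if mcc_den_sq == 0 else (tp * tn - fp * fn) / math.sqrt(mcc_den_sq)

for name, value in [("precision", precision), ("recall", recall), ("f1", f1), ("mcc", mcc)]:
    print(f"{name}: {show(value)}")
```

This prints "undefined" for precision and MCC, and 0.0000 for recall and F1, in line with the steps above.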

