Adjusted Rand Index (ARI) Calculator
Paste two clustering label lists — a clustering and its ground truth, or two clusterings — and get the Adjusted Rand Index, the raw Rand Index and the Fowlkes–Mallows index, with the full pair-count and contingency working shown. Matches scikit-learn. Free, no signup, runs in your browser.
How it works
The Adjusted Rand Index compares two ways of grouping the same set of items — for example the clusters a k-means run produced versus the true class labels. It works at the level of pairs of items. For every one of the C(n,2) pairs, the two clusterings either place the pair in the same cluster or in different clusters, and the metric counts how often they make the same call.
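The pair-level idea can be checked by brute force before any formula is involved. The sketch below is illustrative only: the function name pair_agreement and the sample label lists are assumptions, not part of the tool. It visits every pair once and records whether the two clusterings make the same call.

```python
from itertools import combinations

def pair_agreement(labels_a, labels_b):
    """Return (pairs together in both, pairs apart in both, total pairs)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must describe the same items")
    both_same = both_diff = total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]  # A puts the pair together?
        same_b = labels_b[i] == labels_b[j]  # B puts the pair together?
        both_same += same_a and same_b
        both_diff += (not same_a) and (not same_b)
        total += 1
    return both_same, both_diff, total

# Hypothetical sample data: six items, two clusterings.
a = [0, 0, 1, 1, 2, 2]
b = [0, 0, 1, 2, 2, 2]
print(pair_agreement(a, b))  # (2, 10, 15)
```

Here 12 of the 15 pairs get the same call from both clusterings. The loop is quadratic in the number of items, so it is only practical as a cross-check on small inputs.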
Concretely, you build a contingency table where n_ij is the number of items that fall in both cluster i of partition A and cluster j of partition B. Writing n for the total number of items, the row sums as a_i, the column sums as b_j, and C(x,2) = x(x−1)/2, the calculation follows scikit-learn and Hubert & Arabie (1985):
- Index = Σ C(n_ij, 2)
- sumA = Σ C(a_i, 2), sumB = Σ C(b_j, 2)
- Expected = sumA · sumB / C(n, 2)
- Max = ½ (sumA + sumB)
- ARI = (Index − Expected) / (Max − Expected)
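The five steps above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the tool's own code: the function name adjusted_rand_index, the sample lists, and the choice to return 1.0 when fewer than two items are given are all assumptions made for the example.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """ARI from the contingency table, following the steps listed above."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must describe the same items")
    total_pairs = comb(len(labels_a), 2)          # C(n, 2)
    if total_pairs == 0:
        return 1.0                                # assumption: no pairs to compare
    n_ij = Counter(zip(labels_a, labels_b))       # contingency table cells
    a_i = Counter(labels_a)                       # row sums
    b_j = Counter(labels_b)                       # column sums
    index = sum(comb(c, 2) for c in n_ij.values())
    sum_a = sum(comb(c, 2) for c in a_i.values())
    sum_b = sum(comb(c, 2) for c in b_j.values())
    expected = sum_a * sum_b / total_pairs
    maximum = (sum_a + sum_b) / 2
    if maximum == expected:
        return 1.0                                # degenerate 0/0 case
    return (index - expected) / (maximum - expected)

# Hypothetical sample data: six items, two clusterings.
a = [0, 0, 1, 1, 2, 2]
b = [0, 0, 1, 2, 2, 2]
print(round(adjusted_rand_index(a, b), 4))  # 0.4444
```

For these lists the working is Index = 2, sumA = 3, sumB = 4, Expected = 3 · 4 / 15 = 0.8 and Max = 3.5, so ARI = (2 − 0.8) / (3.5 − 0.8) = 4/9.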
Subtracting Expected is what makes the index “adjusted”: it removes the agreement you would get from random labeling, so a chance clustering scores about 0 instead of the inflated value a raw Rand Index gives. The raw Rand Index itself is RI = (a + d) / C(n,2), where a = Index (pairs together in both) and d = C(n,2) − sumA − sumB + Index (pairs apart in both). The Fowlkes–Mallows index is the geometric mean of pair precision and recall: treating A as the reference, precision = Index / sumB and recall = Index / sumA, so FM = Index / √(sumA · sumB).
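The two unadjusted scores reuse the same three sums. The sketch below is a hedged illustration: the helper names pair_sums, rand_index and fowlkes_mallows are invented for this example, and the return values chosen for inputs with no pairs at all (1.0 for RI, 0.0 for FM) are assumptions rather than documented behavior of the tool.

```python
from collections import Counter
from math import comb, sqrt

def pair_sums(labels_a, labels_b):
    """Return (Index, sumA, sumB, C(n, 2)) for two label lists."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must describe the same items")
    index = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    return index, sum_a, sum_b, comb(len(labels_a), 2)

def rand_index(labels_a, labels_b):
    index, sum_a, sum_b, total = pair_sums(labels_a, labels_b)
    if total == 0:
        return 1.0                            # assumption: fewer than two items
    a = index                                 # pairs together in both
    d = total - sum_a - sum_b + index         # pairs apart in both
    return (a + d) / total

def fowlkes_mallows(labels_a, labels_b):
    index, sum_a, sum_b, _ = pair_sums(labels_a, labels_b)
    if sum_a == 0 or sum_b == 0:
        return 0.0                            # assumption: no co-clustered pairs
    return index / sqrt(sum_a * sum_b)

# Hypothetical sample data: six items, two clusterings.
a = [0, 0, 1, 1, 2, 2]
b = [0, 0, 1, 2, 2, 2]
print(rand_index(a, b))                  # 0.8
print(round(fowlkes_mallows(a, b), 4))   # 0.5774
```

On this sample the raw Rand Index is 12/15 = 0.8 while the adjusted value is 4/9, which shows how much of the raw score is agreement on pairs that both clusterings keep apart.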
ARI ranges from −0.5 to 1: 1 is a perfect match (identical partitions up to relabeling), 0 is what chance predicts, and negative values mean the two clusterings agree less than chance would predict. Because the comparison depends only on which items share a cluster, it is permutation-invariant: renaming clusters changes nothing. One special case: when both partitions are structureless in the same way (every item alone in both, or all items together in both), the formula is 0/0 and scikit-learn defines the result as 1.0; this tool follows the same convention and flags it. All pair counts stay exact integers, so the results match scikit-learn to full double precision, and every ARI is independently re-derived from the four pair counts before it is shown.
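These properties can be checked with a second route to the same number. The sketch below uses a standard equivalent form of ARI written in terms of four pair counts. It is an assumption that this resembles the tool's internal cross-check, since the page does not give that formula; the name ari_from_pair_counts and the sample lists are likewise hypothetical.

```python
from collections import Counter
from math import comb

def ari_from_pair_counts(labels_a, labels_b):
    """ARI from four pair counts, kept as exact integers until the last step."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must describe the same items")
    total = comb(len(labels_a), 2)
    tp = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    fn = sum_a - tp                    # together in A only
    fp = sum_b - tp                    # together in B only
    tn = total - sum_a - sum_b + tp    # apart in both
    if fn == 0 and fp == 0:
        return 1.0                     # identical partitions, covers the 0/0 case
    return 2.0 * (tp * tn - fn * fp) / ((tp + fn) * (fn + tn) + (tp + fp) * (fp + tn))

a = [0, 0, 1, 1, 2, 2]
b = [0, 0, 1, 2, 2, 2]
renamed = ["x", "x", "z", "y", "y", "y"]   # same partition as b, new names

# Permutation invariance: renaming clusters leaves the score unchanged.
print(ari_from_pair_counts(a, b) == ari_from_pair_counts(a, renamed))  # True

# Degenerate partitions that match each other score 1.0 by convention.
print(ari_from_pair_counts([0, 1, 2, 3], [0, 1, 2, 3]))  # 1.0
print(ari_from_pair_counts([0, 0, 0, 0], [7, 7, 7, 7]))  # 1.0

# Maximal disagreement on four items reaches the lower bound.
print(ari_from_pair_counts([0, 0, 1, 1], [0, 1, 0, 1]))  # -0.5
```

Keeping the counts as integers and dividing only once at the end avoids rounding in the intermediate steps, which is one way to read the exact-integer claim above.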
Worked examples
Frequently asked questions
Sources & references
- scikit-learn — Clustering performance evaluation: adjusted_rand_score, rand_score, fowlkes_mallows_score — the canonical definitions and the ARI ∈ [−0.5, 1] range.
- Hubert, L. & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2, 193–218 — the original adjusted-for-chance Rand Index.
- Fowlkes, E. B. & Mallows, C. L. (1983). A Method for Comparing Two Hierarchical Clusterings. JASA, 78(383), 553–569 — the Fowlkes–Mallows index.
Every formula on this page was cross-checked against these sources on 2026-06-11, and each ARI is verified against an independent pair-count formula inside the tool. Your label lists never leave your browser.