A/B Test Statistical Significance Calculator
Enter the visitors and conversions for your control and your variant. This tool runs a two-proportion z-test and tells you the p-value, the confidence level, and — in plain English — whether the difference is a real winner or just noise. No signup, runs entirely in your browser.
How it works
When version B converts better than version A, the gap could be a genuine improvement or it could be random luck. A significance test puts a number on that doubt. This calculator uses the two-proportion z-test with a pooled standard error, exactly as defined in the NIST/SEMATECH e-Handbook of Statistical Methods (§7.2.4). Every step runs client-side on plain arithmetic.
- Conversion rates. Control rate p_a = c_a / n_a and variant rate p_b = c_b / n_b, where c is conversions and n is visitors.
- Relative uplift. (p_b − p_a) / p_a, the headline "X% better" figure.
- Pooled proportion. p = (c_a + c_b) / (n_a + n_b). The test assumes, for argument's sake, that both versions share this rate.
- Pooled standard error. SE = √( p·(1 − p)·(1/n_a + 1/n_b) ).
- Test statistic. z = (p_b − p_a) / SE, which measures how many standard errors apart the two rates are.
- p-value. The normal CDF Φ(z) comes from the Abramowitz & Stegun 7.1.26 approximation of the error function (accurate to 1.5×10⁻⁷): Φ(z) = 0.5·(1 + erf(z/√2)). A two-tailed p-value is 2·(1 − Φ(|z|)); a one-tailed p-value is 1 − Φ(z) taken in the observed direction, that is, 1 − Φ(|z|).
- Confidence and verdict. Confidence is 1 − p-value (the p-value here, not the pooled proportion p above). The result is significant when the p-value falls below your alpha (1 − threshold): 0.10, 0.05, or 0.01.
- Confidence interval on the gap. Using the unpooled standard error (NIST §1.3.5.2), the interval is (p_b − p_a) ± z*·√( p_a(1−p_a)/n_a + p_b(1−p_b)/n_b ), with z* = 1.645 / 1.960 / 2.576 for 90 / 95 / 99%. If the interval excludes zero, the difference is statistically significant at that level.
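The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function names (erf_as, phi, ab_test) and the returned dictionary keys are assumptions made for this example, and it does no input validation (a control with zero conversions, or a pooled rate of 0 or 1, would divide by zero).

```python
import math

def erf_as(x: float) -> float:
    """Abramowitz & Stegun 7.1.26 rational approximation of erf(x)."""
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    p = 0.3275911
    a1, a2, a3 = 0.254829592, -0.284496736, 1.421413741
    a4, a5 = -1.453152027, 1.061405429
    t = 1.0 / (1.0 + p * x)
    # Horner evaluation of a1*t + a2*t^2 + ... + a5*t^5
    poly = ((((a5 * t + a4) * t + a3) * t + a2) * t + a1) * t
    return sign * (1.0 - poly * math.exp(-x * x))

def phi(z: float) -> float:
    """Standard normal CDF via the erf approximation."""
    return 0.5 * (1.0 + erf_as(z / math.sqrt(2.0)))

# Critical values quoted in the text for 90 / 95 / 99% intervals.
Z_STAR = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}

def ab_test(n_a, c_a, n_b, c_b, level=0.95, two_tailed=True):
    """Two-proportion z-test with pooled SE, plus an unpooled CI on the gap."""
    p_a = c_a / n_a
    p_b = c_b / n_b
    uplift = (p_b - p_a) / p_a
    pooled = (c_a + c_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    tail = 1.0 - phi(abs(z))          # one tail, in the observed direction
    p_value = 2 * tail if two_tailed else tail
    alpha = 1 - level
    se_unpooled = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    half_width = Z_STAR[level] * se_unpooled
    diff = p_b - p_a
    return {
        "p_a": p_a, "p_b": p_b, "uplift": uplift,
        "z": z, "p_value": p_value, "confidence": 1 - p_value,
        "significant": p_value < alpha,
        "ci": (diff - half_width, diff + half_width),
    }

result = ab_test(n_a=1000, c_a=100, n_b=1000, c_b=130)
print(result)
```

For these illustrative inputs (1,000 visitors per arm, 100 versus 130 conversions) the sketch gives z of roughly 2.10 and a two-tailed p-value of about 0.035, so the result clears the 0.05 threshold and the 95% interval on the gap excludes zero.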
To guard against arithmetic mistakes, each result is cross-checked against Pearson's chi-square test of independence on the same 2×2 table. The two are algebraically identical for a single comparison, so χ² must equal z² — the calculator confirms this on every run before showing you a verdict.
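A minimal sketch of such a cross-check is shown here, assuming Pearson's statistic is computed without a continuity correction (with Yates' correction the identity χ² = z² would no longer hold). The function names and the tolerance values are hypothetical choices for this example, not taken from the calculator.

```python
import math

def z_squared(n_a, c_a, n_b, c_b):
    """Square of the pooled two-proportion z statistic."""
    p_a, p_b = c_a / n_a, c_b / n_b
    pooled = (c_a + c_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return ((p_b - p_a) / se) ** 2

def pearson_chi_square(n_a, c_a, n_b, c_b):
    """Pearson chi-square on the 2x2 table, no continuity correction."""
    observed = [[c_a, n_a - c_a],      # control: converted, not converted
                [c_b, n_b - c_b]]      # variant: converted, not converted
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [c_a + c_b, total - (c_a + c_b)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

def cross_check(n_a, c_a, n_b, c_b, rel_tol=1e-9, abs_tol=1e-12):
    """True when chi-square and z^2 agree to within floating-point tolerance."""
    return math.isclose(z_squared(n_a, c_a, n_b, c_b),
                        pearson_chi_square(n_a, c_a, n_b, c_b),
                        rel_tol=rel_tol, abs_tol=abs_tol)

print(cross_check(1000, 100, 1000, 130))
```

The absolute tolerance is there for the case where both rates are equal and both statistics are zero, where a purely relative comparison would be fragile.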
Worked examples
Frequently asked questions
Sources & references
- NIST/SEMATECH e-Handbook §7.2.4 — Comparing two proportions (two-proportion z-test)
- NIST/SEMATECH e-Handbook §1.3.5.2 — Confidence limits for the mean
- NIST/SEMATECH e-Handbook §1.3.6.7.1 — Standard-normal critical (z) values
- Abramowitz & Stegun 7.1.26 — rational approximation of erf(x)
The formulas on this page were last cross-checked against the NIST e-Handbook on 2026-06-11, and every calculation is verified at runtime against Pearson's chi-square test (χ² = z²).