induwara.lk

Pearson Correlation Coefficient Calculator

Paste two columns of numbers and get the Pearson correlation r, r², covariance, a significance test (t-statistic and two-tailed p-value), a scatter plot, and the full step-by-step working. Results are matched to scipy.stats.pearsonr, and everything runs in your browser: no signup, nothing uploaded.

By Induwara Ashinsana · Updated Jun 10, 2026
Pearson correlation calculator

X values: numbers separated by commas, spaces, or new lines. Paste two Excel columns here to fill both.

Y values: must have the same count as X, since each X needs a matching Y.

Pearson r: -0.9948 (range −1 to 1, n = 5)
r² (determination): 0.9897 (99.0% of variance explained)
Covariance: -4.2500 (Sxy / (n−1))
Strength: Very strong negative

Scatter plot

Scatter plot of Y against X with least-squares trend line (y = -1.700x + 11.700). Points: (1.0000, 10.0000), (2.0000, 8.0000), (3.0000, 7.0000), (4.0000, 5.0000), (5.0000, 3.0000).
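
The trend-line coefficients can be reproduced from the same five points. The following is a minimal Python sketch, assuming the ordinary least-squares formulas slope = Sxy / Sxx and intercept = ȳ − slope·x̄. The page does not spell these formulas out, and the function name `least_squares_line` is illustrative rather than the tool's own code.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length columns with at least 2 points")
    if min(xs) == max(xs):
        raise ValueError("X is constant, so the slope is undefined")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squared X deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line([1, 2, 3, 4, 5], [10, 8, 7, 5, 3])
print(f"y = {slope:.3f}x + {intercept:.3f}")  # y = -1.700x + 11.700
```

The slope reuses the same Sxy and Sxx sums that feed r, which is why the line and the correlation always share a sign.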

Significance test

t-statistic: -17.0000
Degrees of freedom: 3
p-value (two-tailed): 0.0004
At α = 0.05: Significant

Cross-check. The deviation-score formula gives r = -0.9948; the independent raw-score formula [n·Σxy − ΣxΣy] / √(…) gives -0.9948. They reconcile, as they must — and both match scipy.stats.pearsonr.

Step-by-step working

# | xᵢ | yᵢ | xᵢ−x̄ | yᵢ−ȳ | (xᵢ−x̄)(yᵢ−ȳ) | (xᵢ−x̄)² | (yᵢ−ȳ)²
1 | 1.0000 | 10.0000 | -2.0000 | 3.4000 | -6.8000 | 4.0000 | 11.5600
2 | 2.0000 | 8.0000 | -1.0000 | 1.4000 | -1.4000 | 1.0000 | 1.9600
3 | 3.0000 | 7.0000 | 0.0000 | 0.4000 | 0.0000 | 0.0000 | 0.1600
4 | 4.0000 | 5.0000 | 1.0000 | -1.6000 | -1.6000 | 1.0000 | 2.5600
5 | 5.0000 | 3.0000 | 2.0000 | -3.6000 | -7.2000 | 4.0000 | 12.9600
Σ | 15.0000 | 33.0000 | | | -17.0000 | 10.0000 | 29.2000
x̄ = 15.0000 / 5 = 3.0000 · ȳ = 33.0000 / 5 = 6.6000
SD(X) = 1.5811 · SD(Y) = 2.7019 (sample)
r = -17.0000 / √(10.0000 × 29.2000) = -0.9948
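
The covariance and standard deviations shown in the working give a second route to the same r, since r = cov(X, Y) / (SD(X)·SD(Y)) when all three use the sample (n−1) divisor. A small Python sketch of that relationship, using the standard-library `statistics` module; the variable names are illustrative only:

```python
import statistics

xs = [1, 2, 3, 4, 5]
ys = [10, 8, 7, 5, 3]
n = len(xs)

mean_x = statistics.mean(xs)
mean_y = statistics.mean(ys)

# Sample covariance: sum of cross-products divided by n - 1
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
cov_xy = sxy / (n - 1)

# Sample standard deviations (statistics.stdev also divides by n - 1)
sd_x = statistics.stdev(xs)
sd_y = statistics.stdev(ys)

r = cov_xy / (sd_x * sd_y)
print(f"cov = {cov_xy:.4f}, SD(X) = {sd_x:.4f}, SD(Y) = {sd_y:.4f}, r = {r:.4f}")
# cov = -4.2500, SD(X) = 1.5811, SD(Y) = 2.7019, r = -0.9948
```

Because the (n−1) factors cancel in the ratio, switching all three quantities to the population divisor n leaves r unchanged.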

Method: r = Σ(xᵢ−x̄)(yᵢ−ȳ) / √(Σ(xᵢ−x̄)²·Σ(yᵢ−ȳ)²); significance via t = r√(n−2)/√(1−r²) with df = n−2 — NIST e-Handbook §1.3.5.13, matched to scipy.stats.pearsonr. Nothing leaves this page.

How it works

The Pearson product-moment correlation coefficient r measures the strength and direction of the linear relationship between two paired variables. It runs from −1 (a perfect decreasing line), through 0 (no linear association), to +1 (a perfect increasing line). The definition is the one in the NIST/SEMATECH e-Handbook of Statistical Methods §1.3.5.13.

For n paired observations, with means x̄ = (Σxᵢ)/n and ȳ = (Σyᵢ)/n, the coefficient is the sum of cross-products of the deviations divided by the root of the product of the squared deviations:

r = Σ(xᵢ−x̄)(yᵢ−ȳ) / √( Σ(xᵢ−x̄)² · Σ(yᵢ−ȳ)² ) = Sxy / √(Sxx · Syy)

The tool computes this in four steps:

  1. Means and deviations. It averages each column, then subtracts the mean from every value to get the deviations (xᵢ−x̄) and (yᵢ−ȳ).
  2. Sums. It accumulates Sxy, Sxx, and Syy from the deviation table. If Sxx or Syy is zero — a constant column — r is undefined, so the tool shows a clear message instead of a divide-by-zero.
  3. r and r². The correlation is Sxy / √(Sxx·Syy), and r² (the coefficient of determination) is r squared — the share of variance in one variable explained by a linear fit on the other. Covariance is Sxy/(n−1) for a sample (or Sxy/n for a whole population); r itself is unaffected by that choice.
  4. Significance. Under the null hypothesis that the true correlation is zero, t = r√(n−2)/√(1−r²) follows a Student-t distribution with df = n−2. The two-tailed p-value is the regularized incomplete beta function I_x(df/2, 1/2) evaluated at x = df/(df+t²), an identity that is mathematically equivalent to the exact p-value scipy.stats.pearsonr reports.
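
The four steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the calculator's actual source: the names `reg_inc_beta` and `pearson_summary` are hypothetical, the incomplete beta function is evaluated with a textbook continued-fraction expansion (modified Lentz method), and the requirement of at least three pairs is an assumption made here so that df = n−2 is positive.

```python
import math


def _clamp(v, tiny=1e-300):
    """Keep a Lentz-iteration term away from zero to avoid division by zero."""
    return v if abs(v) > tiny else tiny


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction used by the incomplete beta function (modified Lentz method)."""
    c = 1.0
    d = 1.0 / _clamp(1.0 - (a + b) * x / (a + 1.0))
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered term of the continued fraction
        aa = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))
        d = 1.0 / _clamp(1.0 + aa * d)
        c = _clamp(1.0 + aa / c)
        h *= d * c
        # Odd-numbered term of the continued fraction
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))
        d = 1.0 / _clamp(1.0 + aa * d)
        c = _clamp(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break  # converged; otherwise the last estimate is returned
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where the fraction converges faster
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def pearson_summary(xs, ys):
    """Steps 1-4: means and deviations, sums, r and r², then t and the two-tailed p-value."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("X and Y must have the same count")
    if n < 3:
        raise ValueError("need at least 3 pairs so that df = n - 2 is positive")
    if min(xs) == max(xs) or min(ys) == max(ys):
        raise ValueError("r is undefined for a constant column")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = max(-1.0, min(1.0, sxy / math.sqrt(sxx * syy)))  # guard against rounding past ±1
    df = n - 2
    if abs(r) == 1.0:
        t, p = math.copysign(math.inf, r), 0.0  # perfect line: t is unbounded
    else:
        t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
        p = reg_inc_beta(df / 2.0, 0.5, df / (df + t * t))
    return {"r": r, "r2": r * r, "cov": sxy / (n - 1), "t": t, "df": df, "p": p}


print(pearson_summary([1, 2, 3, 4, 5], [10, 8, 7, 5, 3]))
```

The constant-column check compares the minimum and maximum of each column instead of testing Sxx or Syy against zero, because floating-point rounding of the mean can leave a tiny nonzero sum for a column of identical decimals.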

As a credibility check the calculator also recomputes r a second way — the raw-score formula [n·Σxy − ΣxΣy] / √(…) — and confirms both routes agree to floating-point precision, matching scipy.stats.pearsonr. Pearson's r assumes a roughly linear relationship; for ranked or monotonic-but-curved data, a rank correlation such as Spearman fits better. And a strong r is evidence of association, never of causation on its own.
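
The two-route cross-check described above can be sketched as follows. The page abbreviates the raw-score denominator as √(…), so the expanded form used here, √([n·Σx² − (Σx)²]·[n·Σy² − (Σy)²]), is the standard textbook version and an assumption about what the tool computes. The function names are illustrative.

```python
import math


def r_deviation(xs, ys):
    """Deviation-score route: Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def r_raw_score(xs, ys):
    """Raw-score route: works from plain sums, with no means or deviations."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


xs, ys = [1, 2, 3, 4, 5], [10, 8, 7, 5, 3]
a, b = r_deviation(xs, ys), r_raw_score(xs, ys)
print(f"{a:.4f} {b:.4f} agree={math.isclose(a, b, rel_tol=1e-12)}")
```

The raw-score route can lose precision when values are large relative to their spread, which is one reason to treat it as a check on the deviation-score result and not as the primary computation.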

Worked examples

Classic positive — X = [1, 2, 3, 4, 5], Y = [2, 4, 5, 4, 5]

  1. Means: x̄ = 15/5 = 3, ȳ = 20/5 = 4
  2. Sxy = (−2)(−2)+(−1)(0)+(0)(1)+(1)(0)+(2)(1) = 4+0+0+0+2 = 6
  3. Sxx = 4+1+0+1+4 = 10; Syy = 4+0+1+0+1 = 6
  4. r = 6 / √(10·6) = 6/√60 = 0.7746; r² = 0.6000 (60% of variance)
  5. t = 0.7746·√3/√0.4 = 2.1213, df = 3, p = 0.1240 → not significant at α=0.05

Strong negative — study hours X = [1, 2, 3, 4, 5] vs exam errors Y = [10, 8, 7, 5, 3]

  1. Means: x̄ = 3, ȳ = 33/5 = 6.6
  2. Deviations y: 3.4, 1.4, 0.4, −1.6, −3.6
  3. Sxy = −6.8−1.4+0−1.6−7.2 = −17; Sxx = 10; Syy = 29.2
  4. r = −17 / √(10·29.2) = −17/√292 = −0.9948; r² = 0.9897
  5. t = −17.0000, df = 3, p = 0.000443 → significant: more study hours are associated with fewer errors

Weak / not significant — X = [−2, −1, 0, 1, 2], Y = [0.5, 0.2, 0.0, 0.3, 0.9]

  1. Means: x̄ = 0, ȳ = 1.9/5 = 0.38
  2. Sxy = (−2)(0.12)+(−1)(−0.18)+0+(1)(−0.08)+(2)(0.52) = 0.90
  3. Sxx = 10; Syy = 0.468
  4. r = 0.90 / √(10·0.468) = 0.90/√4.68 = 0.4160; r² = 0.1731
  5. t = 0.7924, df = 3, p = 0.4860 → a weak hint, not significant at α=0.05
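
All three worked examples have n = 5, so df = 3, and for that one case the Student-t distribution has a closed-form tail probability: p = 1 − (2/π)·[arctan(a) + a/(1+a²)] with a = |t|/√3. The Python sketch below uses that shortcut to re-derive r, t, and p for each example. It is a check for df = 3 only, not a general p-value routine, and the helper names are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Deviation-score Pearson r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def two_tailed_p_df3(t):
    """Two-tailed p-value for a Student-t statistic with exactly 3 degrees of freedom."""
    a = abs(t) / math.sqrt(3)
    return 1.0 - (2.0 / math.pi) * (math.atan(a) + a / (1.0 + a * a))


def summarize(xs, ys):
    """Return (r, t, p) for a 5-pair sample, where df = n - 2 = 3."""
    if len(xs) != 5 or len(ys) != 5:
        raise ValueError("this shortcut only covers n = 5 (df = 3)")
    r = pearson_r(xs, ys)
    t = r * math.sqrt(3) / math.sqrt(1.0 - r * r)
    return r, t, two_tailed_p_df3(t)


examples = {
    "Classic positive": ([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]),
    "Strong negative": ([1, 2, 3, 4, 5], [10, 8, 7, 5, 3]),
    "Weak": ([-2, -1, 0, 1, 2], [0.5, 0.2, 0.0, 0.3, 0.9]),
}
for name, (xs, ys) in examples.items():
    r, t, p = summarize(xs, ys)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{name}: r = {r:.4f}, t = {t:.4f}, p = {p:.6f} ({verdict} at alpha = 0.05)")
```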

Sources & references

The formulas on this page were last cross-checked against these sources on 2026-06-10. Pearson's r is a stable mathematical definition, so this tool needs no rate or schedule updates — only the worked examples are periodically re-reconciled against SciPy.
