How do you calculate cosine similarity between two vectors?

Take the dot product of the two vectors — sum of aᵢ·bᵢ across all components — then divide by the product of their magnitudes, where each magnitude is the square root of its sum of squares. In symbols: cos(A,B) = (A·B) / (‖A‖·‖B‖). The result is between −1 and 1. For example, [1,2,3] and [2,4,6] give 28 / (√14·√56) = 1.
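As a rough illustration, the formula above can be written in a few lines of plain Python. The function name `cosine_similarity` is chosen here for the example and is not the calculator's own code; the sketch assumes both inputs have the same number of components and that neither is all zeros.

```python
import math

def cosine_similarity(a, b):
    """Dot product divided by the product of the two magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))       # sum of a_i * b_i
    norm_a = math.sqrt(sum(x * x for x in a))    # square root of A's sum of squares
    norm_b = math.sqrt(sum(y * y for y in b))    # square root of B's sum of squares
    return dot / (norm_a * norm_b)

# The example from the answer above: 28 / (sqrt(14) * sqrt(56))
print(round(cosine_similarity([1, 2, 3], [2, 4, 6]), 4))  # prints 1.0
```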

What is the difference between cosine similarity and cosine distance?

Cosine similarity measures how aligned two vectors are, from 1 (same direction) through 0 (orthogonal) to −1 (opposite). Cosine distance is simply 1 − similarity, so it runs from 0 for identical directions up to 2 for opposite ones. scikit-learn uses this same 1 − similarity convention. Use similarity when higher should mean more alike, and distance when lower should mean more alike.
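A small sketch of the 1 − similarity convention, assuming equal-length, non-zero input vectors. The helper name `cosine_distance` is illustrative only. The three sample pairs cover the same, orthogonal, and opposite directions, so the printed distances land near 0, 1, and 2.

```python
import math

def cosine_distance(a, b):
    """Cosine distance defined as 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

pairs = {
    "same direction": ([1, 2], [2, 4]),
    "orthogonal": ([1, 0], [0, 1]),
    "opposite": ([1, 2], [-1, -2]),
}
for label, (a, b) in pairs.items():
    # abs() only hides a possible "-0.0" caused by floating-point rounding
    print(label, abs(round(cosine_distance(a, b), 4)))
```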

Why is cosine similarity used for text and embeddings?

Cosine similarity ignores vector length and only compares direction, so a long document and a short one about the same topic still score as similar. Word-count and embedding vectors vary a lot in magnitude, and that magnitude usually reflects length or frequency rather than meaning. Comparing direction instead keeps the focus on what the vectors are about, which is why search and recommendation systems rely on it.

Can cosine similarity be negative?

Yes, for vectors that can have negative components — such as word embeddings. A negative cosine means the vectors point in broadly opposite directions, with −1 being exactly opposite. For raw term-frequency vectors, where every count is zero or positive, cosine similarity can never be negative: it stays between 0 and 1, because two non-negative vectors can at most be orthogonal, never opposed.

What does a cosine similarity of 0.8 mean?

It means the two vectors are highly aligned: the angle between them is arccos(0.8) ≈ 36.9°, well under a right angle. In a retrieval system, 0.8 usually marks a strong match — the items are about the same thing. There is no universal cut-off, though; sensible thresholds depend on your embedding model and data, so calibrate them on examples you have already judged by hand.
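To see how a score maps to an angle, the conversion can be sketched with Python's standard `math` module. The function name `similarity_to_degrees` is made up for this example.

```python
import math

def similarity_to_degrees(similarity):
    """Convert a cosine similarity in [-1, 1] to an angle in degrees."""
    return math.degrees(math.acos(similarity))

for score in (1.0, 0.8, 0.0, -1.0):
    print(score, round(similarity_to_degrees(score), 2))
```

A score of 0.8 comes out at roughly 36.87°, which rounds to the 36.9° quoted above.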

What does a cosine similarity of 0 mean?

Zero means the vectors are orthogonal — at a 90° angle — with no shared direction. For text, that happens when two documents have no words in common: every term that appears in one is absent from the other, so the dot product is 0. Orthogonal vectors are treated as unrelated, which is the neutral midpoint between perfectly similar (1) and perfectly opposite (−1).

Why must the two vectors have the same length (number of components)?

The dot product pairs up components position by position — a₁ with b₁, a₂ with b₂, and so on — so both vectors need the same number of components for the pairing to make sense. Comparing a 3-dimensional vector with a 4-dimensional one is undefined. In text mode this is handled automatically: both texts are projected onto one shared vocabulary, so the two term-frequency vectors always match in length.

Does this calculator send my data anywhere?

No. Parsing your numbers or text, summing the products, taking the square roots and the arccos — all of it runs in your browser with plain JavaScript. Nothing is uploaded, logged, or stored, so you can paste real embedding values or private text without concern. The page also keeps working offline once it has loaded.


Cosine Similarity Calculator

Find the cosine similarity between two vectors or two texts, in your browser. See the similarity score, the cosine distance, the angle in degrees, and the full dot-product and magnitude working behind every result. No signup, nothing uploaded.

By Induwara Ashinsana, Executive Director, Ryzera Technologies. Updated Jun 10, 2026

Cosine similarity calculator

Vector A: numbers separated by commas, spaces, or new lines.

Vector B: must have the same number of components as A.

Cosine similarity: 1.0000 (range −1 to 1, 3-D)

Cosine distance: 0.0000 (1 − similarity)

Angle: 0.00° (0.0000 rad)

Interpretation: Identical direction

Cross-check. The quotient form (A·B)/(‖A‖‖B‖) gives 1.0000; the independent unit-vector form Â·B̂ — how scikit-learn computes it — gives 1.0000. They reconcile, as they must.

Step-by-step working

Component	aᵢ	bᵢ	aᵢ·bᵢ	aᵢ²	bᵢ²
#1	1.0000	2.0000	2.0000	1.0000	4.0000
#2	2.0000	4.0000	8.0000	4.0000	16.0000
#3	3.0000	6.0000	18.0000	9.0000	36.0000
Totals			28.0000	14.0000	56.0000

‖A‖ = √14.0000 = 3.7417

‖B‖ = √56.0000 = 7.4833

cos = 28.0000 / (3.7417 × 7.4833) = 1.0000

Method: cos(A,B) = (A·B) / (‖A‖·‖B‖); distance = 1 − cos; angle = arccos(cos). Sources: scikit-learn cosine_similarity and Manning, Raghavan & Schütze, Introduction to Information Retrieval, Ch. 6. Nothing leaves this page.

How it works

Cosine similarity measures the angle between two vectors while ignoring how long they are. Two vectors that point the same way score 1, vectors at right angles score 0, and vectors pointing in opposite directions score −1. The definition is the one used by scikit-learn and by Manning, Raghavan & Schütze's Introduction to Information Retrieval, Chapter 6.

For two equal-length vectors A = [a₁…aₙ] and B = [b₁…bₙ], the similarity is the dot product divided by the product of the two magnitudes:

cos(A, B) = (A · B) / (‖A‖ · ‖B‖) = Σ aᵢbᵢ / ( √(Σ aᵢ²) · √(Σ bᵢ²) )

The tool computes this in four steps:

Dot product. Multiply the vectors component by component and add the results: A · B = Σ aᵢbᵢ.
Magnitudes. Take the square root of each vector's sum of squares: ‖A‖ = √(Σ aᵢ²) and likewise for ‖B‖.
Divide. The similarity is (A·B) / (‖A‖·‖B‖). If either magnitude is zero the vector has no direction, so the result is undefined and the tool shows a clear message instead of a divide-by-zero.
Derive distance and angle. Cosine distance is 1 − similarity, and the angle is arccos(similarity) in degrees, clamping the input to [−1, 1] first to guard against floating-point drift.
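
The four steps can be sketched in Python as follows. This is an illustrative version and not the page's actual JavaScript: the function name `cosine_report`, the error messages, and the choice to raise `ValueError` where the tool shows a message are all assumptions made for the example.

```python
import math

def cosine_report(a, b):
    """Return (similarity, distance, angle in degrees) for two numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")

    # Step 1: dot product, multiplied component by component and summed
    dot = sum(x * y for x, y in zip(a, b))

    # Step 2: magnitudes, the square root of each sum of squares
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))

    # Step 3: divide, refusing zero vectors because they have no direction
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    similarity = dot / (norm_a * norm_b)

    # Step 4: distance, then the angle with the arccos input clamped to [-1, 1]
    distance = 1.0 - similarity
    clamped = max(-1.0, min(1.0, similarity))
    angle = math.degrees(math.acos(clamped))
    return similarity, distance, angle

print(cosine_report([1, 0], [0, 1]))  # prints (0.0, 1.0, 90.0)
```

The clamp matters because rounding can push a quotient slightly above 1 or below −1, and `math.acos` raises an error for inputs outside that range.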

Text mode applies the vector-space model from the same IR textbook. Each text is tokenised into words, a shared vocabulary is built from both texts, and every text becomes a raw term-frequency vector — one count per vocabulary word — over that shared vocabulary. Those two vectors are then fed into exactly the steps above. Because both texts are projected onto the same vocabulary, their vectors are always the same length. This version uses raw counts, not TF-IDF weighting, so the working stays transparent; for model-based semantic comparison, the embedding tools linked below go further. As a credibility check, the calculator also computes the similarity a second way — by normalising each vector to unit length and taking the dot product, which is how scikit-learn implements it internally — and confirms the two routes agree.
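
A minimal sketch of text mode and the two-route cross-check, written in Python for illustration. The tokeniser here (lower-casing and splitting on runs of word characters) is an assumption, since the page does not spell out its exact tokenisation rules, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

def term_frequency_vectors(text_a, text_b):
    """Build raw term-frequency vectors over the shared vocabulary of both texts."""
    tokens_a = re.findall(r"\w+", text_a.lower())   # assumed tokenisation rule
    tokens_b = re.findall(r"\w+", text_b.lower())
    vocabulary = list(dict.fromkeys(tokens_a + tokens_b))  # unique words, first-seen order
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    vec_a = [counts_a[word] for word in vocabulary]  # a missing word counts as 0
    vec_b = [counts_b[word] for word in vocabulary]
    return vocabulary, vec_a, vec_b

def magnitude(v):
    return math.sqrt(sum(x * x for x in v))

def quotient_form(a, b):
    """(A . B) / (|A| * |B|)"""
    return sum(x * y for x, y in zip(a, b)) / (magnitude(a) * magnitude(b))

def unit_vector_form(a, b):
    """Normalise each vector to unit length first, then take the dot product."""
    norm_a, norm_b = magnitude(a), magnitude(b)
    return sum((x / norm_a) * (y / norm_b) for x, y in zip(a, b))

vocabulary, tf_a, tf_b = term_frequency_vectors("the cat sat", "the cat ran")
print(vocabulary)  # ['the', 'cat', 'sat', 'ran']
print(tf_a, tf_b)  # [1, 1, 1, 0] [1, 1, 0, 1]
print(round(quotient_form(tf_a, tf_b), 4), round(unit_vector_form(tf_a, tf_b), 4))
```

Both routes should agree to within floating-point rounding; for this pair they come out at about 0.6667.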

Worked examples

Parallel vectors — A = [1, 2, 3], B = [2, 4, 6] (B = 2·A)

Dot product: 1·2 + 2·4 + 3·6 = 2 + 8 + 18 = 28
‖A‖ = √(1 + 4 + 9) = √14 = 3.741657
‖B‖ = √(4 + 16 + 36) = √56 = 7.483315
cos = 28 / (3.741657 × 7.483315) = 28 / 28 = 1.0000
distance = 0.0000, angle = arccos(1) = 0.00° → Identical direction

Orthogonal vectors — A = [1, 0], B = [0, 1]

Dot product: 1·0 + 0·1 = 0
‖A‖ = √1 = 1, ‖B‖ = √1 = 1
cos = 0 / (1 × 1) = 0.0000
distance = 1.0000, angle = arccos(0) = 90.00°
No shared direction → Unrelated / orthogonal

Text mode — “the cat sat” vs “the cat ran” (case-insensitive)

Vocabulary = [the, cat, sat, ran]
TF A = [1, 1, 1, 0], TF B = [1, 1, 0, 1]
Dot product: 1·1 + 1·1 + 1·0 + 0·1 = 2
‖A‖ = √3 = 1.732051, ‖B‖ = √3 = 1.732051
cos = 2 / 3 = 0.6667, angle = arccos(2/3) = 48.19° → Moderately similar

Frequently asked questions

Sources & references

The formulas on this page were last cross-checked against these sources on 2026-06-10. Cosine similarity is a stable mathematical definition, so this tool needs no rate or schedule updates — only the worked examples are periodically re-reconciled.

Related tools

Euclidean Distance Calc

Compute the Euclidean (L2) distance between two points or two numeric vectors of any dimension, with the full per-dimension working. Also shows the squared Euclidean, Manhattan (L1), and Chebyshev (L∞) distances, and matches scikit-learn and NumPy, entirely in your browser.

Levenshtein Distance Calc

Computes the Levenshtein edit distance between two strings with a full Wagner–Fischer DP matrix and step-by-step edit operations, entirely in the browser.

Hamming Distance Calc

Compute the Hamming distance (the count of positions that differ) between two equal-length binary strings, text strings, or numeric vectors. Also shows the normalised Hamming loss, similarity, and exactly which positions mismatch, entirely in your browser.
