What is the Hamming distance between two strings?

It is the number of positions at which the two strings have different symbols, counted over equal-length strings. For example, "karolin" and "kathrin" differ at positions 3, 4, and 5, so their Hamming distance is 3. Each differing position adds exactly 1; matching positions add nothing.

How do you calculate Hamming distance by hand?

Line the two equal-length inputs up character by character, then walk left to right comparing each pair. Put a mark under every column where the symbols differ, and count the marks. That total is the Hamming distance. There is no weighting and no rounding — it is a plain count of mismatched positions.

What is the difference between Hamming distance and Levenshtein distance?

Hamming distance only allows substitutions and needs both inputs to be the same length, so it counts mismatched positions. Levenshtein (edit) distance also allows insertions and deletions, so it handles inputs of different lengths and is usually smaller or equal. Use Hamming for fixed-length codewords or vectors, and Levenshtein for free-form text of differing length.

Can Hamming distance be used on strings of different lengths?

No. The Hamming distance is only defined for equal-length sequences (Hamming, 1950), so this tool returns an error instead of silently padding or truncating. If your inputs are different lengths, use an edit-distance metric such as Levenshtein distance, which is linked in the related tools below.

What is normalized Hamming distance (Hamming loss)?

The normalized Hamming distance, also called Hamming loss, is the raw distance divided by the length — the fraction of positions that differ, a value between 0 and 1. For "1011101" vs "1001001" it is 2 ÷ 7 ≈ 0.2857. This is the definition used by scikit-learn's hamming_loss and SciPy's distance.hamming.

What is Hamming similarity?

Hamming similarity is 1 minus the normalized Hamming distance, shown here as a percentage. If two 7-bit codewords differ in 2 positions, the normalized distance is 2/7 ≈ 0.2857 and the similarity is about 71.43%. It answers "what fraction of positions agree?" rather than "how many differ?".

Where is Hamming distance used?

It underpins error-detecting and error-correcting codes (a code that detects up to k errors needs a minimum Hamming distance of k+1), measures the avalanche effect in cryptographic hashes, compares binary feature vectors and image hashes, and serves as a loss for multi-label classification. It is one of the oldest and most widely used distance metrics in computing.

Does this calculator handle binary, text, and numeric inputs?

Yes. Binary mode treats each 0/1 character as a position and rejects any other digit. Text mode compares Unicode characters and offers case-insensitive folding. Numeric-vector mode splits on a comma or space and compares the parsed numbers, so 2 and 2.0 count as equal. In every mode the two inputs must have the same length.

Strings · Coding theory

Hamming Distance Calculator

Find the Hamming distance between two equal-length inputs — binary codewords, text strings, or numeric vectors — as the count of positions that differ. See the normalised Hamming loss, the similarity, and exactly which positions mismatch. Runs entirely in your browser, no signup.

By Induwara Ashinsana— Executive Director, Ryzera TechnologiesUpdated Jun 10, 2026

Hamming distance

Cross-checked ✓

Input A7 / 5,000

Input B7 / 5,000

Examples

Hamming distance

out of 7 bits

Normalised (loss)

0.2857

distance ÷ length

Similarity

71.43%

1 − normalised

Length

bits per input

Position-by-position

011^

100^

210^

311^

410^

500^

611^

Differing positions

0-indexed: 2, 4

1-indexed: 3, 5

Sources cited: Hamming (1950), Error detecting and error correcting codes; scikit-learn hamming_loss; SciPy distance.hamming. Full list with links in the references section below.

How it works

The Hamming distance between two sequences A and B of equal length n is the number of positions at which their symbols differ. It was introduced by Richard Hamming in his 1950 paper on error-detecting and error-correcting codes, and it is one of the building blocks of coding theory: a block code whose codewords are all at least a Hamming distance d apart can detect up to d − 1 errors and correct up to ⌊(d − 1) / 2⌋.

Parse each input into a list of symbols. Binary mode keeps each 0/1 character (and rejects anything else); text mode keeps each Unicode character, optionally lower-cased; vector mode splits on a comma or space and parses each element as a number, so 2 and 2.0 compare equal.
Length guard. If the two inputs parse to different lengths, the distance is undefined, so the tool stops and shows an error rather than padding the shorter one.
Count the mismatches with the core formula, where [A_i ≠ B_i] is 1 when the symbols at position i differ and 0 otherwise:
```
d_H(A, B) = Σ  [ A_i ≠ B_i ]   for i = 0 … n−1
```
Normalise. The normalised Hamming distance, or Hamming loss, is d_H / n— the fraction of positions that differ, between 0 and 1. This matches scikit-learn's hamming_loss and SciPy's distance.hamming.
Similarity is 1 − d_H / n, shown as a percentage — the share of positions that agree. When both inputs are empty the loss is defined as 0 and similarity as 100%, avoiding a division by zero.

The integer distance is exact with no rounding; only the normalised value and similarity are shown to a fixed number of decimals for display. Every result is independently cross-checked against a second, reduce-based implementation of the same count — if the two ever disagreed, the badge in the tool would flag it.

Worked examples

Binary codewords: 1011101 vs 1001001

distance = 2 · normalised ≈ 0.2857 · similarity ≈ 71.43%

Align the 7-bit words: 1 0 1 1 1 0 1 vs 1 0 0 1 0 0 1
Position 0: 1 = 1 .......... match
Position 1: 0 = 0 .......... match
Position 2: 1 ≠ 0 .......... differ ✗
Position 3: 1 = 1 .......... match
Position 4: 1 ≠ 0 .......... differ ✗
Positions 5–6: 0 = 0, 1 = 1 match
2 differing positions → d = 2. Loss = 2/7 ≈ 0.2857, similarity = (1 − 2/7) ≈ 71.43%

Text strings: karolin vs kathrin

distance = 3 · normalised ≈ 0.4286 · similarity ≈ 57.14%

Align the 7 characters: k a r o l i n vs k a t h r i n
Positions 0–1: k = k, a = a match
Position 2: r ≠ t .......... differ ✗
Position 3: o ≠ h .......... differ ✗
Position 4: l ≠ r .......... differ ✗
Positions 5–6: i = i, n = n match
3 differing positions → d = 3 (the textbook value). Loss = 3/7 ≈ 0.4286

Unequal length (edge case): 101 vs 1011

distance = undefined · the length guard fires

Input A has 3 bits; Input B has 4 bits.
Hamming distance is only defined for equal-length inputs (Hamming, 1950).
The tool stops and shows a clear error instead of padding B to length 3.
For different-length comparison, use the Levenshtein distance calculator instead.

Frequently asked questions

Sources & references

The definition, formula, and worked examples on this page were last cross-checked against these sources on 2026-06-10. Every distance is deterministic and verified against an independent second implementation on each calculation.

Related tools

LiveAI

Levenshtein Distance Calc

Computes the Levenshtein edit distance between two strings with a full Wagner–Fischer DP matrix and step-by-step edit operations, entirely in the browser.

Open tool

LiveAI

Euclidean Distance Calc

Compute the Euclidean (L2) distance between two points or two numeric vectors of any dimension, with the full per-dimension working. Also shows the squared Euclidean, Manhattan (L1), and Chebyshev (L∞) distances, and matches scikit-learn and NumPy — entirely in your browser.

Open tool

LiveAI

Cosine Similarity Calc

Compute the cosine similarity, cosine distance, and angle between two numeric vectors or two short texts, with the full dot-product and magnitude working. Matches scikit-learn, runs entirely in the browser.

Open tool

Rate this tool

Be the first to rate

Comments & feedback

Spotted a bug or want an improvement? Tell us — our team reviews every comment, and good ideas get built. Comments are public and anonymous.

Found a bug, edge case, or want to suggest an improvement?

Email me at [email protected] — most fixes ship within 24 hours.