induwara.lk

Remove Duplicate Lines from Any Text

Paste a list, log, or any block of text and strip out repeated lines in one click. Order-preserving by default, with toggles for case-insensitive comparison, whitespace trimming, and keep-first or keep-last selection. Every keystroke is processed locally in your browser.

By Induwara Ashinsana · Updated May 11, 2026
Runs entirely in your browser. Nothing is uploaded, logged, or stored.


Sources: line terminators follow the WHATWG definitions (LF, CR, CRLF); the splitter itself matches LF and CRLF. Case-insensitive comparison uses ECMA-262 String.prototype.toLowerCase, which applies the locale-independent Unicode lowercase mapping rather than case folding. The output uses CRLF if the input contains any CRLF, otherwise LF. Full citations in the Sources section below.

How it works

The tool is built around three standard-library primitives: String.prototype.split to break the input into lines, Set to remember which canonical keys have already been seen, and Array.prototype.join to stitch the kept lines back together. Nothing else is on the critical path — no third-party library, no network call, no server.

  1. Split. The input is split on the regular expression /\r?\n/, which matches LF (Unix/macOS), CRLF (Windows), and any mix of the two. A lone CR (classic Mac OS) does not match this pattern, so it is not treated as a line break. A single trailing newline is detected and stripped before splitting so a file like "a\nb\n" is reported as two lines, not three.
  2. Canonicalise. Each line is turned into a comparison key. If the Trim toggle is on, leading and trailing whitespace are removed from the key. If the Case-sensitive toggle is off, the key is lower-cased using ECMA-262 String.prototype.toLowerCase (the locale-independent Unicode lowercase mapping). The original line text is kept unchanged; only the key used for comparison is transformed.
  3. Walk. The lines are scanned once in the chosen direction (first to last for Keep first, last to first for Keep last). For each line the key is looked up in a Set; if absent, the key is added and the original line is recorded as kept; if present, the line is dropped. In either direction the kept lines are emitted in their original relative order. With Keep blanks enabled, blank lines short-circuit this check so they never participate in the set. Average-case time is O(n) in the number of lines; a two-million-character paste finishes in tens of milliseconds.
  4. Re-emit. The kept lines are joined with a single line terminator (CRLF if any CRLF was present in the original, otherwise LF), and a single trailing newline is re-added if the input had one. When no duplicates are found and the input uses one line-ending style throughout, the output is byte-identical to your input, which keeps diffs minimal for round-trip editing.
  5. Cross-check. A second algorithm — a frequency Map that counts each canonical key — computes the removed-count independently. The Verified badge stays green only when both algorithms agree. Two paths agreeing on every input is a strong signal that the output you see is correct.
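
The five steps above can be sketched in a few dozen lines. This is an illustrative reconstruction from the description, not the tool's actual source: the function name dedupeLines, the option names, and the definition of "blank" (empty after trimming) are assumptions.

```javascript
// Hypothetical sketch; option names mirror the toggles described above.
function dedupeLines(text, options = {}) {
  const {
    caseSensitive = true,
    trim = false,
    keepBlanks = false,
    keep = "first", // "first" or "last"
  } = options;

  // 1. Split: strip one trailing newline first so "a\nb\n" counts as 2 lines.
  const hadTrailingNewline = /\r?\n$/.test(text);
  const body = hadTrailingNewline ? text.replace(/\r?\n$/, "") : text;
  const lines = body.split(/\r?\n/);

  // 2. Canonicalise: only the comparison key is transformed, never the line.
  const keyOf = (line) => {
    let key = trim ? line.trim() : line;
    if (!caseSensitive) key = key.toLowerCase();
    return key;
  };
  // Assumption: a "blank" line is one that is empty after trimming.
  const isBlank = (line) => line.trim() === "";

  // 3. Walk once in the chosen direction, remembering keys in a Set.
  const order = lines.map((_, i) => i);
  if (keep === "last") order.reverse();
  const seen = new Set();
  const keptIndexes = [];
  for (const i of order) {
    if (keepBlanks && isBlank(lines[i])) {
      keptIndexes.push(i); // blank lines bypass the Set entirely
      continue;
    }
    const key = keyOf(lines[i]);
    if (seen.has(key)) continue;
    seen.add(key);
    keptIndexes.push(i);
  }
  keptIndexes.sort((a, b) => a - b); // restore original order after a reverse walk

  // 4. Re-emit: CRLF if the input contained any CRLF, otherwise LF.
  const eol = text.includes("\r\n") ? "\r\n" : "\n";
  const kept = keptIndexes.map((i) => lines[i]);
  const output = kept.join(eol) + (hadTrailingNewline ? eol : "");

  // 5. Cross-check: derive the removed count independently from a frequency Map.
  const counts = new Map();
  for (const line of lines) {
    if (keepBlanks && isBlank(line)) continue;
    const key = keyOf(line);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  let removedByCount = 0;
  for (const n of counts.values()) removedByCount += n - 1;

  const removed = lines.length - kept.length;
  return { output, kept, removed, verified: removed === removedByCount };
}

const result = dedupeLines("apple\nbanana\napple\ncherry\nbanana");
console.log(result.kept, result.removed, result.verified);
```

Collecting kept indexes and sorting them afterwards is one way to let a reverse walk still produce output in the original order; reversing the kept list would work equally well.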

One deliberate non-feature: the tool does not apply Unicode normalisation (NFC) before comparing. If you paste the letter "é" once as the precomposed code point U+00E9 and once as the decomposed pair U+0065 U+0301, the lines remain distinct. That mirrors what every text editor shows you and avoids silently collapsing intentionally different encodings, which would be the wrong default for source code and data processing tasks.
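
The difference is easy to see in plain JavaScript. The snippet below is a small illustration rather than part of the tool; it also shows String.prototype.normalize, which you could apply to your own text beforehand if you do want the two forms treated as one.

```javascript
const precomposed = "\u00E9"; // é as a single code point
const decomposed = "e\u0301"; // e followed by a combining acute accent

// Compared as-is, the two strings differ, so a Set keeps both lines.
const equalAsTyped = precomposed === decomposed; // false
const distinctLines = new Set([precomposed, decomposed]).size; // 2

// After NFC normalisation both become the single code point U+00E9.
const equalAfterNfc =
  precomposed.normalize("NFC") === decomposed.normalize("NFC"); // true
```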

One thing that is Unicode-aware: lowercasing works across scripts. Comparing "Café" against "café" with case-insensitive mode on correctly treats them as duplicates because toLowerCase handles the accented letter the same way it handles ASCII.
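
A short check of that behaviour, using a hypothetical helper sameIgnoringCase that compares lower-cased keys. The last case is a caveat worth knowing: lowercasing is not the same as full case folding, so "ß" and "SS" do not match.

```javascript
// Hypothetical helper: compare two lines by their lower-cased keys.
const sameIgnoringCase = (a, b) => a.toLowerCase() === b.toLowerCase();

const accented = sameIgnoringCase("Caf\u00E9", "caf\u00E9"); // Café vs café: true
const greek = sameIgnoringCase("\u0394", "\u03B4");          // Δ vs δ: true
const cyrillic = sameIgnoringCase("\u0416", "\u0436");       // Ж vs ж: true

// "Straße" lower-cases to "straße", "STRASSE" to "strasse": not equal.
const sharpS = sameIgnoringCase("Stra\u00DFe", "STRASSE");   // false
```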

Worked examples

Plain dedupe, keep-first

input = ["apple", "banana", "apple", "cherry", "banana"]

  1. Split on /\r?\n/ → 5 lines
  2. Walk left-to-right with a Set
  3. Pos 0 'apple' → keep, set = {apple}
  4. Pos 1 'banana' → keep, set = {apple, banana}
  5. Pos 2 'apple' → already seen → drop
  6. Pos 3 'cherry' → keep, set = {apple, banana, cherry}
  7. Pos 4 'banana' → already seen → drop
  8. Output = ['apple', 'banana', 'cherry'], removed = 2
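
The same trace as a minimal sketch, with keep-last added for contrast. The helper names keepFirst and keepLast are illustrative only.

```javascript
// Keep the first occurrence of each line, preserving order.
function keepFirst(lines) {
  const seen = new Set();
  return lines.filter((line) => {
    if (seen.has(line)) return false; // already seen: drop
    seen.add(line);
    return true; // first sighting: keep
  });
}

// Keep the last occurrence: walk backwards, then restore the original order.
function keepLast(lines) {
  return keepFirst([...lines].reverse()).reverse();
}

const fruit = ["apple", "banana", "apple", "cherry", "banana"];
const first = keepFirst(fruit); // ['apple', 'banana', 'cherry']
const last = keepLast(fruit);   // ['apple', 'cherry', 'banana']
```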

Case-insensitive with whitespace trim

input = [" Apple ", "apple", "APPLE", "Banana"], options = caseSensitive:off, trim:on

  1. Canonical keys after trim+lowercase = ['apple', 'apple', 'apple', 'banana']
  2. Pos 0 → key 'apple' new → keep (original spacing intact)
  3. Pos 1 → key 'apple' seen → drop
  4. Pos 2 → key 'apple' seen → drop
  5. Pos 3 → key 'banana' new → keep
  6. Output = [' Apple ', 'Banana'], removed = 2
  7. Note: the kept line keeps its original case and spacing.
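
A sketch of the key-versus-line distinction in this example, assuming a hypothetical canonicalKey helper that trims and lower-cases. The Set stores keys, while the output keeps the untouched original lines.

```javascript
// Hypothetical key function for caseSensitive:off, trim:on.
const canonicalKey = (line) => line.trim().toLowerCase();

const rawLines = [" Apple ", "apple", "APPLE", "Banana"];
const keys = rawLines.map(canonicalKey); // ['apple', 'apple', 'apple', 'banana']

const seenKeys = new Set();
const keptLines = rawLines.filter((line) => {
  const key = canonicalKey(line);
  if (seenKeys.has(key)) return false;
  seenKeys.add(key);
  return true; // the original line, spacing and case intact, is what survives
});
// keptLines is [' Apple ', 'Banana']
```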

Keep blanks (preserve paragraph breaks)

input = "a\n\nb\n\nc\n\n" with keepBlanks = on

  1. Split + trailing-newline strip → 6 lines: ['a','','b','','c','']
  2. Blank-line short-circuit: each blank passes through untouched
  3. Non-blanks 'a', 'b', 'c' are all unique → all kept
  4. Output = byte-identical to input, removed = 0
  5. With keepBlanks = off (default) the blanks dedupe too → output 'a\n\nb\nc\n', removed = 2
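
The blank-line short-circuit can be sketched as follows. The function name is illustrative, and treating a line as blank when it is empty after trimming is an assumption; the page does not say whether whitespace-only lines count.

```javascript
// Dedupe lines, optionally letting blank lines pass through untouched.
function dedupeWithBlanks(lines, keepBlanks) {
  const seen = new Set();
  return lines.filter((line) => {
    // Assumption: "blank" means empty after trimming.
    if (keepBlanks && line.trim() === "") return true;
    if (seen.has(line)) return false;
    seen.add(line);
    return true;
  });
}

const paragraphs = ["a", "", "b", "", "c", ""];
const withBlanks = dedupeWithBlanks(paragraphs, true);     // all six lines kept
const withoutBlanks = dedupeWithBlanks(paragraphs, false); // ['a', '', 'b', 'c']
```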

CRLF input — line endings preserved

input = "line1\r\nline2\r\nline1\r\nline3"

  1. Split tolerates \r\n and \n → 4 lines
  2. Detect: input contains CRLF → joiner = '\r\n'
  3. Keep-first walk drops the duplicate 'line1' at index 2
  4. Output = 'line1\r\nline2\r\nline3' (CRLF preserved)
  5. removed = 1, unique = 3
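
A compact sketch of the line-ending detection in this example. The detectEol name is hypothetical; the keep-first step relies on the fact that a JavaScript Set iterates in insertion order.

```javascript
// CRLF if the input contains any CRLF, otherwise LF.
function detectEol(text) {
  return text.includes("\r\n") ? "\r\n" : "\n";
}

const crlfInput = "line1\r\nline2\r\nline1\r\nline3";
const crlfLines = crlfInput.split(/\r?\n/); // 4 lines
const uniqueLines = [...new Set(crlfLines)]; // insertion order, so keep-first
const crlfOutput = uniqueLines.join(detectEol(crlfInput));
// crlfOutput is 'line1\r\nline2\r\nline3'
```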

Frequently asked questions

Sources & references

Found an edge case, an unexpected count, or want a new toggle?

Email me at [email protected] — most fixes ship within 24 hours.