Levenshtein Distance — Compare Two Strings

Compute edit distance & similarity and see a highlighted visual diff. Private by design—everything runs locally.

Input & Options

A: 0 chars · B: 0 chars

Tips: Ctrl/Cmd + K focuses “A”. Ctrl/Cmd + Enter runs Compare.

Results

Distance:
Similarity:
Ops:
Lengths:
Deletion (A only) Insertion (B only) Substitution

A with changes highlighted

B with changes highlighted

Show aligned operations

    What is Levenshtein Distance?

    Levenshtein distance (also called edit distance) is the smallest number of single-character edits needed to turn one string into another. The allowed edits are insertion, deletion, and substitution. For example, turning kitten into sitting takes three edits (k→s, e→i, insert g), so the distance is 3.

    Why people use it

    • Spell checking & fuzzy search: find words that are “close enough” to a query.
    • Data cleaning: match near-duplicate names, addresses, or product titles.
    • Natural language & bioinformatics: compare tokens, lines, or genetic sequences.
    • UX matching: tolerate typos in forms and search boxes.

    Character vs. Word mode

    • Character mode: great for short strings, usernames, variable names, and typos.
    • Word mode: splits on whitespace and compares tokens, which is clearer for sentences and paragraphs. It answers “how many word-level changes are needed?”

    Similarity score

    We show a simple similarity percentage: 1 − (distance / max(length(A), length(B))). In word mode, “length” means word count.

    Case, Unicode, and language notes

    • Case sensitivity: toggle on/off. For case-insensitive matching, we compare lowercase forms but keep your original text for display.
    • Unicode: operates on code points. Some grapheme clusters (e.g., emoji + modifiers) may count as multiple edits.

    How it’s computed (light overview)

    We use dynamic programming over a (len(A)+1) × (len(B)+1) grid. The bottom-right cell is the distance; backtracking yields the sequence of edits we highlight.

    Related metrics

    • Damerau–Levenshtein: counts adjacent transpositions as one edit.
    • Jaro / Jaro–Winkler: emphasize matching characters and order.
    • Cosine / Jaccard on tokens: compare word sets or vectors for document similarity.

    Explore more tools