# Jaro-Winkler vs Levenshtein Distance

Different Approaches to Name Matching

# Fuzzy Name Matching

This is one of the most commonly used approaches. The basic idea behind fuzzy matching is to measure the edit distance between two strings: what does it take to convert source string A into destination string B?

There are many approaches to fuzzy matching, but the two most common are Jaro-Winkler distance and Levenshtein distance. Let us understand how each of them works.

Here is the more formal definition of the Jaro-Winkler distance from Wikipedia:

The Jaro–Winkler distance is a string metric measuring an edit distance between two sequences.

The lower the Jaro–Winkler distance for two strings is, the more similar the strings are. The score is normalized such that 0 means an exact match and 1 means there is no similarity. The Jaro–Winkler similarity is the inversion, (1 − Jaro–Winkler distance).

The formula to calculate the Jaro-Winkler similarity is:
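In its standard form (the symbols below follow common notation and are not taken from this article), the Jaro similarity of two strings $s_1$ and $s_2$ is:

$$
sim_j =
\begin{cases}
0 & \text{if } m = 0 \\
\dfrac{1}{3}\left(\dfrac{m}{|s_1|} + \dfrac{m}{|s_2|} + \dfrac{m - t}{m}\right) & \text{otherwise}
\end{cases}
$$

Here $m$ is the number of matching characters (equal characters no more than $\lfloor \max(|s_1|, |s_2|) / 2 \rfloor - 1$ positions apart) and $t$ is the number of transpositions (half the number of matching characters that appear in a different order). The Winkler modification then rewards a shared prefix:

$$
sim_w = sim_j + \ell \, p \, (1 - sim_j)
$$

where $\ell$ is the length of the common prefix (capped at 4) and $p$ is a scaling factor, commonly set to 0.1. The Jaro-Winkler distance is $d_w = 1 - sim_w$.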

Let us understand this with an example.
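Take the pair "MARTHA" and "MARHTA". All 6 characters match (m = 6), and "T" and "H" are swapped, which counts as one transposition (t = 1). The Jaro similarity is therefore (6/6 + 6/6 + 5/6) / 3 ≈ 0.9444. The strings share the 3-character prefix "MAR", so with a scaling factor of 0.1 the Jaro-Winkler similarity is 0.9444 + 3 × 0.1 × (1 − 0.9444) ≈ 0.9611. The Python sketch below reproduces this arithmetic; the function names, the default `p=0.1`, and the prefix cap of 4 are illustrative assumptions, not a reference implementation.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Jaro similarity: 1.0 for identical strings, 0.0 for no matches."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Two equal characters count as a match only within this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    m = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                m += 1
                break
    if m == 0:
        return 0.0

    # Walk both strings' matched characters in order; each position where
    # they differ is half a transposition.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2

    return (m / len1 + m / len2 + (m - t) / m) / 3


def jaro_winkler_similarity(s1: str, s2: str, p: float = 0.1,
                            max_prefix: int = 4) -> float:
    """Boost the Jaro similarity for strings that share a common prefix."""
    sim = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return sim + prefix * p * (1 - sim)


if __name__ == "__main__":
    sim = jaro_winkler_similarity("MARTHA", "MARHTA")
    print(round(jaro_similarity("MARTHA", "MARHTA"), 4))  # 0.9444
    print(round(sim, 4))                                  # 0.9611
    print(round(1 - sim, 4))                              # distance: 0.0389
```

Because the prefix bonus only applies to the first few characters, this measure tends to favor names that start the same way, which is often a reasonable assumption for name matching.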

# Levenshtein Distance

Here is the formal definition of this algorithm from Wikipedia:

The Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.

The formula to calculate the Levenshtein distance is:
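In its usual recursive form (the notation is the common textbook one, not taken from this article), the Levenshtein distance between strings $a$ and $b$ is:

$$
\operatorname{lev}(a, b) =
\begin{cases}
|a| & \text{if } |b| = 0 \\
|b| & \text{if } |a| = 0 \\
\operatorname{lev}(\operatorname{tail}(a), \operatorname{tail}(b)) & \text{if } \operatorname{head}(a) = \operatorname{head}(b) \\
1 + \min
\begin{cases}
\operatorname{lev}(\operatorname{tail}(a), b) \\
\operatorname{lev}(a, \operatorname{tail}(b)) \\
\operatorname{lev}(\operatorname{tail}(a), \operatorname{tail}(b))
\end{cases}
& \text{otherwise}
\end{cases}
$$

Here $\operatorname{head}(x)$ is the first character of $x$ and $\operatorname{tail}(x)$ is $x$ without its first character. The three terms inside the minimum correspond to a deletion, an insertion, and a substitution.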

As always, formulas are complicated to interpret. Let us understand with an example.
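Consider turning "kitten" into "sitting": substitute "k" with "s", substitute "e" with "i", and insert "g" at the end. That is 3 single-character edits, and no shorter sequence exists, so the Levenshtein distance is 3. The Python sketch below is a direct, memoized translation of the recursive definition; the function name `lev` and the example words are illustrative choices.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def lev(a: str, b: str) -> int:
    """Levenshtein distance via the recursive definition (memoized)."""
    if not a:
        return len(b)          # insert every remaining character of b
    if not b:
        return len(a)          # delete every remaining character of a
    if a[0] == b[0]:
        return lev(a[1:], b[1:])   # first characters agree: no edit needed
    return 1 + min(
        lev(a[1:], b),         # delete a[0]
        lev(a, b[1:]),         # insert b[0]
        lev(a[1:], b[1:]),     # substitute a[0] with b[0]
    )


if __name__ == "__main__":
    print(lev("kitten", "sitting"))  # 3
    print(lev("MARTHA", "MARHTA"))   # 2
```

Note that the swapped "T" and "H" in "MARTHA" and "MARHTA" cost two substitutions here, whereas Jaro-Winkler treats them as a single transposition.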

# Dynamic Programming Approach

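The most common way to calculate the edit distance between two strings is dynamic programming: fill a table in which the cell at row i and column j holds the edit distance between the first i characters of the source string and the first j characters of the destination string. The bottom-right cell is then the distance between the full strings. You can watch this video to understand how it works. Below is the table depicting how it works.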
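As a minimal sketch of how such a table can be built, the Python function below fills it row by row. The names `levenshtein_table` and `print_table` and the example words are illustrative assumptions, not part of any library.

```python
def levenshtein_table(source: str, target: str) -> list[list[int]]:
    """Build the full dynamic-programming table for the edit distance."""
    rows, cols = len(source) + 1, len(target) + 1
    table = [[0] * cols for _ in range(rows)]

    # Converting a prefix to the empty string takes one deletion per
    # character; building a prefix from the empty string takes one
    # insertion per character.
    for i in range(rows):
        table[i][0] = i
    for j in range(cols):
        table[0][j] = j

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,         # deletion
                table[i][j - 1] + 1,         # insertion
                table[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return table


def print_table(source: str, target: str) -> None:
    """Print the table with the strings as row and column labels."""
    table = levenshtein_table(source, target)
    print("    " + " ".join(f"{ch:>2}" for ch in " " + target))
    for label, row in zip(" " + source, table):
        print(f"{label:>2}  " + " ".join(f"{v:>2}" for v in row))


if __name__ == "__main__":
    print_table("kitten", "sitting")
    print(levenshtein_table("kitten", "sitting")[-1][-1])  # 3
```

Each cell depends only on its left, upper, and upper-left neighbors, so the table needs time and memory proportional to the product of the two string lengths. Keeping only the previous row would cut the memory to the length of one string if the full table is not needed.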