Why is this fuzz.ratio giving me 25 when none of the characters match?

Question

I'm trying to work through how fuzzywuzzy calculates this simple fuzz ratio:

print(fuzz.ratio("66155347", "12026599"))
25

Why is the fuzz ratio not 0, given that the two strings have a different character in every position?

The Levenshtein distance is 8 (because every value needs to be substituted), a is 8 (the length of string 1), and b is 8 (the length of string 2).

fuzz.ratio is (a+b - Levenshtein Distance)/(a+b)

fuzz.ratio is (8+8 - 8)/(8+8) = .50

fuzz.ratio is 50

There also must be something wrong with my math; I'm getting 50.

How does the fuzz ratio arrive at 25?

Any guidance would be appreciated.

Thanks

The [source code](https://github.com/ztane/python-Levenshtein/blob/811c050ab71593879804a61347352764837d000f/Levenshtein/_levenshtein.c#L760) for `ratio()` is available if you want to see for yourself how the ratio is calculated. According to its source code, the fuzzywuzzy library just multiplies the result by 100. – Random Davis, Oct 07 '20 at 20:25

score 4 · Accepted Answer · answered Oct 07 '20 at 20:27

The fuzzywuzzy library uses a weighted version of the Levenshtein distance that gives replacements a weight of 2 (insertions and deletions still cost 1). For these two strings, that brings the Levenshtein distance up to 12. Then (8 + 8 - 12) / (8 + 8) = 0.25, which is reported as 25.
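
To check the arithmetic by hand, here is a minimal sketch in plain Python of an edit distance where insertions and deletions cost 1 and the substitution cost is a parameter. It only illustrates the weighting described above and is not the library's actual code; the names `weighted_levenshtein` and `ratio_percent` are made up for this example.

```python
def weighted_levenshtein(s1: str, s2: str, substitution_cost: int = 2) -> int:
    """Edit distance with unit-cost insert/delete and a configurable substitution cost."""
    # previous[j] holds the distance between the processed prefix of s1 and s2[:j].
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]  # turning s1[:i] into "" takes i deletions
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else substitution_cost
            current.append(min(
                previous[j] + 1,         # delete c1
                current[j - 1] + 1,      # insert c2
                previous[j - 1] + cost,  # match (free) or substitute
            ))
        previous = current
    return previous[-1]


def ratio_percent(s1: str, s2: str, substitution_cost: int = 2) -> int:
    """(a + b - distance) / (a + b), scaled to 0..100 (hypothetical helper)."""
    total = len(s1) + len(s2)
    if total == 0:
        return 100  # assumption for this sketch: two empty strings count as identical
    distance = weighted_levenshtein(s1, s2, substitution_cost)
    return round(100 * (total - distance) / total)


a, b = "66155347", "12026599"
print(weighted_levenshtein(a, b, 1), ratio_percent(a, b, 1))  # 8 50
print(weighted_levenshtein(a, b, 2), ratio_percent(a, b, 2))  # 12 25
```

With a substitution cost of 1 this reproduces the 50 from the question; with a cost of 2 it gives the 25 that `fuzz.ratio` printed.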

Johannes Riecken


Thanks for the reply. So if all characters are being replaced, how is that not 16 instead of 12 since there are 8 characters? – nopaynenogain Oct 07 '20 at 20:33

Because a smaller Levenshtein distance can be achieved by inserting and deleting fewer than all of the characters. Both strings contain a "6" and, somewhere after it, a "5", for example, so those two characters can stay in place. If the two strings shared no characters at all, like "01234567" and "abcdefgh", then the fuzz ratio would indeed be 0. – Johannes Riecken Oct 07 '20 at 20:51
Thanks for the additional context! I just tested this out with different numbers and got 0. – nopaynenogain Oct 07 '20 at 20:55
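
Another way to see where the 12 comes from: when a substitution costs the same as a deletion plus an insertion, the weighted distance equals len(a) + len(b) minus twice the length of the longest common subsequence (LCS). The sketch below computes that directly; `lcs_length` and `indel_distance` are illustrative names, not functions from fuzzywuzzy or python-Levenshtein.

```python
def lcs_length(s1: str, s2: str) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    previous = [0] * (len(s2) + 1)
    for c1 in s1:
        current = [0]
        for j, c2 in enumerate(s2, start=1):
            if c1 == c2:
                current.append(previous[j - 1] + 1)  # extend a common subsequence
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def indel_distance(s1: str, s2: str) -> int:
    """Characters outside the LCS are deleted from s1 or inserted from s2."""
    return len(s1) + len(s2) - 2 * lcs_length(s1, s2)


print(lcs_length("66155347", "12026599"))      # 2, e.g. "6" followed by "5"
print(indel_distance("66155347", "12026599"))  # 12
print(indel_distance("01234567", "abcdefgh"))  # 16, so the ratio is 0
```

Only two characters can be kept in order, so 8 + 8 - 2 * 2 = 12 operations remain, which matches the distance in the answer.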
