I am looking for a way to output the match percentage between two strings (e.g. names), while also taking into consideration that they might be the same but with the words in a different order. I tried using SequenceMatcher() from difflib, but the results are only partially satisfying:
from difflib import SequenceMatcher
a = "john doe"
b = "jon doe"
c = "doe john"
d = "jon d"
e = "john do"
s = SequenceMatcher(None, a, b)
s.ratio()
0.9333333333333333
s = SequenceMatcher(None, a, c)
s.ratio()
0.5
s = SequenceMatcher(None, a, d)
s.ratio()
0.7692307692307693
s = SequenceMatcher(None, a, e)
s.ratio()
0.9333333333333333
I am OK with all but the second result (a vs. c, 0.5). SequenceMatcher does not take into consideration that c contains the same words as a, just in a different order.
Is there any other way to match strings and obtain a higher matching percentage in the case I mentioned above? Note that names may contain more than two words.
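To make the desired behaviour concrete, here is a minimal sketch of the kind of result I am after, assuming it is acceptable to sort the words of each name before comparing them. The helper name token_sort_ratio is something I made up for illustration; it is not part of difflib.

```python
from difflib import SequenceMatcher

def token_sort_ratio(x, y):
    # Hypothetical helper (not a difflib function): lowercase each string,
    # split it on whitespace, sort the words and rejoin them, so that
    # word order no longer influences the score.
    def normalize(s):
        return " ".join(sorted(s.lower().split()))
    return SequenceMatcher(None, normalize(x), normalize(y)).ratio()

print(token_sort_ratio("john doe", "doe john"))  # 1.0, both normalize to "doe john"
print(token_sort_ratio("john doe", "jon doe"))
```

This works for any number of words, but I am not sure it is robust: a typo in the first letter of a word can change the sort order, so the score might still drop in that case.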
Thank you!