I am using fuzzywuzzy's token_set_ratio to match two strings. I want to know which tokens matched. Is there a function in fuzzywuzzy that returns them?

String1 = "this is a banana tree"
String2 = "there is banana tree next to my house"

The token_set_ratio in this case is 85.

The matching tokens would be banana, tree, and is. I want them as a list, so the output should be [banana, tree, is].

Sid

1 Answer

Code:

from fuzzywuzzy import fuzz, process

s1 = "this is a banana tree"
s2 = "there is banana tree next to my house"

# Split both strings into single-word tokens.
tokens1 = s1.split()
tokens2 = s2.split()

# For each token of s1, keep the tokens of s2 that score above 85.
matched_tokens = []
for token in tokens1:
    matches = process.extract(token, tokens2, scorer=fuzz.token_sort_ratio)
    for candidate, score in matches:
        if score > 85:
            matched_tokens.append(candidate)

print(matched_tokens)

Output:

['is', 'banana', 'tree']
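
If only exact token overlap is needed, a plain set intersection may be enough. As far as I know, fuzzywuzzy does not expose a public function that returns the shared tokens, although token_set_ratio computes such an intersection internally. The sketch below uses only the standard library; the helper name common_tokens and the regex-based tokenization are my own assumptions, not part of fuzzywuzzy, and it will not catch near-matches such as typos.

```python
import re
from typing import List


def common_tokens(a: str, b: str) -> List[str]:
    """Return the tokens found in both strings, in the order they appear in `a`.

    Hypothetical helper, not a fuzzywuzzy function. Tokenization here is
    lowercase plus runs of word characters, which is an assumption.
    """
    tokens_a = re.findall(r"\w+", a.lower())
    tokens_b = set(re.findall(r"\w+", b.lower()))

    seen = set()
    shared = []
    for token in tokens_a:
        # Keep each shared token once, preserving first-seen order.
        if token in tokens_b and token not in seen:
            seen.add(token)
            shared.append(token)
    return shared


s1 = "this is a banana tree"
s2 = "there is banana tree next to my house"
print(common_tokens(s1, s2))  # ['is', 'banana', 'tree']
```

Use the fuzzy loop above when tokens may differ slightly, and the exact intersection when the words are expected to match exactly.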
Vinod Sawant