Get the list of matching tokens from Fuzzywuzzy

Question

I am using fuzzywuzzy's token_set_ratio to match two strings. I want to know which tokens matched. Is there a function in fuzzywuzzy to do so?

String1 = "this is a banana tree"
String2 = "there is banana tree next to my house"

The token_set_ratio in this case is 85.

The matching tokens would be banana, tree, and is. I want these as a list.

I want the output to be [banana,tree,is]

Can you please provide example input and output? – Vinod Sawant, Dec 10 '19 at 14:16
@VinodSawant Hi, I have updated the question. – Sid, Dec 10 '19 at 14:22

Vinod Sawant · Accepted Answer · 2019-12-11T03:24:20.967

Code:

from fuzzywuzzy import fuzz, process

s1 = "this is a banana tree"
s2 = "there is banana tree next to my house"

# Split both strings into single-word tokens
tokens1 = s1.split()
tokens2 = s2.split()

matched_tokens = []
for token in tokens1:
    # Score this token against the tokens of s2 (extract returns the top 5 by default)
    matches = process.extract(token, tokens2, scorer=fuzz.token_sort_ratio)
    for candidate, score in matches:
        # Keep only near-exact matches
        if score > 85:
            matched_tokens.append(candidate)

Output:

matched_tokens
Out[24]: ['is', 'banana', 'tree']
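
If only exact token matches are needed, a plain set intersection may be enough and avoids scoring every pair of tokens. The following is a minimal sketch, not a fuzzywuzzy API: `matching_tokens` is a hypothetical helper name, and it assumes that lowercasing and splitting on whitespace is adequate preprocessing (fuzzywuzzy's own preprocessing also strips punctuation, which this sketch does not do).

```python
def matching_tokens(s1, s2):
    """Return the tokens of s1 that also occur in s2, in s1's order, without duplicates."""
    # Assumption: tokens are whitespace-separated and compared case-insensitively
    tokens2 = set(s2.lower().split())
    seen = set()
    result = []
    for token in s1.lower().split():
        if token in tokens2 and token not in seen:
            seen.add(token)
            result.append(token)
    return result


s1 = "this is a banana tree"
s2 = "there is banana tree next to my house"
print(matching_tokens(s1, s2))  # ['is', 'banana', 'tree']
```

Unlike the loop with a score threshold, this will not catch near matches such as "bananas" against "banana", so it fits best when the tokens are expected to match exactly.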
