I have a list of strings that I am trying to match against the values in a DataFrame column. If the best match scores below 95, I want to keep the current column value; if it scores 95 or above, I want to return the best fuzzy match from the list. All returned values should go into a new column. I keep getting the error "tuple index out of range". I think this may be because the call wants to return a tuple with the name and the score, but I only want the name. Here is my current code:
from fuzzywuzzy import process
from fuzzywuzzy import fuzz
L = ['ducks', 'frogs', 'doggies']
df
FOO  PETS
a    duckz
b    frags
c    doggies
def fuzz_m(column, pet_list, score_t):
    for c in column:
        new_name, score = process.extractOne(c, pet_list, score_t)
        if score<95:
            return c
        else:
            return new_name
df['NEW_PETS'] = fuzz_m(df,L, fuzz.ratio)
Desired output:
FOO  PETS     NEW_PETS
a    duckz    ducks
b    frags    frogs
c    doggies  doggies
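
For reference, here is a minimal sketch of the per-value logic described above, not a definitive fix. It makes several assumptions: the standard library's difflib.SequenceMatcher stands in for fuzz.ratio so that the snippet runs without extra packages, the helper names similarity and best_match_or_self are hypothetical, and the threshold is set to 80 rather than 95 because one-letter typos such as "duckz" score about 80 on this measure and would be left unchanged at 95. The sketch also produces one result per input value instead of returning from inside the loop.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score (assumed stand-in for fuzz.ratio)."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def best_match_or_self(value, choices, threshold):
    """Return the closest choice if its score reaches threshold, else value."""
    best = max(choices, key=lambda choice: similarity(value, choice))
    return best if similarity(value, best) >= threshold else value


pet_list = ["ducks", "frogs", "doggies"]
pets = ["duckz", "frags", "doggies"]

# Build one result per input value rather than returning on the first one.
new_pets = [best_match_or_self(p, pet_list, threshold=80) for p in pets]
print(new_pets)
```

With pandas, the same helper could presumably be applied to the column itself rather than the whole DataFrame, for example df['NEW_PETS'] = df['PETS'].apply(lambda v: best_match_or_self(v, L, 80)). If fuzzywuzzy is kept, note that the third positional parameter of process.extractOne is processor, so the scorer would likely need to be passed by keyword, as in process.extractOne(c, pet_list, scorer=fuzz.ratio).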