I am trying to learn and implement fuzzy matching in Python. I have two data sets, which I load into pandas as data frames. Set 1 is the reference set. Set 2 contains the names to match against the reference names.
I loop through the set_1 items to search for corresponding entries in the reference, but I get an error. I need some help with the error.
Is this a good way to structure the algorithm?
My attempt:
import pandas as pd
import fuzzywuzzy as fuzzy
from difflib import SequenceMatcher
set_1 = pd.read_csv("C:/Folder/file_1.csv")
set_2 = pd.read_csv("C:/Folder/file_2.csv")
query = set_1['name']
choices = set_2['name2']
for query in query:
    match = fuzzy.extractOne(query, choices=choices, scorer=scorer, score_cutoff=cutoff)
I get the following error:
AttributeError: module 'fuzzywuzzy' has no attribute 'extractOne'
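
The error message says that extractOne is not defined at the top level of the fuzzywuzzy package. In fuzzywuzzy it is provided by the process submodule, so `from fuzzywuzzy import process` followed by `process.extractOne(...)` is the likely fix. The loop itself can be sketched without any third-party dependency, as below. This is only an illustration under stated assumptions: `similarity` and `extract_one` are hypothetical helpers built on difflib's SequenceMatcher (not fuzzywuzzy's implementation), the sample names are made up, and the cutoff of 60 is arbitrary.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score, ignoring case (hypothetical helper)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def extract_one(query, choices, score_cutoff=0):
    """Return (best_choice, score), or None if no choice reaches score_cutoff.

    Hypothetical stand-in for fuzzywuzzy's process.extractOne, which plays
    a similar role when called as process.extractOne(query, choices, score_cutoff=...).
    """
    best = None
    for choice in choices:
        score = similarity(query, choice)
        if score >= score_cutoff and (best is None or score > best[1]):
            best = (choice, score)
    return best


# Made-up sample data standing in for set_1['name'] and set_2['name2'].
queries = ["Jon Smith", "Acme Corp.", "Zzz"]
choices = ["John Smith", "ACME Corporation", "Jane Doe"]

matches = {}
for name in queries:  # a distinct loop variable avoids shadowing the collection
    matches[name] = extract_one(name, choices, score_cutoff=60)

print(matches)
```

Using a separate loop variable (name rather than reusing query) keeps the original collection intact, and storing results in a dictionary keeps every match instead of overwriting match on each iteration. Queries with no candidate above the cutoff map to None.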