
Hi, please help. I am trying to fuzzy-merge two datasets with pandas and fuzzywuzzy, using two columns from each. I get a traceback at the line before the print call that says KeyError: ('name', 'lasntname'). I do not know whether I am referencing the columns incorrectly; I have tried both double brackets and parentheses, with no luck.

Here's the code:

import pandas as pd
from fuzzywuzzy import fuzz, process
from itertools import product

N = 80
names = {tup: fuzz.ratio(*tup) for tup in
         product(df1["Name"].tolist(), df2["name"].tolist())}

s1 = pd.Series(names)
s1 = s1[s1 > N]
s1 = s1[s1.groupby(level=0).idxmax()]

surnames = {tup: fuzz.ratio(*tup) for tup in
            product(df1["Last_name"].tolist(), df2["lasntname"].tolist())}

s2 = pd.Series(surnames)
s2 = s2[s2 > N]
s2 = s2[s2.groupby(level=0).idxmax()]

# map and fill nulls
df2["name"] = df2["name"].map(s1).fillna(df2["name"])
df2["lasntname"] = df2["lasntname"].map(s2).fillna(df2["lasntname"])

df = df1.merge(df2, on=["name", "lasntname"], how='outer')
print(df)
Lamo
  • Please create a minimal example, and do not post images of text. https://unix.meta.stackexchange.com/questions/4086/psa-please-dont-post-images-of-text – chrslg Oct 05 '22 at 12:39
  • In order for us to help you, you need to show your effort and provide data that can be used to reproduce your problem. An image is helpful, but it does not allow us to reproduce the issue. Please edit your question to include a minimal reproducible example. See [Minimal Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example "Minimal Reproducible Example") for details. – itprorh66 Oct 05 '22 at 13:55
  • df1 and df2 have not been defined, and the code has indentation errors as well. – HackerBoss Oct 05 '22 at 14:43

1 Answer


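Hi, just make your column names uniform across both tables and it should work. Merging with on=["name", "lasntname"] needs those columns to exist in both DataFrames, but df1 uses "Name" and "Last_name" while df2 uses "name" and "lasntname".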
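
As a rough sketch of what that could look like (the sample rows below are made up for illustration; only the column names come from the question), you could either rename df2's columns to match df1 before merging, or keep both naming schemes and pass left_on/right_on instead of on:

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df1 = pd.DataFrame({"Name": ["Anna", "Boris"], "Last_name": ["Smith", "Jones"]})
df2 = pd.DataFrame({"name": ["Anna", "Carla"], "lasntname": ["Smith", "Diaz"]})

# Option 1: rename df2's columns to match df1, then merge on the shared names.
df2_renamed = df2.rename(columns={"name": "Name", "lasntname": "Last_name"})
merged = df1.merge(df2_renamed, on=["Name", "Last_name"], how="outer")

# Option 2: leave the column names alone and name the key columns per side.
merged_alt = df1.merge(
    df2,
    left_on=["Name", "Last_name"],
    right_on=["name", "lasntname"],
    how="outer",
)

print(merged)
print(merged_alt)
```

Option 1 gives a single pair of key columns in the result, while option 2 keeps all four columns, which may be handy if you want to see which side each row came from. This only addresses the column-name mismatch in the merge; the fuzzy-matching step before it is left as in the question.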