We have two dataframes.
dataframe 1:
dataframe 2:
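We need to check whether the values in the data column of the first dataframe also appear (as a fuzzy match) in the combined column of the second dataframe and, where they match, add the id column from the first dataframe.
The expected output looks like: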
!pip install fuzzywuzzy

import pandas as pd
from fuzzywuzzy import fuzz

# 'dataframe1.csv' and 'dataframe2.csv' are placeholder file names
data = pd.read_csv('dataframe1.csv')  # dataframe 1: columns 'id' and 'data'
df = pd.read_csv('dataframe2.csv')    # dataframe 2: column 'combined'

word = data['data'].tolist()
find = df['combined'].tolist()
df_final = pd.DataFrame(columns=['combined', 'id'])

for j in find:
    j = str(j)
    for i in word:
        if i:
            i = str(i)
            Token_Sort_Ratio = fuzz.token_sort_ratio(j, i)
            if Token_Sort_Ratio > 70:
                # print(i)
                final = data[data.data == i]
                df1 = df[df.combined == j]
                # these two lines are meant to add the matched row to df_final
                df_final['id'] = df_final['id'].append(final['id'], ignore_index=True)
                df_final['combined'] = df_final['combined'].append(df1['combined'], ignore_index=True)
But the data is not appended to df_final. Kindly help me with this. After that, we plan to join df_final and dataframe 2 on the combined column.
Please feel free to suggest any other solution apart from this.
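
One possible way around the append problem, as a minimal sketch rather than a definitive fix: collect the matches in a plain Python list, build df_final once after the loop, and then do the planned join on the combined column. Assigning a longer Series back to a column of df_final is aligned on the frame's existing index, which is a likely reason the rows never show up, and Series.append was removed in pandas 2.0. The function name match_ids and the parameters scorer and threshold are hypothetical names introduced for illustration. The sketch assumes dataframe 1 has the columns id and data, and dataframe 2 has a string column combined.

```python
import pandas as pd


def match_ids(data, df, scorer, threshold=70):
    """Return df with an 'id' column taken from fuzzy-matched rows of data.

    scorer is any function (str, str) -> score between 0 and 100,
    for example fuzz.token_sort_ratio from fuzzywuzzy.
    """
    rows = []  # matches are collected here instead of appending to a DataFrame

    # Compare every distinct 'combined' value against every 'data' value.
    for combined in df['combined'].dropna().astype(str).unique():
        for data_value, id_value in zip(data['data'], data['id']):
            if pd.isna(data_value):
                continue  # skip missing values in dataframe 1
            if scorer(combined, str(data_value)) > threshold:
                rows.append({'combined': combined, 'id': id_value})

    # Build the frame once, after the loop.
    df_final = pd.DataFrame(rows, columns=['combined', 'id'])

    # Left join keeps every row of df; rows without a match get NaN in 'id'.
    return df.merge(df_final, on='combined', how='left')
```

With the frames from the question this could be called as `result = match_ids(data, df, fuzz.token_sort_ratio)`. If several rows of dataframe 1 score above the threshold for the same combined value, the join returns one output row per match, so keeping only the highest score per combined value may be worth adding.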