I want to compute fuzz.ratio between strings held in two dataframes. Say I have two dataframes: df with columns A and B, and bt_df with columns A1 and B1. For each string in bt_df['B1'], I want to compare it against df['B'] and return the best-matching string, its score, and the corresponding id from df['A'].
df
Out[8]:
                      A            B
0  11111111111111111111  Cheesesalad
1  22222222222222222222       Cheese
2  33333333333333333333        salad
3  44444444444444444444     BMWSalad
4  55555555555555555555          BMW
5  66666666666666666666        Apple
6  77777777777777777777    Apple####
7  88888888888888888888    Macrooni!
bt_df
Out[9]:
        A1        B1
0   180336       NaN
1   154263    Cheese
2   130876     Salad
3   204430  Macrooni
4   153490       NaN
5    48879       NaN
6   185495       NaN
7   105099       NaN
8     8645     Apple
9    54038       NaN
10  156523       NaN
11   18156       BWM
Hence the result should be:
B1      matchedstring  score  id
Cheese  Cheese         100    22222222222222222222
.....
.....
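To make the goal concrete, here is a minimal sketch of the lookup I have in mind, written against plain lists so it runs without third-party packages. The names similarity and best_matches are made up for illustration, and similarity is only a difflib-based stand-in for fuzz.ratio (an assumption: its scores may differ from what fuzzywuzzy returns). It also assumes every entry in the choices column is a string.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    # Hypothetical stand-in for fuzz.ratio: a 0-100 score built on difflib.
    # Assumption: scores may not match fuzzywuzzy's exactly.
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def best_matches(queries, choices, ids, scorer=similarity):
    # For every query string, pick the choice with the highest score and
    # return (query, matched string, score, id of the matched string).
    pairs = list(zip(choices, ids))
    rows = []
    for query in queries:
        if not isinstance(query, str):  # skips None and float NaN
            continue
        matched, matched_id = max(pairs, key=lambda p: scorer(query, p[0]))
        rows.append((query, matched, scorer(query, matched), matched_id))
    return rows


# Sample data copied from df['B'], df['A'] and bt_df['B1'] above.
choices = ["Cheesesalad", "Cheese", "salad", "BMWSalad",
           "BMW", "Apple", "Apple####", "Macrooni!"]
ids = [str(d) * 20 for d in range(1, 9)]
queries = [float("nan"), "Cheese", "Salad", "Macrooni", None, "Apple", "BWM"]

rows = best_matches(queries, choices, ids)
for row in rows:
    print(row)
```

With the dataframes above, I would expect the same function to be callable as best_matches(bt_df['B1'], df['B'], df['A'], scorer=fuzz.ratio), with the returned rows wrapped in pd.DataFrame(rows, columns=['B1', 'matchedstring', 'score', 'id']). It scores every pair, so it may be slow on large frames.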
Thanks in advance.