I am working on Jaro-Winkler similarity and can apply it between 2 columns, but how do I use it with 2 pairs of columns?

Question

Example: I have 4 columns in my dataframe, and I want to use Jaro similarity for columns A, B vs. columns C, D, all containing strings.

Currently I am using it between 2 columns with:

df.apply(lambda x: textdistance.jaro(x['A'], x['C']), axis=1)

Currently I am comparing names:

| A | C | result |
| --- | --- | --- |
| Kevin | kenny | 0.67 |
| Danny | Danny | 1 |
| Aiofa | Avril | 0.75 |

I have over 100K records in my dataframe.

Column A: person name

Column B: city

Column C: person name (to compare with A)

Column D: city (to compare with B)

Expected output:

| A | B | C | D | result |
| --- | --- | --- | --- | --- |
| Kevin | London | kenny | Leeds | 0.4 |
| Danny | Dublin | Danny | dublin | 1 |
| Aiofa | Madrid | Avril | Male | 0.65 |

Please provide a [Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example). Add the data sample as text, not as a picture. E.g. try `df.head().to_dict(orient='list')` and post in a block between triple backticks (```). Show both input *and* expected output. Also, show us what you have tried so far, and why your attempt isn't giving you the result that you expect. See: [Research Effort](https://meta.stackoverflow.com/questions/261592/how-much-research-effort-is-expected-of-stack-overflow-users). — ouroboros1, Aug 10 '22 at 22:04
It depends on the application. For your purpose, would it make sense to compare the concatenated strings of each column pair? Meaning: `df.apply(lambda x: textdistance.jaro(x['A'] + x['B'], x['C'] + x['D']), axis=1)` — DarrylG, Aug 10 '22 at 22:22
Hi DarrylG, thank you so much, that worked well. That's what I was looking for. — Kevin D, Aug 15 '22 at 10:47

score 0 · Answer 1 · answered Aug 15 '22 at 10:48


df.apply(lambda x: textdistance.jaro(x['A'] + x['B'], x['C'] + x['D']), axis=1)

Thank you, DarrylG.
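For anyone who wants to see what the concatenation approach does, here is a minimal pure-Python sketch. The `jaro` function is a textbook Jaro implementation written for illustration, not the `textdistance` code, and the helper names `normalize`, `combined_jaro`, and `mean_jaro` are hypothetical. Lowercasing and stripping whitespace is my assumption, made because the expected output scores `Dublin` vs `dublin` as 1. `mean_jaro` is an alternative that scores each pair separately and averages; it is not part of the accepted approach.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1] (illustrative, not textdistance's code)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def normalize(value: str) -> str:
    # Assumption: case and surrounding whitespace should not affect the score.
    return value.strip().lower()


def combined_jaro(a: str, b: str, c: str, d: str) -> float:
    """Concatenate each pair (A+B vs C+D) and score once, as in the answer."""
    return jaro(normalize(a) + normalize(b), normalize(c) + normalize(d))


def mean_jaro(a: str, b: str, c: str, d: str) -> float:
    """Alternative: score A vs C and B vs D separately, then average."""
    return (jaro(normalize(a), normalize(c)) + jaro(normalize(b), normalize(d))) / 2


if __name__ == "__main__":
    rows = [
        ("Kevin", "London", "kenny", "Leeds"),
        ("Danny", "Dublin", "Danny", "dublin"),
        ("Aiofa", "Madrid", "Avril", "Male"),
    ]
    for row in rows:
        print(row, round(combined_jaro(*row), 2), round(mean_jaro(*row), 2))
```

With a dataframe, the same helper can be applied row by row, e.g. `df.apply(lambda x: combined_jaro(x['A'], x['B'], x['C'], x['D']), axis=1)`. Swapping `jaro` for `textdistance.jaro` should work the same way, though I have not checked that the two produce identical values in every case.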


Kevin D


Make sure to use markdown code formatting! – benicamera Aug 19 '22 at 08:45
