I have two data frames, df1 (8852 X 16) and df2 (302 X 4). In df2 the columns are poc_fristname, poc_lastname, phone, and email. in df1 it has all of these columns plus many other columns to it.
I am trying to match multiple values from df2 to df1 and grab the third column in df1 so I can do some insights. I have been attempting isin()
and also have been attempting .merge
but only seem to grab a couple of values or none that match. I believe this is due to different names being used, different phone numbers being used and different emails.
So in order to catch the third value column, I am looking at poc_lastname, email, and phone unique identifiers.
example dataset:
df1
affiliate_code poc_firstname poc_lastname business_name phone email
AFF0007070 Jamel Zright Jamel Zright 5555555555 jamel@zr.co
BAM0000001 Brandon Elison Michael Brandon Elison 6666666666 bellison@sygrp.com
BGA2631157 Tarry Zimmerman Tarry Jon Zimmerman 7777777777 barryz@ba.com
CGD18880697 Crena Heman Crena Heman NaN NaN
CGD8401802 Stevie Genciik Stevie Genciik 8888888888 info@edirect.com
df2
poc_firstname poc_lastname phone email
0 Aron Lai 6.789793e+09 aron.lai1@icloud.com
1 Aron Sweat 4.695152e+09 aronts@gmail.com
2 Aaron Harwod 5.072724e+09 aaron@arh.com
3 Abby Bedient 3.055190e+09 abby@i.com
4 Adam Howard 2.484597e+09 ahoward@gmail.com
expected output:
affiliate_code poc_lastname phone email
0 BGA2631157 Sweat 4.695152e+09 aaron.lane1@icloud.com
1 CGD8401802 Bedient 3.055190e+09 aarontsweet@gmail.com
Code tried:
df1.merge(df2, left_on='email', right_on='email')[['affiliate_code']]
df3 = df2[df2['phone'].isin(df1['phone'])]
Each attempt gets some of the matching values or none at all.
If any clarification is needed, please let me know. Working on providing clear questions.