0

I have a dataframe (df) that contains 30 000 rows

id     Name     Age
 1     Joey     22
 2     Anna     34
 3     Jon      33
 4     Amy      30
 5     Kay      22

And Another dataframe (df2) that contains same columns but with some Ids missing

id     Name     Age    Sport
       Jon      33     Tennis
 5     Kay      22     Football
      Joey      22     Basketball
 4     Amy      30     Running 
      Anna      42     Dancing

I want the missing IDs to appear in df2 with the correspondant name

df2:

 id     Name     Age    Sport
 3     Jon      33     Tennis
 5     Kay      22     Football
 1     Joey      22     Basketball
 4     Amy      30     Running 
 2    Anna      42     Dancing

Can someone help ? I am new to pandas and dataframe

2 Answers2

1

you can use .map with .fillna

df2['id'] = df2['id'].replace('',np.nan,regex=True)\
                     .fillna(df2['Name'].map(df1.set_index('Name')['id'])).astype(int)


print(df2)

   id  Name  Age       Sport
0   3   Jon   33      Tennis
1   5   Kay   22    Football
2   1  Joey   22  Basketball
3   4   Amy   30     Running
4   2  Anna   42     Dancing
Umar.H
  • 22,559
  • 7
  • 39
  • 74
  • @ShubhamSharma thanks! could I get your feedback on [this](https://stackoverflow.com/questions/66351756/pandas-melt-2-groups-of-columns/66352275#66352275) post? – Umar.H Feb 25 '21 at 14:29
1

First, join the two dataframes with pd.merge based on your keys. I suppose the keys are 'Name' and 'Age' in this case. Then replace the null id values in df2, using np.where and .isnull() to find the null values.

df3 = pd.merge(df2, df1, on=['name', 'age'], how='left')
df2['id'] = np.where(df3.id_x.isnull(), df3.id_y, df3.id_x).astype(int)

    id  name    age sport
0   1   Joey    22  Tennis
1   2   Anna    34  Football
2   3   Jon     33  Basketball
3   4   Amy     30  Running
4   5   Kay     22  Dancing