2

Suppose I have the two following data frames df_1 and df_2 and want to merge them with an additional column which states if the row is present in df_1,df_2 or both. Is there any way I could do this directly while using merge?

df_1

Data Frame 1

df_2

enter image description here

The data frame I would like to have:

enter image description here

hegdep
  • 596
  • 1
  • 6
  • 16

1 Answers1

4

Use parameter indicator in merge and for expected output add map by dict:

df = pd.merge(df1, df2, how='outer', indicator='origin')
print (df)
   Index   x  y      origin
0     20   5  8   left_only
1     35   4  7   left_only
2     55   3  2        both
3     60  12  1        both
4     22   4  3  right_only
5     30  12  6  right_only

d = {'left_only':'1', 'right_only':'2', 'both':'1 & 2'}
df['origin'] = df['origin'].map(d)
print (df)
   Index   x  y origin
0     20   5  8      1
1     35   4  7      1
2     55   3  2  1 & 2
3     60  12  1  1 & 2
4     22   4  3      2
5     30  12  6      2
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252