Merging 2 pandas tables and using them

Question

I have 2 panda tables, table A which is something like that:

Date  a    b   c    d   e
0
...
.
.
.
2n

and table B which has something like that

Date f  g  k  h   i   j
2
.
.
..
.
.
2n-3
.

the issue is that each table has totally different dates, and totally different columns, but I want to get a new table with 0 if it is a column which it didnt had before, and also a new binary column(lets cal it MARK) which marks if its from table A or B. something like this:

Date   MARK  a    b   c    d   e  f  g  e  h   i   j
0        1   .    .    .   .   .  0  0  0  0   0    0
.
2        0   0    0   0    0   0  .  .   .  .  .    .
.
.
.
.
.
2n-3        0   0    0   0    0   0  .  .   .  .  .    .
.
.
2n        1   .    .    .   .   .  0  0  0  0   0    0

And i need it to be sorted by the dates... please help me with this... Thanks

score 2 · Accepted Answer · answered Dec 27 '19 at 15:09

2

Add the 'Mark' column before :

df1['Mark'] = 0
df2['Mark'] = 1

Then pd.concat does exactly what you want to do :

df = pd.concat([df1, df2], axis = 0, sort= True)

Note : Make sure your dates are the index for both :

df1 = df1.set_index('date_column')
df2 = df2.set_index('date_column')

answered Dec 27 '19 at 15:09

Vincent

1,534
3
20
42

works good but give the values who were not exist before Nan instead of zero, any solution? – secret Dec 27 '19 at 15:28
I am sorry I did not understand the problem – Vincent Dec 27 '19 at 15:35
1

@secret you can just use `df.fillna(0)` – mjspier Dec 27 '19 at 15:40

score 1 · Answer 2 · answered Dec 27 '19 at 15:22

other option would be to use merge with the indicator feature:

df = df1.merge(df2, on='Date', how='outer', indicator='Mark')

This will indicate with left_only or right_only from which dataframe the data came. You can change it to 0 and 1 like this:

df['Mark'] = (df['Mark'] == 'right_only').astype(int)

Full example:

df1 = pd.DataFrame(
    data={'Date': pd.date_range('2019-01-01', '2019-01-03'), 
          'a': [1,1,1], 
          'b': [2,2,2]})
df2 = pd.DataFrame(
    data={'Date': pd.date_range('2019-01-11', '2019-01-13'), 
          'c': [3,3,3], 
          'd': [4,4,4]})
df = df1.merge(df2, on='Date', how='outer', indicator='Mark')
df['Mark'] = (df['Mark'] == 'right_only').astype(int)

Merging 2 pandas tables and using them

2 Answers2