1

I have two dataframes

df1
>>>
   a  b
0  1  4
1  2  5
2  3  6

df2
>>>
   a  c
0  9  8
1  0  1
2  0  0

What I would like to do is create the non-overlapping columns in each dataframe and fill them with zero's. So, because df1 does not contain column c from df2, and df2 does not contain column b from df1, the end result would be

df1
>>>
   a  b  c
0  1  4  0
1  2  5  0
2  3  6  0

df2
>>>
   a  c  b
0  9  8  0
1  0  1  0
2  0  0  0

What is the best way to achieve this?

I have tried using reindex and reindex_like, but the former only seems to reindex rows rather than columns and I don't know what argument to use to reindex the columns

df1.reindex(df2.columns, fill_value=0.)
>>>
     a    b
a  0.0  0.0
c  0.0  0.0
PyRsquared
  • 6,970
  • 11
  • 50
  • 86

1 Answers1

1

Use DataFrame.align:

df1, df2 = df1.align(df2, fill_value=0)
print (df1)
print (df2)
   a  b  c
0  1  4  0
1  2  5  0
2  3  6  0
   a  b  c
0  9  0  8
1  0  0  1
2  0  0  0

Your solution with Index.union and DataFrame.reindex with axis=1 for columns:

c = df1.columns.union(df2.columns)
df1, df2 = df1.reindex(c, fill_value=0, axis=1), df2.reindex(c, fill_value=0, axis=1)
print (df1)
print (df2)
   a  b  c
0  1  4  0
1  2  5  0
2  3  6  0
   a  b  c
0  9  0  8
1  0  0  1
2  0  0  0
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252