7

I have two DataFrames and would like to broadcast a divide operation across them.

import pandas as pd

df1 = pd.DataFrame([[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]],
                   columns=['A', 'B', 'C', 'D'], index=['x', 'y', 'z'])

df2 = pd.DataFrame([[0., 1., 2., 3.]], columns=['A', 'B', 'D', 'C'], index=['q'])

Notice that the columns of df2 are in a different order (A, B, D, C) than those of df1.

I would like to divide df1 by df2 where the row is broadcast but the column labels are respected.

   A   B   C   D
x  1   2   3   4
y  5   6   7   8
z  9  10  11  12


   A  B  D  C
q  0  1  2  3

Dividing the raw values would be wrong, because it ignores the column labels:

df1.values/df2.values

[[         inf   2.           1.5          1.33333333]
 [         inf   6.           3.5          2.66666667]
 [         inf  10.           5.5          4.        ]]

The answer I want is:

   A    B   C      D
x  inf  2   1      2
y  inf  6   2.33   4
z  inf  10  3.66   6
Mike Müller
Dickster

3 Answers

5

If you divide by a Series (by selecting that one row of the second dataframe), pandas will align this series on the columns of the first dataframe, giving the desired result:

In [75]: df1 / df2.loc['q']
Out[75]:
     A   B         C  D
x  inf   2  1.000000  2
y  inf   6  2.333333  4
z  inf  10  3.666667  6

If you don't know, or don't want to use, the name of that one row, you can use squeeze to convert the one-row DataFrame to a Series: df1 / df2.squeeze() (see the answer of @EdChum).
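
As a minimal sketch of the label alignment described above, the snippet below rebuilds the question's two frames and divides by the single row both with the `/` operator and with `DataFrame.div`. The variable names `divisor`, `by_operator`, and `by_method` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]],
                   columns=['A', 'B', 'C', 'D'], index=['x', 'y', 'z'])
df2 = pd.DataFrame([[0., 1., 2., 3.]], columns=['A', 'B', 'D', 'C'], index=['q'])

# Selecting one row gives a Series whose index is df2's column labels (A, B, D, C).
divisor = df2.loc['q']

# The Series index is matched against df1's column labels, not column positions,
# and the single row is broadcast down every row of df1.
by_operator = df1 / divisor

# The method form spells out the axis that the Series is matched against.
by_method = df1.div(divisor, axis='columns')

print(by_operator)
```

Both forms should give the same frame; the method form is mainly useful when you want the alignment axis to be explicit.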

joris
1

Maybe you could reorder the columns of df2 to match df1 and then divide the underlying values:

In [53]: df1.values/df2[df1.columns].values
Out[53]:
array([[         inf,   2.        ,   1.        ,   2.        ],
       [         inf,   6.        ,   2.33333333,   4.        ],
       [         inf,  10.        ,   3.66666667,   6.        ]])
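
Dividing `.values` returns a bare NumPy array, so the row and column labels are lost. The sketch below is one possible way to put them back, assuming the labels of df1 are the ones you want on the result; the names `aligned`, `raw`, and `result` are illustrative only.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]],
                   columns=['A', 'B', 'C', 'D'], index=['x', 'y', 'z'])
df2 = pd.DataFrame([[0., 1., 2., 3.]], columns=['A', 'B', 'D', 'C'], index=['q'])

# Reorder df2's columns so they line up positionally with df1 (A, B, C, D).
aligned = df2[df1.columns]

# NumPy broadcasts the (1, 4) row across the (3, 4) array.
# errstate silences the divide-by-zero warning for column A.
with np.errstate(divide='ignore'):
    raw = df1.values / aligned.values

# Wrap the array back into a DataFrame carrying df1's labels.
result = pd.DataFrame(raw, index=df1.index, columns=df1.columns)
print(result)
```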
Zero
1

You can reorder the columns, call squeeze to reduce the one-row DataFrame to a Series, and then call div:

In [114]:

df1 = pd.DataFrame([[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]], columns=['A', 'B', 'C', 'D'], index=['x', 'y', 'z'])
df2 = pd.DataFrame([[0., 1., 2., 3.]], columns=['A', 'B', 'D', 'C'], index=['q'])
# .loc replaces the original .ix indexer, which was removed in pandas 1.0
df1.div(df2.loc[:, df1.columns].squeeze())

Out[114]:
     A   B         C  D
x  inf   2  1.000000  2
y  inf   6  2.333333  4
z  inf  10  3.666667  6

df1 / df2.loc[:, df1.columns].squeeze() also works (.loc stands in for the .ix indexer, which was removed in pandas 1.0), but @joris's answer is much nicer.

EDIT

As pointed out by @joris, the column reordering is unnecessary because pandas aligns on the column labels anyway, so:

df1.div(df2.squeeze())

or

df1 / df2.squeeze()

would work.
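
One caveat worth sketching: a bare `squeeze()` collapses every length-1 axis, so a 1x1 frame becomes a scalar instead of a Series. Passing `axis=0` squeezes only the row axis. The example below is an illustration under that assumption, with hypothetical variable names, and is not part of the original answer.

```python
import pandas as pd

df2 = pd.DataFrame([[0., 1., 2., 3.]], columns=['A', 'B', 'D', 'C'], index=['q'])

# Squeeze only the row axis: the result is a Series indexed by column label.
row = df2.squeeze(axis=0)

# Positional selection of the first row gives the same Series.
same_row = df2.iloc[0]

# A 1x1 frame shows the difference between the two calls.
single = df2[['B']]
as_scalar = single.squeeze()        # both axes collapse to a plain scalar
as_series = single.squeeze(axis=0)  # still a Series of length 1, labelled 'B'

print(type(as_scalar), type(as_series))
```

If the divisor frame can ever shrink to a single column, `squeeze(axis=0)` or `iloc[0]` keeps the label alignment in play.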

EdChum
  • 1
    @EdChum you don't need the `.ix[:,df1.columns]` part when using `div`, as this aligns the indices automatically: just `df1.div(df2.squeeze())` works as well (which is also a nice solution!) – joris May 19 '15 at 13:53
  • 1
    @joris yeah was overthinking this, I like fancy indexing a bit too much, will update, thanks – EdChum May 19 '15 at 13:55