118

I need to divide all but the first column in a DataFrame by the first column.

Here's what I'm doing, but I wonder if this isn't the "right" pandas way:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10,3), columns=list('ABC'))

df[['B', 'C']] = (df.T.iloc[1:] / df.T.iloc[0]).T

Is there a way to do something like df[['B','C']] / df['A']? (That just gives a 10x12 dataframe of nan.)

Also, after reading some similar questions on SO, I tried df['A'].div(df[['B', 'C']]) but that gives a broadcast error.

itzy

3 Answers

180

I believe df[['B','C']].div(df.A, axis=0) and df.iloc[:,1:].div(df.A, axis=0) work.
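A minimal sketch of both spellings, assuming the question's 10x3 frame (the seed and the names `by_name` and `by_position` are only for illustration). Passing axis=0 asks pandas to match the divisor's index against the row index rather than the column labels:

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # fixed seed so the example is reproducible
df = pd.DataFrame(np.random.rand(10, 3), columns=list("ABC"))

# axis=0 aligns the Series df["A"] with the rows of the frame,
# so every row of B and C is divided by that row's value of A.
by_name = df[["B", "C"]].div(df["A"], axis=0)
by_position = df.iloc[:, 1:].div(df["A"], axis=0)

print(by_name.head())
```

Both expressions return a new 10x2 frame and leave df itself unchanged.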

Dthal
  • 3
    This works great, but the resulting data frame is missing other columns that are not used in the above division. For example, if there is a column D that is not divided by A, the resulting data frame is missing column D and the column A. Any way to overcome this? – NAGA Oct 11 '18 at 14:02
  • 1
    I think I found a way for my data set. I moved column D to the index using df.set_index() and, after the division, restored it with reset_index(). – NAGA Oct 11 '18 at 14:06
  • @NAGA, does your method of moving one column `D` to index work if you have multiple columns to ignore (ie to not divide through)? – alancalvitti Mar 04 '19 at 20:11
  • @alancalvitti it worked. I tried moving two of my columns to index using set_index() and then performed the division operation. I got back my columns using reset_index() without any problem. Also, one of my columns had duplicate values and it did not create any problem when I moved it to index. – NAGA Mar 04 '19 at 21:23
  • 18
    I believe you can also do `df.loc[:, cols] = df.loc[:, cols].div(df['A'], axis=0)`, to only divide a subset of the columns – Ken Syme May 21 '19 at 11:25
  • 15
    no need to move column D to index, just do this: `df[['B','C']] = df[['B','C']].div(df.A, axis=0)`. All other columns are preserved in the dataframe `df` – noleto Jul 05 '19 at 15:17
  • 4
    why does it not broadcast by default? – RichieV Sep 17 '20 at 10:36
9
do: df.iloc[:,1:] = df.iloc[:,1:].div(df.A, axis=0)

This divides every column except the first by column 'A'.

The first column is left unchanged, and each later column is replaced by its values divided by the divisor column.
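As a small worked example, here is a sketch on a hand-made frame (the values are made up for illustration, and floats are used so the assignment does not change any column's dtype):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [2.0, 4.0, 5.0],
        "B": [4.0, 8.0, 10.0],
        "C": [1.0, 2.0, 10.0],
    }
)

# Overwrite every column after the first with its row-wise quotient by A.
df.iloc[:, 1:] = df.iloc[:, 1:].div(df["A"], axis=0)

print(df)
```

Column A keeps its original values, B becomes 2.0 in every row, and C becomes 0.5, 0.5, 2.0.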

Hafenkranich
user17300673
2

You are actually doing element-wise division, not a matrix multiplication. By default pandas lines up the Series' index with the DataFrame's column labels before dividing, so the shapes and labels need to match along that axis (see here).

e.g.

df['A'].shape --> (10,)
df[['B','C']].shape --> (10,2)

You should make them match as (2,10)(10,):
df[['B','C']].T.shape, df['A'].shape -->((2, 10), (10,))

But then your resulting matrix is: ( df[['B','C']].T / df['A'] ).shape --> (2,10)

Therefore:

( df[['B','C']].T / df['A'] ).T

Shape is (10,2). It gives you the results that you wanted!
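To check the shapes above, here is a sketch assuming the question's 10x3 frame (the seed and the names `misaligned`, `transposed` and `direct` are only for illustration). It compares the plain "/" attempt, the transposed version, and div with axis=0:

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # fixed seed so the example is reproducible
df = pd.DataFrame(np.random.rand(10, 3), columns=list("ABC"))

# Plain "/" matches the Series index (0..9) against the column labels (B, C).
# No labels overlap, so the result has 2 + 10 columns and every cell is NaN.
misaligned = df[["B", "C"]] / df["A"]

# After transposing, the columns are 0..9 and do match the Series index.
transposed = (df[["B", "C"]].T / df["A"]).T

# Same values without transposing: align on the rows instead.
direct = df[["B", "C"]].div(df["A"], axis=0)

print(misaligned.shape, transposed.shape, direct.shape)
```

The transposed and axis=0 versions hold the same numbers, so the double transpose is only a workaround for the default alignment on column labels.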

aerijman