Divide multiple columns by another column in pandas

Question

I need to divide all but the first columns in a DataFrame by the first column.

Here's what I'm doing, but I wonder if this isn't the "right" pandas way:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10,3), columns=list('ABC'))

df[['B', 'C']] = (df.T.iloc[1:] / df.T.iloc[0]).T

Is there a way to do something like df[['B','C']] / df['A']? (That just gives a 10x12 dataframe of nan.)

Also, after reading some similar questions on SO, I tried df['A'].div(df[['B', 'C']]) but that gives a broadcast error.

score 180 · Accepted Answer · answered Dec 31 '15 at 03:31

I believe df[['B','C']].div(df.A, axis=0) and df.iloc[:,1:].div(df.A, axis=0) work.
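A minimal sketch of why `axis=0` matters here, assuming the same 10x3 frame as in the question (the seeded generator and the variable names are only for illustration). Plain `/` lines up the Series index with the DataFrame's column labels, whereas `div(..., axis=0)` lines it up with the row index:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen only to make the sketch reproducible
df = pd.DataFrame(rng.random((10, 3)), columns=list("ABC"))

# Plain `/` matches the Series index (0..9) against the column labels (B, C).
# Nothing matches, so the result is all NaN over the union of both label sets.
misaligned = df[["B", "C"]] / df["A"]

# axis=0 matches on the row index, so each row is divided by its own A value.
by_name = df[["B", "C"]].div(df["A"], axis=0)
by_position = df.iloc[:, 1:].div(df["A"], axis=0)
```

Both `by_name` and `by_position` hold the same 10x2 result; `misaligned` reproduces the 10x12 frame of NaN mentioned in the question.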

answered Dec 31 '15 at 03:31

Dthal


3

This works great, but the resulting data frame is missing other columns that are not used in the above division. For example, if there is a column D that is not divided by A, the resulting data frame is missing column D and the column A. Any way to overcome this? – NAGA Oct 11 '18 at 14:02
1

I think I found a way for my data set. I moved column D to the index using set_index() and moved it back after the division. – NAGA Oct 11 '18 at 14:06
@NAGA, does your method of moving one column `D` to the index work if you have multiple columns to ignore (i.e. columns that should not be divided)? – alancalvitti Mar 04 '19 at 20:11
@alancalvitti it worked. I tried moving two of my columns to index using set_index() and then performed the division operation. I got back my columns using reset_index() without any problem. Also, one of my columns had duplicate values and it did not create any problem when I moved it to index. – NAGA Mar 04 '19 at 21:23
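A rough sketch of the index-based workaround described in these comments, assuming a small made-up frame (the columns `D` and `E` and all values are hypothetical). The columns to be left alone are parked in the index, the division runs on what remains, and `reset_index()` brings them back:

```python
import pandas as pd

df = pd.DataFrame({
    "A": [2.0, 4.0, 5.0],
    "B": [4.0, 8.0, 10.0],
    "C": [6.0, 2.0, 5.0],
    "D": ["x", "y", "y"],  # left untouched; contains a duplicate value
    "E": ["p", "q", "r"],  # a second column to leave untouched
})

# Move the columns that should not be divided into the index.
indexed = df.set_index(["D", "E"])

# The divisor shares the same index, so axis=0 alignment lines the rows up.
ratios = indexed[["B", "C"]].div(indexed["A"], axis=0)

# Restore D and E as ordinary columns.
result = ratios.reset_index()
```

In this sketch `A` is not part of `result`, because only `B` and `C` were selected before dividing.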
18

I believe you can also do `df.loc[:, cols] = df.loc[:, cols].div(df['A'], axis=0)`, to only divide a subset of the columns – Ken Syme May 21 '19 at 11:25
15

no need to move column D to index, just do this: `df[['B','C']] = df[['B','C']].div(df.A, axis=0)`. All other columns are preserved in the dataframe `df` – noleto Jul 05 '19 at 15:17
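A short sketch of the assign-back pattern from the last two comments, using a made-up frame with an extra string column `D` (the data and the names `cols`, `original`, `via_loc` are illustrative assumptions). Writing the result back into the selected columns leaves every other column in place:

```python
import pandas as pd

original = pd.DataFrame({
    "A": [2.0, 4.0],
    "B": [4.0, 8.0],
    "C": [6.0, 2.0],
    "D": ["x", "y"],  # not involved in the division
})
cols = ["B", "C"]

# Variant 1: plain column assignment.
df = original.copy()
df[cols] = df[cols].div(df["A"], axis=0)

# Variant 2: the same operation written with .loc.
via_loc = original.copy()
via_loc.loc[:, cols] = via_loc.loc[:, cols].div(via_loc["A"], axis=0)
```

After either variant, `A` and `D` are unchanged and only `B` and `C` hold the ratios.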
4

why does it not broadcast by default? – RichieV Sep 17 '20 at 10:36

score 9 · Answer 2 · edited Nov 02 '21 at 02:31

do: df.iloc[:,1:] = df.iloc[:,1:].div(df.A, axis=0)

This will divide all columns other than the 1st column with the 'A' column used as divisor.

The result keeps the first column unchanged, followed by every later column divided by the divisor column.
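A position-based sketch of the same idea, assuming an all-float frame with made-up values. Here the divisor is also picked by position (`iloc[:, 0]`), so no column names are needed; the name `divisor` is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "A": [2.0, 4.0, 5.0],
    "B": [4.0, 8.0, 10.0],
    "C": [6.0, 2.0, 5.0],
})

# Take the first column by position as the divisor.
divisor = df.iloc[:, 0].copy()

# Divide every column after the first, row by row, and write the result back.
df.iloc[:, 1:] = df.iloc[:, 1:].div(divisor, axis=0)
```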

edited Nov 02 '21 at 02:31

Hafenkranich


answered Nov 01 '21 at 14:03

user17300673


aerijman · Answer 3 · 2022-01-12T18:54:43.663

The "/" operator here is element-wise division, not matrix multiplication. Before dividing, pandas aligns the index of the Series with the column labels of the DataFrame, so those labels (and therefore the shapes) need to match.

e.g.

df['A'].shape --> (10,)
df[['B','C']].shape --> (10,2)

Transposing the DataFrame puts the row labels on the columns, so the (2,10) frame lines up with the (10,) Series:
df[['B','C']].T.shape, df['A'].shape -->((2, 10), (10,))

But then your resulting matrix is: ( df[['B','C']].T / df['A'] ).shape --> (2,10)

Therefore:

( df[['B','C']].T / df['A'] ).T

Shape is (10,2). It gives you the results that you wanted!
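As a quick check of the transpose approach, here is a sketch comparing it with `div(..., axis=0)` and with plain NumPy broadcasting. The frame mirrors the one in the question; the seed and the variable names are assumptions made for illustration:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # seed chosen only to make the sketch reproducible
df = pd.DataFrame(rng.random((10, 3)), columns=list("ABC"))

# Transpose so the row labels 0..9 become column labels, divide, transpose back.
via_transpose = (df[["B", "C"]].T / df["A"]).T

# The same result without transposing.
via_div = df[["B", "C"]].div(df["A"], axis=0)

# Label-free NumPy version: a (10, 2) array divided by a (10, 1) array broadcasts.
via_numpy = df[["B", "C"]].to_numpy() / df[["A"]].to_numpy()
```

All three produce the same 10x2 values; the pandas versions keep the labels, while `via_numpy` is a bare array.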
