pandas normalize rows by column

Question

I have a dataframe as follows:

     A    B    C   cap
0  482  959   67  1000
1   79   45    2   100
2  855  164  173  1000
3    5    0    1    10
4  659  831  899  1000

Each number is a random int drawn between 0 and that row's df['cap']. For example:

in row 0, I generate 3 random numbers between 0 and 1000
in row 1, I generate 3 random numbers between 0 and 100
in row 2, I generate 3 random numbers between 0 and 1000
in row 3, I generate 3 random numbers between 0 and 10
in row 4, I generate 3 random numbers between 0 and 1000

I want to get this dataframe:

      A       B       C  
 0    0.482   0.959   0.067
 1    0.790   0.450   0.020
 2    0.855   0.164   0.173
 3    0.500   0.000   0.100
 4    0.659   0.831   0.899

(Don't mind the number of digits after the decimal point.) I tried:

df['A'] / df['cap']

That worked fine for a single column, but

df[['A','B']] / df['cap']

gave me an index error, as did most of the other tricks I've tried. How do I normalize 'A', 'B' and 'C' by 'cap'?

Please refer to the link below: https://stackoverflow.com/questions/34540567/divide-multiple-columns-by-another-column-in-pandas — Sunil Kumar Choubey, Nov 02 '20 at 19:40

score 1 · Answer 1 · answered Nov 02 '20 at 19:10

You can use .div, which lets you choose the axis the division is aligned on, something the plain division operator / does not:

normalized_df = df.loc[:, "A":"C"].div(df["cap"], axis=0)

print(normalized_df)
       A      B      C
0  0.482  0.959  0.067
1  0.790  0.450  0.020
2  0.855  0.164  0.173
3  0.500  0.000  0.100
4  0.659  0.831  0.899
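To illustrate why the axis matters, here is a minimal sketch that rebuilds the question's dataframe by hand (the name `misaligned` is mine, just for illustration). As far as I understand pandas alignment, plain `/` matches the Series index against the column labels, whereas `axis=0` matches it against the row index:

```python
import pandas as pd

# The dataframe from the question, rebuilt by hand.
df = pd.DataFrame({
    "A": [482, 79, 855, 5, 659],
    "B": [959, 45, 164, 0, 831],
    "C": [67, 2, 173, 1, 899],
    "cap": [1000, 100, 1000, 10, 1000],
})

# Plain "/" lines up the Series index (0..4) with the column labels ("A", "B").
# No labels match, so every cell of the result is NaN.
misaligned = df[["A", "B"]] / df["cap"]

# axis=0 lines up the Series index with the row index instead,
# so each row is divided by its own cap.
normalized_df = df.loc[:, "A":"C"].div(df["cap"], axis=0)
print(normalized_df)
```

On the pandas versions I have tried, the first expression returns an all-NaN frame rather than raising, so the exact failure you see may depend on your version.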

score 1 · Accepted Answer · answered Nov 02 '20 at 19:11

You can drop the column you don't need (temporarily) and perform division on everything else.

df.drop(labels="cap", axis=1).div(df.cap, axis=0)

       A      B      C
0  0.482  0.959  0.067
1  0.790  0.450  0.020
2  0.855  0.164  0.173
3  0.500  0.000  0.100
4  0.659  0.831  0.899
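If you would rather keep cap in the frame and overwrite the value columns, something along these lines should work. This is only a sketch; `value_cols` is a name I made up, and it assumes every column other than cap should be normalized:

```python
import pandas as pd

# The dataframe from the question, rebuilt by hand.
df = pd.DataFrame({
    "A": [482, 79, 855, 5, 659],
    "B": [959, 45, 164, 0, 831],
    "C": [67, 2, 173, 1, 899],
    "cap": [1000, 100, 1000, 10, 1000],
})

# Every column label except "cap" (assumption: these are the ones to normalize).
value_cols = df.columns.drop("cap")

# Divide each row by its own cap and write the result back; "cap" is left untouched.
df[value_cols] = df[value_cols].div(df["cap"], axis=0)
print(df)
```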

score 1 · Answer 3 · answered Nov 02 '20 at 19:18

Try this:

# reshape to a column vector of shape (n_rows, 1) so it broadcasts across every column
value = df['cap'].values.reshape(-1, 1)
new_df = df / value

Also, after you get new_df, you can drop cap if you want.

Hope this is helpful!
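A variant of the same broadcasting idea that skips the cap column up front, so there is nothing to drop afterwards. This is a rough sketch on the question's data, with `cap_column` as an illustrative name:

```python
import pandas as pd

# The dataframe from the question, rebuilt by hand.
df = pd.DataFrame({
    "A": [482, 79, 855, 5, 659],
    "B": [959, 45, 164, 0, 831],
    "C": [67, 2, 173, 1, 899],
    "cap": [1000, 100, 1000, 10, 1000],
})

# [:, None] turns the 1-D array of caps into a column vector of shape (n_rows, 1),
# which works for any number of rows.
cap_column = df["cap"].to_numpy()[:, None]

# The column vector is broadcast across A, B and C, row by row.
new_df = df[["A", "B", "C"]] / cap_column
print(new_df)
```

Note that this divides by position rather than by index label, so it assumes the caps are in the same row order as the values.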

answered by vae

score 0 · Answer 4 · answered Nov 02 '20 at 19:15

You can use the apply function. Try this:

df[['A', 'B', 'C']].apply(lambda x: x / df.cap)

answered by Cristian Contrera

apply with a lambda is VERY slow in Python, and the dataframe I'm using can be rather big. – Guy Barash Nov 03 '20 at 07:43
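For anyone who wants to swap the apply version for .div, here is a small check on the question's data suggesting the two give the same frame. It is only a sketch and not a benchmark; `via_apply`, `via_div` and `same` are illustrative names:

```python
import pandas as pd

# The dataframe from the question, rebuilt by hand.
df = pd.DataFrame({
    "A": [482, 79, 855, 5, 659],
    "B": [959, 45, 164, 0, 831],
    "C": [67, 2, 173, 1, 899],
    "cap": [1000, 100, 1000, 10, 1000],
})

# apply calls the lambda once per column; each column is divided by cap,
# aligned on the row index.
via_apply = df[["A", "B", "C"]].apply(lambda col: col / df["cap"])

# The same division expressed as a single .div call along the row axis.
via_div = df[["A", "B", "C"]].div(df["cap"], axis=0)

same = via_apply.equals(via_div)
print(same)
```

Timing the two on a frame of realistic size would show whether the difference matters for your data.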
