I have a dataframe as follows:

     A    B    C   cap
0  482  959   67  1000
1   79   45    2   100
2  855  164  173  1000
3    5    0    1    10
4  659  831  899  1000

Each number is generated by drawing a random int between 0 and that row's df['cap']. For example:

in row 0, I generate 3 random numbers between 0 and 1000
in row 1, I generate 3 random numbers between 0 and 100
in row 2, I generate 3 random numbers between 0 and 1000
in row 3, I generate 3 random numbers between 0 and 10
in row 4, I generate 3 random numbers between 0 and 1000

I want to get this dataframe:

      A       B       C  
 0    0.482   0.959   0.067
 1    0.790   0.450   0.020
 2    0.855   0.164   0.173
 3    0.500   0.000   0.100
 4    0.659   0.831   0.899

(Don't mind the number of digits after the decimal point.) I tried:

df['A'] / df['cap'] 

This worked fine for a single column, but

df[['A','B']] / df['cap'] 

raised an index error, as did most other tricks I've tried. How do I normalize 'A', 'B' and 'C' by 'cap'?

Guy Barash

4 Answers

Use .div rather than the plain division operator /. It takes an axis argument, so you can tell pandas to line the Series up with the rows instead of the columns:

normalized_df = df.loc[:, "A":"C"].div(df["cap"], axis=0)

print(normalized_df)
       A      B      C
0  0.482  0.959  0.067
1  0.790  0.450  0.020
2  0.855  0.164  0.173
3  0.500  0.000  0.100
4  0.659  0.831  0.899
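
To make the role of axis concrete, here is a minimal sketch that rebuilds the question's frame and divides in both directions. The names values, per_row, per_col and col_scale are illustrative only, and col_scale is a made-up Series keyed by column name that exists purely to show what axis=1 matches against.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [482, 79, 855, 5, 659],
        "B": [959, 45, 164, 0, 831],
        "C": [67, 2, 173, 1, 899],
        "cap": [1000, 100, 1000, 10, 1000],
    }
)
values = df[["A", "B", "C"]]

# axis=0: the Series index (0..4) is matched against the row labels,
# so every row is divided by its own cap.
per_row = values.div(df["cap"], axis=0)

# axis=1 (the direction the plain / operator uses): the Series index is
# matched against the column labels. col_scale is hypothetical data,
# not part of the question.
col_scale = pd.Series({"A": 10, "B": 100, "C": 1000})
per_col = values.div(col_scale, axis=1)

print(per_row)
print(per_col)
```

With the plain / operator, df["cap"] is matched against the column labels, and since its index holds row numbers rather than "A", "B", "C", nothing lines up.
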
Cameron Riddell

You can drop the column you don't need (temporarily) and perform division on everything else.

df.drop(labels="cap", axis=1).div(df.cap, axis=0)

       A      B      C
0  0.482  0.959  0.067
1  0.790  0.450  0.020
2  0.855  0.164  0.173
3  0.500  0.000  0.100
4  0.659  0.831  0.899
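
As a rough sketch of what "temporarily" means here: drop returns a new frame, so df keeps its cap column. The names normalized and result are illustrative, and re-attaching cap with assign is just one optional way to keep it next to the ratios.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [482, 79, 855, 5, 659],
        "B": [959, 45, 164, 0, 831],
        "C": [67, 2, 173, 1, 899],
        "cap": [1000, 100, 1000, 10, 1000],
    }
)

# drop() does not modify df; it hands back a copy without "cap".
normalized = df.drop(columns="cap").div(df["cap"], axis=0)

# The original frame still has all four columns.
print(list(df.columns))

# Optional: put the untouched cap column back beside the normalized values.
result = normalized.assign(cap=df["cap"])
print(result)
```
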
gold_cy

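Try this:

# reshape cap into a column vector; -1 infers the row count instead of hard-coding 5
value = df['cap'].values.reshape((-1, 1))
new_df = df / value

new_df still contains cap (here all 1.0, since the column is divided by itself), so you can drop it afterwards if you want.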

Hope this is helpful!
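
The same broadcasting idea can be sketched at the NumPy level, assuming you only want A, B and C in the result. The names cols, ratios and new_df are illustrative. A column vector of shape (n, 1) broadcasts across an (n, 3) block, so each row is divided by its own cap.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [482, 79, 855, 5, 659],
        "B": [959, 45, 164, 0, 831],
        "C": [67, 2, 173, 1, 899],
        "cap": [1000, 100, 1000, 10, 1000],
    }
)
cols = ["A", "B", "C"]

# [:, None] turns the 1-D cap array of shape (n,) into a column of shape (n, 1),
# which NumPy broadcasts across the three value columns.
ratios = df[cols].to_numpy() / df["cap"].to_numpy()[:, None]

# Wrap the plain array back into a DataFrame with the original labels.
new_df = pd.DataFrame(ratios, index=df.index, columns=cols)
print(new_df)
```

Because this works on raw arrays, it ignores index labels entirely and relies on the rows being in the same order, which holds here since both pieces come from the same frame.
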

vae

You can use the apply function, which by default passes each column to the lambda in turn. Try this:

df[['A', 'B', 'C']].apply(lambda x: x / df.cap)
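
A small sketch, with illustrative variable names, checking that the apply version gives the same frame as .div with axis=0. Each column arrives in the lambda as a Series sharing df's row index, so dividing it by df["cap"] aligns row by row.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [482, 79, 855, 5, 659],
        "B": [959, 45, 164, 0, 831],
        "C": [67, 2, 173, 1, 899],
        "cap": [1000, 100, 1000, 10, 1000],
    }
)
cols = ["A", "B", "C"]

# apply() defaults to axis=0, so the lambda is called once per column.
via_apply = df[cols].apply(lambda col: col / df["cap"])

# The vectorized form from the other answers, for comparison.
via_div = df[cols].div(df["cap"], axis=0)

print(via_apply)
print(via_apply.equals(via_div))
```

For large frames the .div form is usually the better default, since apply calls a Python function once per column, though with only three columns the difference is unlikely to matter.
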