6

I'm confused by the highlighted line. What exactly is this line doing? What does .div do? I looked through the documentation, which says:

"Floating division of dataframe and other, element-wise (binary operator truediv)"

I'm not exactly sure what this means. Any help would be appreciated!

(image omitted)

bugsyb

3 Answers

13

You can divide one dataframe by another, and pandas will automatically align the index and columns and then divide the matching values, e.g. df1 / df2
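As a rough illustration of that alignment, here is a minimal sketch. The frames, labels, and values are made up for the example; the point is only that values are matched by label rather than by position, and that a label present in just one frame produces NaN.

```python
import pandas as pd

# Hypothetical frames: same labels in a different order, plus an extra row "z" in df2.
df1 = pd.DataFrame({"a": [10, 20], "b": [30, 40]}, index=["x", "y"])
df2 = pd.DataFrame({"b": [3, 4, 5], "a": [2, 5, 1]}, index=["y", "x", "z"])

# Division pairs up cells by (row label, column label), not by position.
result = df1 / df2
print(result)

# Row "z" exists only in df2, so there is nothing to divide and the row is NaN.
print(result.loc["z"])
```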

If you divide a dataframe by a series, pandas automatically aligns the series index with the columns of the dataframe. It may be that you want to align the index of the series with the index of the dataframe instead. In that case, you will have to use the div method.

So instead of:

df / s

You use

df.div(s, axis=0)

This tells pandas to align the index of s with the index of df, then perform the division while broadcasting across the other dimension, in this case the columns.
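The difference between the two forms can be seen in a small sketch. The labels r1, r2, c1, c2 and the values are hypothetical, chosen so that the series index matches the row labels and none of the column labels.

```python
import pandas as pd

df = pd.DataFrame({"c1": [2.0, 4.0], "c2": [6.0, 8.0]}, index=["r1", "r2"])
s = pd.Series([2.0, 4.0], index=["r1", "r2"])

# df / s aligns s's index with df's columns. No labels match here,
# so the result has columns c1, c2, r1, r2 and every value is NaN.
by_columns = df / s
print(by_columns)

# axis=0 aligns s's index with df's index instead, so each row is
# divided by the matching entry of s.
by_rows = df.div(s, axis=0)
print(by_rows)
```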

piRSquared
  • Note: another difference with `/` is that `div` allows you to provide a fill_value for missing data in one of the inputs https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.div.html – onofricamila Apr 21 '20 at 19:57
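A minimal sketch of the fill_value behaviour mentioned in this comment, assuming two small made-up frames where one input has a missing value. With `/` the missing cell stays NaN; with `div(..., fill_value=1)` the missing input is replaced by 1 before dividing.

```python
import pandas as pd

nan = float("nan")

# Hypothetical data: df1 is missing its value in row "y".
df1 = pd.DataFrame({"a": [10.0, nan]}, index=["x", "y"])
df2 = pd.DataFrame({"a": [2.0, 4.0]}, index=["x", "y"])

plain = df1 / df2                  # the missing input propagates as NaN
filled = df1.div(df2, fill_value=1)  # the missing input is treated as 1

print(plain)
print(filled)
```

If the value is missing in both inputs at the same location, the result is still missing.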
1

In the example above, the line divides pclass_xt, along axis 0, by the Series that pclass_xt.sum(1) produces. Here .sum(1) adds up the values along axis=1, which gives, for each pclass, the combined total of survived and not survived. Then .div aligns those totals with the rows (axis 0) and divides the entire dataframe by them, i.e. each row is divided by the sum of that row.
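A minimal sketch of this row-wise normalization follows. The original pclass_xt data is not shown in the thread, so the counts below are made up; only the names pclass_xt, .sum and .div come from the discussion, and row_totals and pct are hypothetical helper names.

```python
import pandas as pd

# Made-up counts standing in for pclass_xt:
# rows are passenger classes, columns are 0 (not survived) and 1 (survived).
pclass_xt = pd.DataFrame({0: [10, 30, 60], 1: [40, 30, 20]}, index=[1, 2, 3])

# One total per row (per class): 50, 60, 80.
row_totals = pclass_xt.sum(axis=1)

# axis=0 matches row_totals to the row index, so each row is divided by its own total.
pct = pclass_xt.div(row_totals, axis=0)
print(pct)
```

Each row of pct then sums to 1, so the values can be read as per-class fractions.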

mck
0
    import numpy as np
    import pandas as pd

    # No random seed is set, so columns B and C differ from run to run.
    data = {"A": np.arange(10), "B": np.random.randint(1, 10, 10), "C": np.random.random(10)}
    df2 = pd.DataFrame(data=data)
    print("DataFrame values:\n", df2)

    # s1 has the same default index (0..9) as df2, so it aligns row by row.
    s1 = pd.Series(np.arange(1, 11))
    print("s1 series values:\n", s1)

    print("Result of Division:\n", df2.div(s1, axis=0))

    # How div works here, using the values in the output below:
    # df2 row 0 / s1[0]: 0/1, 2/1, 0.265396/1
    # df2 row 1 / s1[1]: 1/2, 2/2, 0.055646/2

Output:

    DataFrame values:
        A  B         C
    0  0  2  0.265396
    1  1  2  0.055646
    2  2  7  0.963006
    3  3  9  0.958677
    4  4  6  0.256558
    5  5  6  0.859066
    6  6  8  0.818831
    7  7  4  0.656055
    8  8  6  0.885797
    9  9  4  0.412497
    s1 series values:
     0     1
    1     2
    2     3
    3     4
    4     5
    5     6
    6     7
    7     8
    8     9
    9    10
    dtype: int64
    Result of Division:
               A         B         C
    0  0.000000  2.000000  0.265396
    1  0.500000  1.000000  0.027823
    2  0.666667  2.333333  0.321002
    3  0.750000  2.250000  0.239669
    4  0.800000  1.200000  0.051312
    5  0.833333  1.000000  0.143178
    6  0.857143  1.142857  0.116976
    7  0.875000  0.500000  0.082007
    8  0.888889  0.666667  0.098422
    9  0.900000  0.400000  0.041250

sameer_nubia