0

Here is a simplified version of my dataframe (the number of persons in my dataframe is way more than 3):

df = pd.DataFrame({'Person':['John','David','Mary','John','David','Mary'],
               'Sales':[10,15,20,11,12,18],
               })
  Person  Sales
0   John     10
1  David     15
2   Mary     20
3   John     11
4  David     12
5   Mary     18

I would like to add a column "Total" to this data frame, which is the sum of total sales per person

  Person  Sales  Total
0   John     10     21
1  David     15     27
2   Mary     20     38
3   John     11     21
4  David     12     27
5   Mary     18     38

What would be the easiest way to achieve this?

I have tried

df.groupby('Person').sum()

but the shape of the output is not congruent with the shape of df.

        Sales
Person       
David      27
John       21
Mary       38
Pepey
  • 5
  • 3

3 Answers3

2

The easiest way to achieve this is by using the pandas groupby and sum functions.

df['Total'] = df.groupby('Person')['Sales'].sum()

This will add a column to the dataframe with the total sales per person.

Serge Ballesta
  • 143,923
  • 11
  • 122
  • 252
cipher
  • 31
  • 2
  • I tried that and indeed I got a new column "Total", but all the values in Total are NaN's. I checked the values of Sales and they are int64. – Pepey Feb 08 '23 at 08:30
2

What you want is the transform method which can apply a function on each group:

df['Total'] = df.groupby('Person')['Sales'].transform(sum)

It gives as expected:

  Person  Sales  Total
0   John     10     21
1  David     15     27
2   Mary     20     38
3   John     11     21
4  David     12     27
5   Mary     18     38
Serge Ballesta
  • 143,923
  • 11
  • 122
  • 252
0

your 'Persons' column in the dataframe contains repeated values
it is not possible to apply a new column to this via groupby

I would suggest making a new dataframe based on sales sum
The below code will help you with that

newDf = pd.DataFrame(df.groupby('Person')['Sales'].sum()).reset_index()

This will create a new dataframe with 'Person' and 'sales' as columns.

Anirudh B M
  • 55
  • 1
  • 6