DataFrame:
import pandas as pd

df = pd.DataFrame([['CategoryA', 400], ['CategoryA', 600],
                   ['CategoryB', 1000], ['CategoryB', 2000]],
                  columns=['Category', 'ItemTotal'])
I used the following groupby to get the total dollar amount by category:
grouped = df.groupby('Category')
totals = grouped['ItemTotal'].sum()
totals
The output looks like this:
Category
CategoryA 1000
CategoryB 3000
Eventually I want a second column next to the sum that shows each category's share of the overall total, like so:
Category
CategoryA 1000 .25
CategoryB 3000 .75
I'd like to know how to do this by inserting the cumulative dollar amount into a new column and then creating a new column for share by dividing.
The approach in the link that @jezrael provided, "How do I create a new column from the output of pandas groupby().sum()?", puts the sum of each category next to every row of the original DataFrame:
df['Total'] = df['ItemTotal'].groupby(df['Category']).transform('sum')
df
Output:
Category ItemTotal Total
0 CategoryA 400 1000
1 CategoryA 600 1000
2 CategoryB 1000 3000
3 CategoryB 2000 3000
However, I'm looking for the sum of each category and the share of the total next to each unique category that has been grouped, not every row in the original dataframe.
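A minimal sketch of one way to get that shape, assuming the original four-row DataFrame from the top of the post. The name summary and the column names Total and Share are made up for illustration. It first collapses the data to one row per category, then divides each category total by the grand total:

```python
import pandas as pd

df = pd.DataFrame([['CategoryA', 400], ['CategoryA', 600],
                   ['CategoryB', 1000], ['CategoryB', 2000]],
                  columns=['Category', 'ItemTotal'])

# One row per category: sum ItemTotal within each group and
# turn the resulting Series into a DataFrame with a 'Total' column.
summary = df.groupby('Category')['ItemTotal'].sum().to_frame('Total')

# Share of the grand total: each category total divided by the sum of all totals.
summary['Share'] = summary['Total'] / summary['Total'].sum()

print(summary)
```

With the sample data this should give totals of 1000 and 3000 and shares of 0.25 and 0.75, indexed by Category. Calling summary.reset_index() afterwards would turn Category back into a regular column if that is preferred.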