Suppose I have a data frame as follows:
df = pd.DataFrame(np.array([[2018,'R1','C1',1],[2018,'R1','C2',2],[2018,'R1','C3',3],[2018,'R1','C4',4],[2018,'R1','C5',5],[2018,'R2','C6',6],[2018,'R2','C7',7],[2018,'R2','C8',8],[2018,'R2','C9',9],[2018,'R2','C10',10]]),columns=['Year', 'Region', 'Country', 'Spend'])
I'd like to add summary lines to each possible group by
(groups
) so that the output dataframe looks something like this:
I have learned how to add summary/total lines as a column based on this StackOverflow post. But I would like to have these total lines merged to the original data frame as show in the above screenshot and I would like to accomplish this in as few lines as possible (that is, avoiding having to call different groupby
combinations manually like below):
df['ByYearTotalCount'] = df.groupby(['Year'])['Spend'].transform('sum')
df['ByYearByRegionTotalCount'] = df.groupby(['Year','Region'])['Spend'].transform('sum')
Could anyone help me come up with the best pandas
-like way to accomplish this? Thank you in advance for your help!