I know there are some questions on this topic (like "Pandas: Cumulative sum of one column based on value of another"); however, none of them fulfill my requirements.

Let's say I have a dataframe like this one

Original dataframe.

I want to compute the cumulative sum of Cost grouped by Month, excluding the current row's value, in order to get the Desired column. By using groupby and cumsum I obtain the column CumSum, which includes the current row.

Output dataframe.

The code to generate the dataframe is:

import pandas as pd

df = pd.DataFrame({'Month': [1,1,1,2,2,1,3],
                   'Cost': [5,8,10,1,3,4,1]})
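
To make the problem concrete, here is a minimal sketch of the plain groupby plus cumsum attempt described in the question. The column name CumSum is taken from the question; the point is that each row's own Cost is counted in its running total.

```python
import pandas as pd

df = pd.DataFrame({'Month': [1, 1, 1, 2, 2, 1, 3],
                   'Cost': [5, 8, 10, 1, 3, 4, 1]})

# Inclusive running total per month: each row's own Cost is part of its sum.
df['CumSum'] = df.groupby('Month')['Cost'].cumsum()

print(df['CumSum'].tolist())  # [5, 13, 23, 1, 4, 27, 1]
```

The first row of every month already shows its own Cost here, whereas an exclusive running total would start each month at 0.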
ALollz
MatMorPau22

2 Answers

IIUC, you can use groupby.cumsum and then just subtract Cost:

df['cumsum_'] = df.groupby('Month').Cost.cumsum().sub(df.Cost)

print(df)

    Month  Cost  cumsum_
0      1     5        0
1      1     8        5
2      1    10       13
3      2     1        0
4      2     3        1
5      1     4       23
6      3     1        0
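
The subtraction works because the inclusive running total minus the current value is exactly the sum of the earlier rows in the same group. A minimal sketch that wraps this in a reusable helper follows; the function name `exclusive_group_cumsum` and its parameters are hypothetical, chosen only for illustration.

```python
import pandas as pd

def exclusive_group_cumsum(frame, group_col, value_col):
    """Running total of value_col within each group, excluding the current row.

    Hypothetical helper for illustration; not part of pandas.
    """
    inclusive = frame.groupby(group_col)[value_col].cumsum()
    # Removing the current value leaves the sum of the earlier rows only.
    return inclusive - frame[value_col]

df = pd.DataFrame({'Month': [1, 1, 1, 2, 2, 1, 3],
                   'Cost': [5, 8, 10, 1, 3, 4, 1]})

result = exclusive_group_cumsum(df, 'Month', 'Cost')
print(result.tolist())  # [0, 5, 13, 0, 1, 23, 0]
```

Both operands share the original index, so the subtraction aligns row by row even though the months are interleaved.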
yatu

You can do the following:

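# Running total per month (still includes the current row)
df['agg'] = df.groupby('Month')['Cost'].cumsum()
# Shift within each month so the current row is excluded; each month's first row becomes 0
df['Cumsum'] = df.groupby('Month')['agg'].shift().fillna(0)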
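
A shift-based approach has to shift the per-month running total rather than Cost itself, otherwise each row only sees its immediate predecessor. A minimal sketch of that idea in a single pass is below; the column name Desired follows the question, and using `shift(fill_value=0)` instead of `fillna(0)` is an assumption made here to keep the integer dtype.

```python
import pandas as pd

df = pd.DataFrame({'Month': [1, 1, 1, 2, 2, 1, 3],
                   'Cost': [5, 8, 10, 1, 3, 4, 1]})

# Within each month: build the running total, then move it down one row
# so every row only sees the rows before it. The first row of each
# month has no predecessor and is filled with 0.
df['Desired'] = (
    df.groupby('Month')['Cost']
      .transform(lambda s: s.cumsum().shift(fill_value=0))
)

print(df['Desired'].tolist())  # [0, 5, 13, 0, 1, 23, 0]
```

The lambda runs once per group, so on frames with very many groups the vectorized cumsum-and-subtract form is likely to be faster.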
Raghul Raj