Pandas dataframe rolling sum column with groupby

Question

I'm trying to create a new column that gives a rolling sum of values in the Values column. The rolling sum includes 4 rows i.e. the current row and the next three rows. I want to do this for each type in the 'Type' column.

However, if there are fewer than 4 rows before the next type starts, I want the rolling sum to use only the remaining rows. For example, if there are 2 rows after the current row for the current type, a total of 3 rows is used for the rolling sum. See the table below showing what I'm currently getting and what I expect.

Index	Type	Value	Current Rolling Sum	Expected Rolling Sum
1	left	5	22	22
2	left	9	34	34
3	left	0	NaN	25
4	left	8	NaN	25
5	left	17	NaN	17
6	straight	7	61	61
7	straight	4	77	77
8	straight	0	86	86
9	straight	50	97	97
10	straight	23	NaN	47
11	straight	13	NaN	24
12	straight	11	NaN	11

The following line of code is what I'm currently using to get the rolling sum.

rolling_sum = df.groupby('Type', sort=False)['Value'].rolling(4, min_periods = 3).sum().shift(-3).reset_index()
rolling_sum = rolling_sum.rename(columns={'Value': 'Rolling Sum'})

extracted_col = rolling_sum['Rolling Sum']
df = df.join(extracted_col)

I would really appreciate your help.

score 3 · Accepted Answer · answered Jul 13 '21 at 07:08

You can try running the rolling sum on the reversed values for each group and then reverse back afterward, using a min_periods of 1:

df['Rolling Sum'] = df.groupby('Type', sort=False)['Value'].apply(lambda x: x[::-1].rolling(4, min_periods=1).sum()[::-1])

Result:

   Index        Type    Value   Rolling Sum
0      1        left        5          22.0
1      2        left        9          34.0
2      3        left        0          25.0
3      4        left        8          25.0
4      5        left       17          17.0
5      6    straight        7          61.0
6      7    straight        4          77.0
7      8    straight        0          86.0
8      9    straight       50          97.0
9     10    straight       23          47.0
10    11    straight       13          24.0
11    12    straight       11          11.0

after a full day looking for a working solution, it's finally here! thanks! — Pedro Alonso, Aug 17 '22 at 19:49

Pandas dataframe rolling sum column with groupby

1 Answers1