I need to add a new feature that aggregates the last 5 rows of data. When a 6th row is added, the first row should be dropped so that only the most recent 5 rows are considered, as shown below. Here is a dummy data frame; new_feature is the expected output.

id    feature    new_feature

1       a            a
2       b            a+b
3       c            a+b+c
4       d            a+b+c+d
5       e            a+b+c+d+e
6       f            b+c+d+e+f
7       g            c+d+e+f+g
Divya

1 Answer

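Use Series.rolling with min_periods=1 and sum. A window of 5 covers the current row and the 4 rows before it, and min_periods=1 lets the first rows return a result computed from fewer than 5 values: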

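import pandas as pd

df = pd.DataFrame({'feature': [1, 2, 4, 5, 6, 2, 3, 4, 5]})
# Sum over a 5-row window; min_periods=1 allows partial windows at the start
df['new_feature'] = df['feature'].rolling(5, min_periods=1).sum()
print(df)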
   feature  new_feature
0        1          1.0
1        2          3.0
2        4          7.0
3        5         12.0
4        6         18.0
5        2         19.0
6        3         20.0
7        4         20.0
8        5         20.0
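
To check the window against the table in the question, here is a minimal sketch that assumes the feature column holds string labels (a to g) and that the expected new_feature is those labels joined with "+". It uses plain list slicing rather than rolling, since the numeric sum above does not produce joined strings. The `window` and `labels` names are only illustrative.

```python
import pandas as pd

# Dummy frame mirroring the question's table (assumed string labels)
df = pd.DataFrame({'id': range(1, 8), 'feature': list('abcdefg')})

window = 5  # number of most recent rows to keep
labels = df['feature'].tolist()

# For row i, join the labels from max(0, i - window + 1) through i inclusive
df['new_feature'] = [
    '+'.join(labels[max(0, i - window + 1): i + 1])
    for i in range(len(labels))
]
print(df)
```

The slice start `max(0, i - window + 1)` plays the same role as min_periods=1: the first four rows use whatever rows are available instead of waiting for a full window.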
jezrael