I am trying to calculate the rolling average of 4 days on the below dataset. The result should also be calculated based on group of 2 other columns.
For example:
df_time = pd.DataFrame({'A': [123, 123, 278, 278, 278, 123, 345, 278, 123,278, 278],
'B': [1, 2, 2, 2, 2, 1, 3, 2, 1, 2, 2],
'C': [0, 4, 2, 0, 4, 5, 3, 2, 1, 3, 2],
'D' : [pd.Timestamp('20130101'),
pd.Timestamp('20130102'),
pd.Timestamp('20130101'),
pd.Timestamp('20130102'),
pd.Timestamp('20130103'),
pd.Timestamp('20130103'),
pd.Timestamp('20130104'),
pd.Timestamp('20130104'),
pd.Timestamp('20130105'),
pd.Timestamp('20130106'),
pd.Timestamp('20130109')],
'rol_avg': [0, 2, 2, 1, 2, 5, 3, 3, 3, 2, 2.5] } )
df_time.sort_values(by='D')
sum_df = df_time.groupby(by=['A', 'B', 'D'])['C'].sum()
rolling_average_series = sum_df.groupby(level=[0, 1]).rolling(4, min_periods=1).mean()
returns
It is finding the rolling average of the past 4 occurrences of columns A and B. But what I need is rolling average of 4 days with respect to the Date column 'D' and grouped by column 'A' and 'B'
Here, for A=278 and B=2 for 09-01-2013 we have only 06-01-2013 which falls in 4days rolling average so the average should be (2+3)/2 = 2.5 How do I implement this functionality? Any help is greatly appreciated