How to compute a moving average of value for the previous date untill the first event?

Question

I have a dataframe with data where the first column is an identification number, ID1, the second is a date, DATE, and the third is some value, VALUE.

d = {'ID1': [1,2,3,4,1,2,4,1,3,2,4,1],
       'DATE': ['1/06/2016', '1/06/2016','2/06/2016','1/06/2016','3/06/2016', '4/06/2016','2/06/2016','5/06/2016','1/06/2016', '2/06/2016','2/06/2016','4/06/2016'], 'VALUE':[1.0, 3.0, 4.0, 2.0, 5.0, 0.6, 9.0, 10.0, 8.0, 100.0, 23.0, 1.0]}

df = pd.DataFrame(d)

I want to compute the average value, for each ID1, for the past dates. So, for instance, in the first row, where ID1 = 1, I would have a value of 5.33, for the second row, where ID1=2, I would have 50.3, and so on and so forth. If the last value is reached (for instance, the last value of ID1=1), the moving average should be the value of VALUE (1.0 in this case).

I know the existence of the rolling function, but I do not see exactly how to apply it here. I guess I should do some re-indexing, with the DATE column and make a groupby in order to group the data by the value of ID1 column.

Can someone give me some advice? Thanks!

score 1 · Answer 1 · answered Jun 10 '19 at 00:18

1

I think you are looking for expanding

s=df.groupby('ID1').VALUE.expanding(min_periods=1).mean().reset_index(level=0,drop=True)
df['new']=s

answered Jun 10 '19 at 00:18

BENY

317,841
20
164
234

Thanks @WeNYoBen. What you suggest does the trick, except that I want, for each row, the average of the previous dates, not including the date of the row. I would need some shift on the values. – Pere Munar Jun 10 '19 at 21:41

How to compute a moving average of value for the previous date untill the first event?

1 Answers1