I am having some trouble with pandas rolling. Here a simplify version of my dataset:
df2 = pd.DataFrame({
'A' : pd.Categorical(["test","train","test","train",'train','hello']),
'B' : (pd.Timestamp('2013-01-02 00:00:05'),
pd.Timestamp('2013-01-02 00:00:10'),
pd.Timestamp('2013-01-02 00:00:09'),
pd.Timestamp('2013-01-02 00:01:05'),
pd.Timestamp('2013-01-02 00:01:25'),
pd.Timestamp('2013-01-02 00:02:05')),
'C' : 1.}).sort_values('A').reset_index(drop=True)
>>> df2
A B C
0 hello 2013-01-02 00:02:05 1.0
1 test 2013-01-02 00:00:05 1.0
2 test 2013-01-02 00:00:09 1.0
3 train 2013-01-02 00:00:10 1.0
4 train 2013-01-02 00:01:05 1.0
5 train 2013-01-02 00:01:25 1.0
I would like to have a rolling window of 10s, to get the following output:
A count
0 hello 1
1 test 2
3 train 1
I try the groupby and rolling.
df2.groupby('A').rolling('10s', on='B', closed='right').C.sum()
I get the rolling windows from the past '10s' observation, which is not what i am looking for:
A B
hello 2013-01-02 00:02:05 1.0
test 2013-01-02 00:00:05 1.0
2013-01-02 00:00:09 2.0
train 2013-01-02 00:00:10 1.0
2013-01-02 00:01:05 1.0
2013-01-02 00:01:25 1.0
I also try resampling, but I am not able to get the result.
grouped = df3.set_index('B').groupby('A').resample('S' )['C'].count()
grouped.reset_index().groupby('A').rolling(window=10,on='B' , min_periods=1).sum()