How do you group data into time buckets and count the number of observations in each bucket, filling empty buckets with 0?
I have the following data set in a dataframe: '''
df=
Time
0:10
5:00
5:00
5:02
5:03
5:05
5:07
5:09
6:00
6:00
6:00
'''
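For reproducibility, here is a minimal sketch of how the sample above could be built. It assumes the column is named Time and holds plain "H:MM" strings, which is how the data is shown:

```python
import pandas as pd

# Sample data from the listing above: one row per unit in the queue.
# Assumption: values are stored as "H:MM" strings, not as datetimes.
df = pd.DataFrame({
    "Time": ["0:10", "5:00", "5:00", "5:02", "5:03", "5:05",
             "5:07", "5:09", "6:00", "6:00", "6:00"]
})

print(df.head())
```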
I would like to create 5-minute time buckets running from 00:00 to 23:59 and count how many times fall into each bucket, with 0 for buckets that have none. Basically, each time represents a unit in a queue, and I want to calculate how many units there are in each time bucket.
From the above data (an example set), I would like to get the following:
Time Obs
00:00 0
00:05 0
00:10 1
00:15 0
...
05:00 2
05:05 3
05:10 2
06:00 3
...
I tried the following code:
df['time_bucket'] = pd.to_datetime(df['Time']).dt.ceil('5min')
which did not work.
I tried the following as well:
df1 = df.resample('5min', on='time_bucket').count()
which results in :
                     Time  time_bucket
time_bucket
2020-05-24 00:10:00     1            1
2020-05-24 00:15:00     0            0
2020-05-24 00:20:00     0            0
2020-05-24 00:25:00     0            0
2020-05-24 00:30:00     0            0
The output starts at 00:10 rather than 00:00; it seems to start from the first value in the time_bucket column.
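As far as I understand, resample only creates bins between the first and last timestamps present in the data, which would explain the missing 00:00 and 00:05 rows. A minimal sketch of one possible workaround is to reindex the resampled counts onto a full-day grid. The explicit format="%H:%M" (pandas then uses 1900-01-01 as a placeholder date) and the names ts, grid, and full are my own assumptions for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "Time": ["0:10", "5:00", "5:00", "5:02", "5:03", "5:05",
             "5:07", "5:09", "6:00", "6:00", "6:00"]
})

# Parse with an explicit format so every row lands on the same
# placeholder date (1900-01-01) regardless of when the code runs.
ts = pd.to_datetime(df["Time"], format="%H:%M")
df["time_bucket"] = ts.dt.ceil("5min")

# Same idea as the attempt above: bins only cover 00:10 .. 06:00.
counts = df.resample("5min", on="time_bucket")["Time"].count()

# Full-day grid of 5-minute labels, 00:00 .. 23:55 (288 buckets).
grid = pd.date_range("1900-01-01 00:00", "1900-01-01 23:55", freq="5min")

# Buckets missing from `counts` are filled with 0.
# Caveat: times after 23:55 would ceil to the next day and be dropped here.
full = counts.reindex(grid, fill_value=0)
full.index = full.index.strftime("%H:%M")

print(full.loc[["00:00", "00:10", "05:00", "05:05", "05:10", "06:00"]])
```

Because the buckets are built with ceil, 05:02 and 05:03 are counted under 05:05, which is what the expected output above shows.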
Basically, I want the new column to hold the count. Eventually, I would like to write a function that takes the bucket size as a parameter (e.g. 5, 10, or 15 minutes) and creates a table of counts for that bucket size.
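To make the goal concrete, here is a rough sketch of the kind of function I have in mind. The name bucket_counts, its minutes parameter, and the Time/Obs output columns are hypothetical, and it assumes the input is "H:MM" strings and that each time is rounded up to its bucket edge:

```python
import pandas as pd


def bucket_counts(times, minutes=5):
    """Count observations per `minutes`-wide bucket across one full day.

    Hypothetical helper: `times` is an iterable of "H:MM" strings.
    """
    freq = f"{minutes}min"
    # Explicit format: pandas assigns the placeholder date 1900-01-01.
    ts = pd.to_datetime(pd.Series(list(times)), format="%H:%M")
    # Round each time up to its bucket edge (05:02 -> 05:05 when minutes=5).
    buckets = ts.dt.ceil(freq)
    # All bucket labels for the day, starting at 00:00.
    grid = pd.date_range("1900-01-01 00:00", "1900-01-01 23:59", freq=freq)
    # Count per bucket, then add the empty buckets as 0.
    counts = buckets.value_counts().reindex(grid, fill_value=0)
    return pd.DataFrame({"Time": grid.strftime("%H:%M"),
                         "Obs": counts.to_numpy()})


sample = ["0:10", "5:00", "5:00", "5:02", "5:03", "5:05",
          "5:07", "5:09", "6:00", "6:00", "6:00"]

table_5 = bucket_counts(sample, 5)
table_10 = bucket_counts(sample, 10)
print(table_5[table_5["Obs"] > 0])
```

Calling it with 5, 10, or 15 would then give one table per bucket size, each covering the whole day.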