I have dataframe (df
) of the form :
SERV_OR_IOR_ID IMP_START_TIME IMP_CLR_TIME TIME_BIN
0 -1447310116 23:59:32.873000 00:11:28.755000
1 1673545041 00:00:09.182000 00:01:06.912000
2 -743717696 23:59:57.312000 00:00:32.428000
3 -746373244 23:59:57.915000 00:05:33.232000
I am dividing the 24hrs of a day into 'Time Bins' of 30 minutes, so the zeroth time bin would be from 00:00-00:30, first from 00:30-01:00 and so on.Post this, I want to assign a value in the 'TIME_BIN'
column for each row depending on which time window does the 'IMP_START_TIME'
fall in.
For example :
For the row '1'
, I would assign the value '0'
since it falls in the time window '00:00'-00:30
.
For this, I wrote the following code :
interval = dt.timedelta(minutes=30)
start = dt.time(0,0,0)
grid =[(dt.datetime.combine(dt.date(1,1,1),start)+n*interval).time() for n in range(48)]
for j in range(len(df)):
for i in range(0,47):
if df.ix[j,1] <grid[i+1] and df.ix[j,1] > grid[i]:
df.ix[j,3] = i
elif df.ix[j,1] > grid[47]:
df.ix[j,3] = 47
This code takes a lot of time to run because of the nested for loops. Is there a more efficient way to do the same thing ?