Sample DataFrame columns:
process_id | app_path | start_time
The desired output DataFrame should be multi-indexed on the date and time values in the start_time column: the unique dates form the first index level and one-hour ranges form the second. For each time slot, the count of records that fall in it should be calculated.
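    import numpy as np
    import pandas

    def activity(self):
        # find unique dates in the start_time column
        columns = self.df['start_time'].map(lambda x: x.date()).unique()
        # one row of zero counts, one column per unique date
        result = pandas.DataFrame(np.zeros((1, len(columns))), columns=columns)
        for i in range(len(self.df)):
            col = self.df.iloc[i]['start_time'].date()
            # increment the count for this record's date
            # (.loc replaces get_value, which was removed in pandas 1.0)
            result.loc[0, col] += 1
        return result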
I have tried the above code, which gives the following output:
    15-07-2014  16-07-2014  17-07-2014  18-07-2014
    3217        2114        1027        3016
I also want to count the records on a per-hour basis.
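One possible way to get per-hour counts is to group on the date and the hour of start_time together. The sketch below assumes start_time holds (or can be converted to) pandas datetimes. The function name hourly_activity, the level names date and hour, the column name count, and the sample rows are all made up for illustration. The hour value h stands for the slot from h:00 up to (but not including) h+1:00.

```python
import pandas as pd


def hourly_activity(df):
    """Count records per (date, one-hour slot) of start_time.

    Returns a DataFrame with a two-level index (date, hour)
    and a single 'count' column. Names are illustrative only.
    """
    start = pd.to_datetime(df["start_time"])
    dates = start.dt.date.rename("date")   # first index level
    hours = start.dt.hour.rename("hour")   # second index level (0-23)
    # size() counts the rows in each (date, hour) group
    counts = df.groupby([dates, hours]).size().rename("count")
    return counts.to_frame()


# Hypothetical sample data, not taken from the real db file
sample = pd.DataFrame({
    "process_id": [1, 2, 3, 4],
    "app_path": ["a.exe", "b.exe", "a.exe", "c.exe"],
    "start_time": pd.to_datetime([
        "2014-07-15 09:05:00",
        "2014-07-15 09:40:00",
        "2014-07-15 13:10:00",
        "2014-07-16 09:00:00",
    ]),
})

result = hourly_activity(sample)
print(result)
```

Hours with no records do not appear in this result. If every slot should be listed with a zero count, the result could be reindexed against the full set of date and hour combinations.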