0

I have a dataframe with 15minutes observations for 10 years. I want to find the 15minutes that has often the highest value over the years.

time_start  count_id    location_id obs
time                
2000-03-07 07:30:00 2000-03-07 07:30:00-05:00   8182    3939    2.0
2000-03-07 07:45:00 2000-03-07 07:45:00-05:00   8182    3939    0.0
2000-03-07 08:00:00 2000-03-07 08:00:00-05:00   8182    3939    13.0
2000-03-07 08:15:00 2000-03-07 08:15:00-05:00   8182    3939    12.0
2000-03-07 08:30:00 2000-03-07 08:30:00-05:00   8182    3939    6.0
... ... ... ... ...
2000-03-01 17:45:00 2000-03-01 17:45:00-05:00   8193    5600    40.0
2000-01-11 07:30:00 2000-01-11 07:30:00-05:00   8194    5601    59.0
2000-01-11 07:45:00 2000-01-11 07:45:00-05:00   8194    5601    50.0
2000-01-11 08:00:00 2000-01-11 08:00:00-05:00   8194    5601    37.0
2000-01-11 08:15:00 2000-01-11 08:15:00-05:00   8194    5601    31.0

I used the following code to create a histogram of the average of 10years of observations (obs) for each 15minutes in 24 hours and have the highest peds_sum with the darkest colour.

counts_df = stationData10['obs'].groupby([stationData10.index.time, pd.Grouper(freq='15Min')]).mean().to_frame(name='n')
counts_df.rename_axis(['15Min', 'day'], inplace=True)
counts_df.hvplot.heatmap(title='Record count', x='15Min', y='day', C='n', width=FIGSIZE[0], height=FIGSIZE[1])
GeoBeez
  • 920
  • 2
  • 12
  • 20
  • please post a sample of your data so we can reproduce your issue – RJ Adriaansen Aug 28 '21 at 19:54
  • Let me see if I can help, though I'm confused; what does this code do that is different from what you want? Also, can you provide a minimal example of the dataframe as text or as code that can be copied directly? – Steele Farnsworth Aug 28 '21 at 19:54
  • @Steele Farnsworth I added a table. It is the same for 10 years. it does not print anything :( the count_id changes with the date and location, so it is not a unique identifier. – GeoBeez Aug 28 '21 at 20:25

1 Answers1

0

I solved my problem: to get 8hours that are discrete,

df1 = df.between_time('7:30','11:30')
df2 = df.between_time('13:00','14:45')
df3= df.between_time('16:00','17:45')
df_final=pd.DataFrame().append([df1,df2, df3])

and then,

total_df_final = df_final.groupby(['count_id','location_id'])['obs'].sum()
print ("AT the location{} on day of {}, the maximum observation of {} were recoded.".
       format (total_df_final .idxmax()[1],
               total_df_final .idxmax()[0],total_df_final .max()))
GeoBeez
  • 920
  • 2
  • 12
  • 20