I have a data frame with 20 years of precipitation data for 5 stations. The observations are collected at 15-minute intervals, non-continuously, for 8 hours a day. I want to find the station and the day on which the total precipitation over those 8 hours (7:00 to 10:00, 13:00 to 14:00, and 16:00 to 17:00) is highest.
What I want to do is first resample the data to hourly totals for each station, then find the overlapping 8 hours for each location, and then find the maximum value.
My data frame:
time_start | obs_id | station_id | precipition |
---|---|---|---|
2000-01-11 07:00:00-05:00 | 1 | st_1 | 10 |
2000-01-11 07:30:00-05:00 | 1 | st_1 | 2 |
2000-01-11 07:45:00-05:00 | 1 | st_1 | 1 |
2000-01-11 09:00:00-05:00 | 1 | st_1 | 3 |
2000-01-11 09:15:00-05:00 | 1 | st_1 | 1 |
2000-01-11 09:30:00-05:00 | 1 | st_1 | 0 |
2000-01-11 09:45:00-05:00 | 1 | st_1 | 1 |
2000-01-12 07:00:00-05:00 | 1 | st_1 | 10 |
2000-01-12 07:30:00-05:00 | 2 | st_1 | 2 |
2000-01-12 07:45:00-05:00 | 2 | st_1 | 1 |
2000-01-12 09:00:00-05:00 | 2 | st_1 | 3 |
2000-01-12 09:15:00-05:00 | 2 | st_1 | 1 |
2000-01-12 09:30:00-05:00 | 2 | st_1 | 0 |
2000-01-12 09:45:00-05:00 | 2 | st_1 | 1 |
2000-01-11 07:00:00-05:00 | 1 | st_2 | 10 |
2000-01-11 07:30:00-05:00 | 1 | st_2 | 2 |
2000-01-11 07:45:00-05:00 | 1 | st_2 | 1 |
2000-01-11 09:00:00-05:00 | 1 | st_2 | 3 |
2000-01-11 09:15:00-05:00 | 1 | st_2 | 1 |
2000-01-11 09:30:00-05:00 | 1 | st_2 | 0 |
2000-01-11 09:45:00-05:00 | 1 | st_2 | 1 |
2000-01-12 07:00:00-05:00 | 1 | st_2 | 10 |
2000-01-12 07:30:00-05:00 | 2 | st_2 | 2 |
2000-01-12 07:45:00-05:00 | 2 | st_2 | 1 |
2000-01-12 09:00:00-05:00 | 2 | st_2 | 3 |
2000-01-12 09:15:00-05:00 | 2 | st_2 | 1 |
2000-01-12 09:30:00-05:00 | 2 | st_2 | 0 |
2000-01-12 09:45:00-05:00 | 2 | st_2 | 1 |
I tried this code, but it does not work:

    df_H = df.resample('H', on='time_start', closed='right').sum().reset_index()

I want a table that shows, for each location, the times in sorted order together with the summed precipitation.
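
For reference, here is a minimal sketch of the kind of result being asked for, not a definitive solution. It rests on several assumptions: the sample rows are made up in the shape of the table above, `time_start` is parsed as a timezone-aware datetime, each observation is assigned to the hour it starts in (so bins are labeled by the start of the hour, unlike `closed='right'`), and `wanted_hours` is a hypothetical list taken from the windows named in the question. Grouping by `station_id` before summing keeps the stations separate, which a plain `resample` on the whole frame does not do.

```python
import pandas as pd

# Made-up sample rows in the same shape as the table in the question.
df = pd.DataFrame({
    "time_start": pd.to_datetime([
        "2000-01-11 07:00:00-05:00", "2000-01-11 07:30:00-05:00",
        "2000-01-11 09:00:00-05:00", "2000-01-12 07:00:00-05:00",
        "2000-01-11 07:00:00-05:00", "2000-01-12 09:15:00-05:00",
    ]),
    "station_id": ["st_1", "st_1", "st_1", "st_1", "st_2", "st_2"],
    "precipition": [10, 2, 3, 10, 10, 1],  # column name spelled as in the table
})

# Step 1: hourly totals per station. Each timestamp is floored to the
# start of its hour, then summed within (station, hour).
df["hour"] = df["time_start"].dt.floor("h")
hourly = df.groupby(["station_id", "hour"], as_index=False)["precipition"].sum()

# Step 2: keep only the hours of interest (assumption: hour starts that
# fall inside 7-10, 13-14 and 16-17, as listed in the question).
wanted_hours = [7, 8, 9, 13, 16]
in_window = hourly[hourly["hour"].dt.hour.isin(wanted_hours)]

# Step 3: total per station and calendar day, then pick the largest total.
daily = (
    in_window.assign(day=in_window["hour"].dt.date)
    .groupby(["station_id", "day"], as_index=False)["precipition"]
    .sum()
)
best = daily.loc[daily["precipition"].idxmax()]

print(hourly)  # sorted by station, then by hour
print(best)    # station and day with the highest windowed total
```

Here `hourly` is the per-station table sorted by time, and `best` holds the station and day with the largest total. If several station-days tie for the maximum, `idxmax` returns only the first one.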