Hi guys,

I must be missing something very obvious, but
I have a datetime series sampled at an hourly rate. I need to downsample it to a daily rate, which is simple enough using resample('D').
But I cannot downsample it using the mean. Instead I need to choose one hour of the day (00:00, for example) and use its value as the value for that day. Before:

datetime              values
2018-05-08 00:00:00     0.1
2018-05-08 01:00:00     0.5
2018-05-08 02:00:00     0.7
2018-05-08 03:00:00     0.4
2018-05-08 04:00:00     0.7

Desired Output

datetime              values
2018-05-08             0.1

Is there a method in resample for this, or should I use another approach?

Best

Edit

First, I have a big datetime series.

datetime              values
2018-05-08 00:00:00     0.1
2018-05-08 01:00:00     0.5
2018-05-08 02:00:00     0.7
2018-05-08 03:00:00     0.4
2018-05-08 04:00:00     0.7

Then I applied a running average, maintaining the hourly rate.

df['values'] = df['values'].rolling(168, center=True).mean()

I use 168 because I need 3 days before and 3 days after at the hourly rate.
From here I need to downsample, but if I use the standard resample method it will average the values again.

df = df.resample('D').mean()
Jan Christoph Terasa
joelmoliv
  • @pault the thing is that I can't average it. I will provide more info in the following comment. – joelmoliv Jan 17 '19 at 17:22
  • @joel [edit] your question to provide more info. Don't add it in the comments. Also provide your desired output. – pault Jan 17 '19 at 17:23

1 Answer

You can apply whatever function you want. Some are already implemented for you (such as mean and sum, but also first and last):

df.resample('D').first()
#             values
# datetime          
# 2018-05-08     0.1
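
For a self-contained check, here is a minimal sketch on made-up data (the two-day frame below is hypothetical, not the asker's data). One thing to keep in mind: first() returns the first non-null value in each daily bin, so it only equals the 00:00 reading when that row exists and is not NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two days of hourly values 0.0, 1.0, ..., 47.0.
idx = pd.date_range("2018-05-08", periods=48, freq=pd.Timedelta(hours=1), name="datetime")
df = pd.DataFrame({"values": np.arange(48, dtype=float)}, index=idx)

# One row per calendar day, holding the first value seen on that day.
daily = df.resample("D").first()
print(daily)

# If the 00:00 value is NaN, first() falls back to the next non-null value of that day.
df_gap = df.copy()
df_gap.iloc[0, 0] = np.nan
daily_gap = df_gap.resample("D").first()
print(daily_gap)
```

With complete data the two days come out as 0.0 and 24.0; with the 00:00 value blanked out, the first day silently becomes 1.0 (the 01:00 reading).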

But you can also apply any function of your own; it is passed the whole group to operate on, just like with groupby.

This, for example, takes the last value before 2 am (assuming the dataframe is already sorted by the index):

import datetime

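def last_before_2_am(group):
    # Keep only this day's rows whose time of day is earlier than 02:00.
    before_2_am = group[group.index.time < datetime.time(2, 0, 0)]
    # Rows are in chronological order, so the last one is the latest before 02:00.
    # Note: a day with no rows before 02:00 would make iloc[-1] fail.
    return before_2_am.iloc[-1]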

df.resample('D').apply(last_before_2_am)
#             values
# datetime          
# 2018-05-08     0.5
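
If you need exactly the 00:00 sample rather than the first available one (for instance after a centered rolling mean, which leaves NaN at both ends of the series), one option is to skip resample and filter on the time of day instead. This is a minimal sketch, assuming an hourly DatetimeIndex; the ten-day frame and the column name smoothed are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: ten days of hourly values 0.0, 1.0, ..., 239.0.
idx = pd.date_range("2018-05-01", periods=240, freq=pd.Timedelta(hours=1), name="datetime")
df = pd.DataFrame({"values": np.arange(240, dtype=float)}, index=idx)

# Centered 168-hour running mean; `center` is an argument of rolling(), not of mean().
# An even window cannot be perfectly symmetric around each timestamp.
df["smoothed"] = df["values"].rolling(168, center=True).mean()

# Keep only the rows stamped 00:00, then drop the time part of the index.
at_midnight = df.loc[df.index.hour == 0, "smoothed"]
at_midnight.index = at_midnight.index.normalize()
print(at_midnight)
```

Days whose 00:00 window is incomplete stay NaN here instead of being replaced by a later hour. df.at_time("00:00") should give the same selection as the boolean mask.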
Graipher