Python - Downsample using resample not using average/mean

Question

Hi guys,

I must be missing something very obvious, but
I have a datetime series at an hourly rate. I need to downsample it to a daily rate, which is pretty simple using resample('D').
But I cannot downsample it using the mean. I need, for example, to choose one hour of the day (00:00, say) and use its value as the value for that day. Before:

datetime              values
2018-05-08 00:00:00     0.1
2018-05-08 01:00:00     0.5
2018-05-08 02:00:00     0.7
2018-05-08 03:00:00     0.4
2018-05-08 04:00:00     0.7

Desired Output

datetime              values
2018-05-08             0.1

Is there any method in resample for this, or should I use another method?

Best

Edit

First, I have a big datetime series.

datetime              values
2018-05-08 00:00:00     0.1
2018-05-08 01:00:00     0.5
2018-05-08 02:00:00     0.7
2018-05-08 03:00:00     0.4
2018-05-08 04:00:00     0.7

Then I applied a running average, maintaining the hourly rate.

df['values'] = df['values'].rolling(168, center=True).mean()

I use 168 (7 days × 24 hours) because I need 3 days before and 3 days after at an hourly rate.
From here I need to downsample, but if I use the standard resample method it will average the data again.

df = df.resample('D').mean()

@pault The thing is that I can't average it. I will provide more info in the following comment. — joelmoliv, Jan 17 '19 at 17:22
@joel [edit] your question to provide more info. Don't add it in the comments. Also provide your desired output. — pault, Jan 17 '19 at 17:23

Graipher · Accepted Answer · 2019-01-17T17:34:32.133

You can apply whatever function you want. Some of them are already implemented for you (such as mean and sum, but also first and last):

df.resample('D').first()
#             values
# datetime          
# 2018-05-08     0.1
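
For a self-contained check, here is a minimal sketch on made-up data (two days of hourly values 0, 1, 2, ...; the frame and variable names are only illustrative). It also shows at_time as one possible way to pick an explicit clock time instead of relying on the first row of each day:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 48 hourly values starting at 2018-05-08 00:00
index = pd.date_range("2018-05-08", periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"values": np.arange(48, dtype=float)}, index=index)
df.index.name = "datetime"

# first() keeps the first value of each daily bin, here the 00:00 sample
daily_first = df.resample("D").first()

# Alternative: select an explicit clock time, then drop the time of day
daily_midnight = df.at_time("00:00").copy()
daily_midnight.index = daily_midnight.index.normalize()

print(daily_first)
print(daily_midnight)
```

Changing "00:00" to another time such as "06:00" would select that hour for every day instead.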

But you can apply any function you want; it will be passed the whole group to operate on, just like with groupby.

This for example takes the last time before 2 am (assuming the dataframe is already sorted by the index):

import datetime

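def last_before_2_am(group):
    # Keep only the rows of this day whose clock time is earlier than 02:00
    before_2_am = group[group.index.time < datetime.time(2, 0, 0)]
    # Last remaining row; raises IndexError if the day has no rows before 02:00
    return before_2_am.iloc[-1]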

df.resample('D').apply(last_before_2_am)
#             values
# datetime          
# 2018-05-08     0.5
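
Applied to the smoothed series from the question, a rough sketch could look like the following. The data is made up (ten days of an hourly ramp) and the column name smoothed is only illustrative. One caveat: first() skips missing values by default, so on days where the centered window leaves NaN at 00:00 it would return a later hour. asfreq() is used here instead because it takes the sample sitting exactly on each daily timestamp:

```python
import numpy as np
import pandas as pd

# Hypothetical data: ten days of hourly values 0, 1, 2, ...
index = pd.date_range("2018-05-01", periods=240, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"values": np.arange(240, dtype=float)}, index=index)

# Centered 168-hour running mean; center is an argument of rolling(), not mean()
df["smoothed"] = df["values"].rolling(168, center=True).mean()

# Keep only the 00:00 sample of each day, without averaging again.
# Days whose 00:00 window is incomplete stay NaN.
daily = df["smoothed"].resample("D").asfreq()

print(daily)
```

Whether NaN at the edges is acceptable depends on the use case; passing min_periods to rolling() is one way to get values there, at the cost of shorter windows.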
