I need to downsample a DataFrame from hourly to daily. This is straightforward in pandas, but I've hit a problem I can't resolve. The data frame looks like this:

datetime prod
2018-03-13 19:00:00 38.700000
2018-03-13 20:00:00 38.700000
2018-03-14 00:00:00 38.600000
2018-03-15 08:00:00 38.200000
2018-03-15 11:00:00 38.100000
2018-03-15 14:00:00 38.100000
2018-03-15 15:00:00 38.100000
2018-03-15 21:00:00 38.100000
2018-03-16 00:00:00 38.000000
2018-03-16 06:00:00 38.000000
2018-03-16 12:00:00 38.000000
2018-03-16 15:00:00 37.900000
2018-03-16 19:00:00 38.000000
2018-03-16 20:00:00 37.900000
2018-03-17 09:00:00 37.900000
2018-03-17 20:00:00 37.700000

I run the resample function like this:

df['prod'] = df['prod'].resample('24H').mean()

I've tried 'D' instead of '24H' and it always gives me:

datetime prod
2018-03-14 38.600000
2018-03-16 37.966667
2018-03-28 36.625000
.... ...

It is missing the days that don't have a value at 00:00:00. Any suggestions on how to fix this?

not_speshal
joelmoliv

1 Answer

You can also use pd.Grouper as an alternative to resample if datetime is a column rather than the index.

out = df.groupby(pd.Grouper(key='datetime', freq='D')).mean().reset_index()
print(out)

# Output:
    datetime       prod
0 2018-03-13  38.700000
1 2018-03-14  38.600000
2 2018-03-15  38.120000
3 2018-03-16  37.966667
4 2018-03-17  37.800000
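
If datetime is already the index, resample itself should give the same daily means. A likely explanation for the missing days in the question is the assignment back to df['prod']: pandas aligns the daily result to the original hourly index, so only rows stamped exactly 00:00:00 find a match. The sketch below uses a few rows from the question and assumes datetime is the index; the names daily and aligned are only illustrative.

```python
import pandas as pd

# A few rows from the question, with 'datetime' as the index (assumed setup).
df = pd.DataFrame(
    {"prod": [38.7, 38.7, 38.6, 38.2, 38.1]},
    index=pd.to_datetime([
        "2018-03-13 19:00:00",
        "2018-03-13 20:00:00",
        "2018-03-14 00:00:00",
        "2018-03-15 08:00:00",
        "2018-03-15 11:00:00",
    ]),
)
df.index.name = "datetime"

# Keep the resampled result in its own variable instead of
# writing it back into the hourly frame.
daily = df["prod"].resample("D").mean()
print(daily)

# Aligning the daily result to the hourly index (which is what
# assigning to df['prod'] does) keeps only the midnight row.
aligned = daily.reindex(df.index)
print(aligned)
```

Here daily has one row per calendar day, while aligned is NaN everywhere except at 2018-03-14 00:00:00.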
Corralien