
I have a time series with a resolution of 5 min. Each entry represents the amount of energy used in the preceding 5 minutes, i.e. the entry for 00:25 gives the energy usage from 00:20:01 to 00:25:00.

import pandas as pd

idx = pd.date_range(start='2021-11-30 00:00', end='2021-11-30 1:00', freq='5T')
ser = pd.Series(index=idx, data=range(len(idx)))

I need to resample the data to 15-minute intervals. I can do that with

ser.resample('15T', closed='right', label='right').sum()

which gives:

2021-11-30 00:00:00     0
2021-11-30 00:15:00     6
2021-11-30 00:30:00    15
2021-11-30 00:45:00    24
2021-11-30 01:00:00    33
Freq: 15T, dtype: int64
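
As a cross-check of what `closed='right', label='right'` means here, the same sums can be reproduced without pandas. This is only a minimal sketch: the helper `right_edge` is a hypothetical name, and the bins are assumed to be aligned to the first timestamp (midnight), so that each bin is the half-open interval (edge - 15 min, edge] and is labelled by its right edge.

```python
from datetime import datetime, timedelta

start = datetime(2021, 11, 30, 0, 0)
step = timedelta(minutes=5)
bin_width = timedelta(minutes=15)

# Same data as `ser`: 13 stamps from 00:00 to 01:00 with values 0..12.
stamps = [start + i * step for i in range(13)]
values = list(range(13))


def right_edge(ts):
    """Right edge of the bin (edge - 15 min, edge] that contains ts.

    Hypothetical helper; assumes bins are aligned to `start`.
    """
    offset = ts - start
    n_bins = -(-offset // bin_width)  # ceiling division on timedeltas
    return start + n_bins * bin_width


sums = {}
for ts, v in zip(stamps, values):
    edge = right_edge(ts)
    sums[edge] = sums.get(edge, 0) + v

for edge, total in sorted(sums.items()):
    print(edge, total)
```

The printed totals are 0, 6, 15, 24 and 33 for the edges 00:00 through 01:00, the same as the resampled series above.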

This is exactly what I want. Reflecting on my code, I thought that my index should consist of periods rather than timestamps. So I did:

ser_period = ser.copy()
ser_period.index = ser_period.index.to_period(freq='5T')

As far as I can see, nothing essential has changed. But if I do

ser_period.resample('15T', closed='right', label='right').sum()

I get

2021-11-29 23:45     3
2021-11-30 00:00    12
2021-11-30 00:15    21
2021-11-30 00:30    30
2021-11-30 00:45    12
2021-11-30 01:00     0
Freq: 15T, dtype: int64
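
One arithmetic observation about these numbers: the first five sums (3, 12, 21, 30, 12) are what left-closed bins [edge, edge + 15 min) starting at 00:00 would give. The pandas-free sketch below checks that under the assumption that bins are aligned to midnight. It only describes the numbers and is not a claim about how pandas treats `closed` and `label` for a PeriodIndex.

```python
from datetime import datetime, timedelta

start = datetime(2021, 11, 30, 0, 0)
step = timedelta(minutes=5)
bin_width = timedelta(minutes=15)

# Same values as in `ser_period`: 0..12 at 5-minute steps from 00:00.
values = list(range(13))

left_sums = {}
for i, v in enumerate(values):
    ts = start + i * step
    # Left edge of the bin [edge, edge + 15 min) that contains ts
    # (assumption: bins are aligned to `start`).
    left = start + ((ts - start) // bin_width) * bin_width
    left_sums[left] = left_sums.get(left, 0) + v

for left, total in sorted(left_sums.items()):
    print(left, total)
```

With left-edge labels these sums fall on 00:00 through 01:00, whereas in the output above the same values appear one label earlier (starting at 23:45), followed by a trailing 0.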

I fiddled around with the arguments but didn't manage to fix this, and I do not understand why it happens. Of course I can stick with the DatetimeIndex, but I am still confused. Why does resampling the PeriodIndex give a different result? And was I wrong to convert the index? Isn't a time span of 5 min a reasonable period?

Durtal