I have a time series with a resolution of 5 min. Each entry represents the amount of energy used in the preceding 5 minutes, i.e. the entry for 00:25 gives the energy usage from 00:20:01 to 00:25:00.
import pandas as pd
idx = pd.date_range(start='2021-11-30 00:00', end='2021-11-30 1:00', freq='5T')
ser = pd.Series(index=idx, data=range(len(idx)))
I need to resample the data to 15-minute intervals. I can do that with
ser.resample('15T', closed='right', label='right').sum()
which gives:
2021-11-30 00:00:00 0
2021-11-30 00:15:00 6
2021-11-30 00:30:00 15
2021-11-30 00:45:00 24
2021-11-30 01:00:00 33
Freq: 15T, dtype: int64
Exactly what I want. Reflecting on my code, I thought that my index should not consist of timestamps but of periods. So I did:
ser_period = ser.copy()
ser_period.index = ser_period.index.to_period(freq='5T')
As far as I can see, nothing essential has changed. But if I do
ser_period.resample('15T', closed='right', label='right').sum()
I get
2021-11-29 23:45 3
2021-11-30 00:00 12
2021-11-30 00:15 21
2021-11-30 00:30 30
2021-11-30 00:45 12
2021-11-30 01:00 0
Freq: 15T, dtype: int64
I fiddled around with the arguments but didn't manage to fix this, and I do not understand why it is happening. Of course I can stick with the DatetimeIndex, but I am confused anyway. Why does the resampling lead to a different result? Was I wrong to transform the index? Isn't a time span of 5 min a reasonable period?
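One way to inspect which time span each period covers is sketched below. It rebuilds the series from above (using the alias 'min', which is equivalent to 'T' and is the spelling newer pandas versions prefer) and compares the start of a period obtained by direct conversion with one obtained after shifting the timestamps back by 5 minutes. The name `shifted` and the 5-minute shift are assumptions made for illustration, based on the convention that each entry describes the preceding 5 minutes. This is not a confirmed fix for the resampling result.

```python
import pandas as pd

# Same data as above, written with the 'min' alias instead of 'T'.
idx = pd.date_range(start='2021-11-30 00:00', end='2021-11-30 01:00', freq='5min')
ser = pd.Series(index=idx, data=range(len(idx)))

# Direct conversion: a Period begins at its label, so the entry stamped
# 00:25 becomes a period that starts at 00:25 (it looks forward in time).
p = idx.to_period(freq='5min')[5]
print(p, '->', p.start_time)

# Hypothetical alternative: shift the timestamps back by one step first,
# so that the entry stamped 00:25 is labelled by the period starting at 00:20.
shifted = (idx - pd.Timedelta(minutes=5)).to_period(freq='5min')
q = shifted[5]
print(q, '->', q.start_time)
```

If the direct conversion is used, the period labels describe the 5 minutes after each timestamp rather than the 5 minutes before it. That differs from the meaning the data has with the DatetimeIndex and `closed='right', label='right'`.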