I'd like to get a time series with a fixed set of dates in the index. I thought that resample with freq and origin='epoch' would do the trick, but it seems that I'm using this method the wrong way. Here's an example showing that origin='epoch' does not seem to work.

import pandas as pd

dates = pd.date_range('2022-01-01', '2022-02-01', freq="1D")
freq = '2W-MON'
vals = range(len(dates))
print(
    pd.Series(vals, index=dates)
    .resample(freq, origin="epoch", convention="end")
    .sum()
    .to_markdown()
)
|                     |   0 |
|:--------------------|----:|
| 2022-01-03 00:00:00 |   3 |
| 2022-01-17 00:00:00 | 133 |
| 2022-01-31 00:00:00 | 329 |
| 2022-02-14 00:00:00 |  31 |

If I change the first date in the series to anything after "2022-01-03", I get a different set of dates in the result.

dates = pd.date_range('2022-01-04', '2022-02-01', freq="1D")
freq = '2W-MON'
vals = range(len(dates))
print(
    pd.Series(vals, index=dates)
    .resample(freq, origin="epoch", convention="end")
    .sum()
    .to_markdown()
)
|                     |   0 |
|:--------------------|----:|
| 2022-01-10 00:00:00 |  21 |
| 2022-01-24 00:00:00 | 189 |
| 2022-02-07 00:00:00 | 196 |

I'd expect that with freq='2W-MON' and origin='epoch', both examples would end up on the same grid of dates (so both should contain either 2022-01-10 or 2022-01-03).

Is there an elegant way of forcing pandas to actually use origin="epoch"?

Grzegorz Rut
  • Indeed, it seems this frequency (and other multiple-week frequencies) does not work as you expect with origin set. A workaround is to pick any Monday as the origin, use 14D as the frequency, and then, if really needed, change the frequency to 2W-MON with asfreq. Something like `pd.Series(vals, index=dates).resample('14D', origin='2022-01-03').sum().asfreq('2W-MON')` gives the behavior I understand you are expecting! – Ben.T Aug 24 '22 at 20:29
  • Yup, this one works. Thank you! – Grzegorz Rut Aug 26 '22 at 09:20
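
The workaround from the comments can be sketched as below, applied to both series from the question. This is a minimal sketch, not a definitive fix: the helper names `resample_on_fixed_grid` and `daily_series` are made up for illustration, and the anchor `2022-01-03` is just one arbitrary Monday. The `closed="right"` and `label="right"` arguments are not part of the comment; they are an assumption added here so that the 14-day bins are labelled by their closing Monday, which appears to match what `2W-MON` produced in the first example. As far as I can tell, `origin` only takes effect for fixed-duration frequencies such as `14D`, which would explain why it seemed to be ignored with `2W-MON`.

```python
import pandas as pd


def daily_series(start, end="2022-02-01"):
    """Build the toy daily series from the question (values 0, 1, 2, ...)."""
    dates = pd.date_range(start, end, freq="1D")
    return pd.Series(range(len(dates)), index=dates)


def resample_on_fixed_grid(series, anchor="2022-01-03"):
    """Sum `series` into 14-day bins aligned to `anchor` (assumed to be a Monday)."""
    return (
        series
        # A fixed-duration frequency lets `origin` pin the bin edges.
        .resample("14D", origin=anchor, closed="right", label="right")
        .sum()
        # Optional: relabel the index frequency as 2W-MON.
        .asfreq("2W-MON")
    )


first = resample_on_fixed_grid(daily_series("2022-01-01"))
second = resample_on_fixed_grid(daily_series("2022-01-04"))

print(first)
print(second)

# Every label in the second result should also appear in the first,
# i.e. both series land on the same two-week Monday grid.
print(second.index.isin(first.index).all())
```

Note that the second result can still be shorter than the first: a bin only shows up if the series has data reaching into it, so the grid is shared but the first label may differ.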
