
I want to divide daily data into 5 groups, each starting on a different day and sampled at a fixed frequency of 5 business days. In other words, all the Mondays go together, all the Tuesdays go together, and so on. I tried the resample function:

df1 = df.resample('5B').first()
df2 = df.resample('5B', offset=1).first()
df3 = df.resample('5B', offset=2).first()

I was expecting that df1 starts from, let's say, 2000-01-03, df2 starts from 2000-01-04 and df3 starts from 2000-01-05. But the result shows that both df2 and df3 start from 2000-01-03. Is my understanding of offset wrong?

Lei Hao

1 Answer


I'm assuming a DataFrame whose index holds the dates as a datetime type (a DatetimeIndex). For instance:

df = pd.DataFrame({'col': range(32)}, index=pd.date_range('2000-01-03', '2000-02-03'))

If you want to split your data by weekday, use the index's weekday attribute (df.index.weekday, where 0 is Monday and 6 is Sunday) with groupby in a dictionary comprehension (or in a loop, if you want to save each group to a file):

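# Group rows by the weekday of the index (0 = Monday ... 6 = Sunday).
# i < 6 keeps Monday to Saturday; use i < 5 for business days only.
dfs = {f'df{i+1}': d
       for i, d in df.groupby(df.index.weekday)
       if i < 6}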
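Since the question asks for five business-day groups, here is a minimal self-contained sketch of the same approach restricted to Monday through Friday. The name `business_dfs` and the `i < 5` cutoff are my own choices for illustration, not part of the snippet above.

```python
import pandas as pd

# Example frame: 32 consecutive calendar days, starting on Monday 2000-01-03.
df = pd.DataFrame({'col': range(32)},
                  index=pd.date_range('2000-01-03', '2000-02-03'))

# weekday is 0 for Monday up to 6 for Sunday; keep only 0-4 (Monday-Friday).
# `business_dfs` is a hypothetical name chosen for this sketch.
business_dfs = {f'df{i+1}': d
                for i, d in df.groupby(df.index.weekday)
                if i < 5}

print(list(business_dfs))          # the dictionary keys, one per business day
print(business_dfs['df2'].head())  # the Tuesday rows
```

Each value is an ordinary DataFrame, so `business_dfs['df1']` holds the Mondays, `business_dfs['df2']` the Tuesdays, and so on.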

Example output:

{'df1':             col
 2000-01-03    0
 2000-01-10    7
 2000-01-17   14
 2000-01-24   21
 2000-01-31   28,
 'df2':             col
 2000-01-04    1
 2000-01-11    8
 2000-01-18   15
 2000-01-25   22
 2000-02-01   29,
 'df3':             col
 2000-01-05    2
 2000-01-12    9
 2000-01-19   16
 2000-01-26   23
 2000-02-02   30,
 'df4':             col
 2000-01-06    3
 2000-01-13   10
 2000-01-20   17
 2000-01-27   24
 2000-02-03   31,
 'df5':             col
 2000-01-07    4
 2000-01-14   11
 2000-01-21   18
 2000-01-28   25,
 'df6':             col
 2000-01-08    5
 2000-01-15   12
 2000-01-22   19
 2000-01-29   26}
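For the loop variant mentioned earlier (saving each group to a file), one possible sketch is below. The output directory, the `df1.csv` style file names, and the restriction to Monday through Friday are all assumptions made for illustration; adapt them to your setup.

```python
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({'col': range(32)},
                  index=pd.date_range('2000-01-03', '2000-02-03'))

# Hypothetical output location: a temporary directory. Replace with a real path.
out_dir = Path(tempfile.mkdtemp())

written = []
for i, d in df.groupby(df.index.weekday):
    if i < 5:  # assumption: keep business days only (0 = Monday ... 4 = Friday)
        path = out_dir / f'df{i+1}.csv'  # file naming is an assumption
        d.to_csv(path)                   # the date index is written as the first column
        written.append(path)

print([p.name for p in written])
```

A file can be read back with `pd.read_csv(path, index_col=0, parse_dates=True)` to recover the date index.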
mozway