1

there are many posts on this subject. I went through them and I couldn't find the answer to my question:

I am working on a pandas time series DataFrame. The DataFrame data is in Daily time-frame and I am aggregating it to Weekly time-frame, via the pandas library resample() function, such as below.

daily_df = #daily time series dataframe

def aggregate(daily_df, frequency): 
    weekly_df = daily_df.resample(frequency, on='date').agg({'open':'first','high':'max', 'low':'min','close':'last','volume':'sum'})
    df.reset_index(inplace=True)
    return weekly_df

weekly_df = aggregate(daily_df, 'W-Fri')

The issue I am running into is, some week's the time-series data only contains data from Monday to Thursday, but I don't know how to tell the resample() function to check for that and if it does, to end the week on the Thursday rather than the Friday; "W-Fri".

Bonsaye
  • 11
  • 4
  • could you please add a couple of rows of your dataframe (and code for generating the data)? this way you can be helped better – Hoori M. Jan 20 '21 at 03:50

1 Answers1

0

Since the resample function does not have that feature, we can determine the number of days resampled in a week by adding a flag for the number of days and tallying it.

import yfinance as yf
daily_df = yf.download("AAPL", start="2020-11-01", end="2020-12-31")

def aggregate(daily_df, frequency):
    daily_df.reset_index(inplace=True)
    daily_df['days'] = 1
    weekly_df = daily_df.resample(frequency, on='Date').agg({'Open':'first','High':'max', 'Low':'min','Close':'last','Volume':'sum','days':'count'})
    return weekly_df

weekly_df = aggregate(daily_df, 'W-Fri')
weekly_df

          Open  High     Low     Close    Volume    days
Date                        
2020-11-06  109.110001  119.620003  107.320000  118.690002  609571800   5
2020-11-13  120.500000  121.989998  114.129997  119.260002  589577900   5
2020-11-20  118.919998  120.989998  116.809998  117.339996  389493400   5
2020-11-27  117.180000  117.620003  112.589996  116.589996  365024000   4
2020-12-04  116.970001  123.779999  116.809998  122.250000  543809200   5
2020-12-11  122.309998  125.949997  120.150002  122.410004  452278700   5
2020-12-18  122.599998  129.580002  121.540001  126.660004  621866700   5
2020-12-25  125.019997  134.410004  123.449997  131.970001  433310200   4
2021-01-01  133.990005  138.789993  133.399994  133.720001  341985600   3
r-beginners
  • 31,170
  • 3
  • 14
  • 32
  • That's very neat, thank you. I realized I didn't ask my question correctly. For weeks that are not 5 day weeks, I need that reflected in the "Date" Column. For example, from your above data sample. "       Open High  Low   Close   Volume days Date 2020-11-27 117.180000 117.620003 112.589996 116.589996 365024000 4" Even though, this was a "4" day trading week, the "Date" column ends on the Friday of the week. "2020-11-27". The reason this matters is because I am going back and forth between the daily time frame and the weekly. Continued – Bonsaye Jan 20 '21 at 19:12
  • I am currently using a "week" column that adds the week number of the week to my daily timeframe. Very similar to your "days" column that counts the days of each aggregation of the week. However, the way I'm currently doing it becomes an issue when the dataset I have is more than 1 year because, the week numbers start all over again from the first week of the year to the last. I have no way to distinguish from a certain week say in 2019 vs 2020. I thought maybe I could use the date column to go back and forth between Daily vs Weekly dataframes. Maybe, I can deduct the difference in the "days" – Bonsaye Jan 20 '21 at 19:17
  • column from the 2 digits of the day(YYYY-MM-DD) in the date column and reset the index with that new data for the date column and in that way, I will have weeks that are shorter than 5 trading days reflect in the date column. As I am typing this, I think it might be easier for me to just shorten the dataframe every time I pass it from daily to weekly, so that way it will never exceed a certain number of indexes ( equal to a year) hmmm. Thank you for your input, it has made me think. – Bonsaye Jan 20 '21 at 19:20
  • If I wanted to keep track of week numbers, I would create a new column (yy-weeknum) with the year and week number concatenated, and then group and aggregate the year and week number by slicing the string according to the purpose. If my answer is acceptable, please accept my response. – r-beginners Jan 21 '21 at 02:14