0

I have generated this df

PredictionTargetDateEOM PredictionTargetDateBOM DayAfterTargetDateEOM   business_days
0   2018-12-31              2018-12-01              2019-01-01              20
1   2019-01-31              2019-01-01              2019-02-01              21
2   2019-02-28              2019-02-01              2019-03-01              20
3   2018-11-30              2018-11-01              2018-12-01              21
4   2018-10-31              2018-10-01              2018-11-01              23
              ...   ... ... ... ...
172422  2020-10-31          2020-10-01              2020-11-01              22
172423  2020-11-30          2020-11-01              2020-12-01              20
172424  2020-12-31          2020-12-01              2021-01-01              22
172425  2020-09-30          2020-09-01              2020-10-01              21
172426  2020-08-31          2020-08-01              2020-09-01              21

with this code:

predicted_df['PredictionTargetDateBOM'] = predicted_df.apply(lambda x: pd.to_datetime(x['PredictionTargetDateEOM']).replace(day=1), axis = 1) #Get first day of the target month
predicted_df['PredictionTargetDateEOM'] = pd.to_datetime(predicted_df['PredictionTargetDateEOM'])
predicted_df['DayAfterTargetDateEOM'] = predicted_df['PredictionTargetDateEOM'] + timedelta(days=1) #Get the first day of the month after target month. i.e. M+2
predicted_df['business_days_bankers'] = predicted_df.apply(lambda x: np.busday_count(x['PredictionTargetDateBOM'].date(), x['DayAfterTargetDateEOM'].date(), holidays=[list(holidays.US(years=x['PredictionTargetDateBOM'].year).keys())[index] for index in [list(holidays.US(years=x['PredictionTargetDateBOM'].year).values()).index(item) for item in rocket_holiday_including_observed if item in list(holidays.US(years=x['PredictionTargetDateBOM'].year).values())]] ), axis = 1) #Count number of business days of the target month

That counts the number of business days in the month of the PredictionTargetDateEOM column based on Python's holiday package, which is a dictionary that includes the following holidays:

2022-01-01 New Year's Day
2022-01-17 Martin Luther King Jr. Day
2022-02-21 Washington's Birthday
2022-05-30 Memorial Day
2022-06-19 Juneteenth National Independence Day
2022-06-20 Juneteenth National Independence Day (Observed)
2022-07-04 Independence Day
2022-09-05 Labor Day
2022-10-10 Columbus Day
2022-11-11 Veterans Day
2022-11-24 Thanksgiving
2022-12-25 Christmas Day
2022-12-26 Christmas Day (Observed)

However, I would like to replicate the business day count but instead use this list called rocket_holiday as the reference for np.busday_count():

["New Year's Day",
 'Martin Luther King Jr. Day',
 'Memorial Day',
 'Independence Day',
 'Labor Day',
 'Thanksgiving',
 'Christmas Day',
 "New Year's Day (Observed)",
 'Martin Luther King Jr. Day (Observed)',
 'Memorial Day (Observed)',
 'Independence Day (Observed)',
 'Labor Day (Observed)',
 'Thanksgiving (Observed)',
 'Christmas Day (Observed)']

So I've added this line predicted_df['business_days_rocket'] = predicted_df.apply(lambda x: np.busday_count(x['PredictionTargetDateBOM'].date(), x['DayAfterTargetDateEOM'].date(), holidays=[rocket_holiday]), axis = 1)

But I get the ValueError listed in the title of this question. I think the problem is that the first list is a dictionary with the dates of those holidays, so I need to write a function that could generate those dates for the holidays of the second list in a dynamic fashion based on year, and convert that list into a dictionary. Is there a way to do that with Python's holiday package so that I don't have to hard-code the dates in?

Hefe
  • 421
  • 3
  • 23
  • ?? Most of those do not have "Observed" days. New Years is always 1/1, Independence is always 7/4, Thanksgiving is always 4th Thursday. You can certainly filter that list. Check the `pop_named` method. – Tim Roberts Aug 04 '22 at 21:26
  • The "observed" part doesn't really matter. I just need to count the number of business days in the months that those holidays occur in. – Hefe Aug 05 '22 at 13:29
  • If you want to use `apply` to apply to rows, you need to add `axis=1` as a parameter. By default, `apply` applies to columns. – Tim Roberts Aug 05 '22 at 19:24
  • Ah, darn. Right. Thank you, will give that a shot. The only thing is, the second list doesn't have dates attached to the years so I'm not sure how if that will cause any issues. – Hefe Aug 05 '22 at 19:27
  • Well, your substitute holiday list must be the same format as the original. As I said, it has a property to allow you to remove holidays from the list. – Tim Roberts Aug 06 '22 at 00:40

0 Answers0