I have this df that contains the first and last day of each month and the number of business days in each month, as determined by the Python holidays library:

        PredictionTargetDateEOM  PredictionTargetDateBOM  DayAfterTargetDateEOM  business_days
0       2018-12-31               2018-12-01               2019-01-01             20
1       2019-01-31               2019-01-01               2019-02-01             21
2       2019-02-28               2019-02-01               2019-03-01             20
3       2018-11-30               2018-11-01               2018-12-01             21
4       2018-10-31               2018-10-01               2018-11-01             23
...     ...                      ...                      ...                    ...
172422  2020-10-31               2020-10-01               2020-11-01             22
172423  2020-11-30               2020-11-01               2020-12-01             20
172424  2020-12-31               2020-12-01               2021-01-01             22
172425  2020-09-30               2020-09-01               2020-10-01             21
172426  2020-08-31               2020-08-01               2020-09-01             21

Generated with this code:

predicted_df['PredictionTargetDateBOM'] = predicted_df.apply(lambda x: pd.to_datetime(x['PredictionTargetDateEOM']).replace(day=1), axis = 1) #Get first day of the target month
predicted_df['PredictionTargetDateEOM'] = pd.to_datetime(predicted_df['PredictionTargetDateEOM'])
predicted_df['DayAfterTargetDateEOM'] = predicted_df['PredictionTargetDateEOM'] + timedelta(days=1) #Get the first day of the month after target month. i.e. M+2
predicted_df['business_days'] = predicted_df.apply(
    lambda x: np.busday_count(
        x['PredictionTargetDateBOM'].date(),
        x['DayAfterTargetDateEOM'].date(),
        holidays=[
            list(holidays.US(years=x['PredictionTargetDateBOM'].year).keys())[index]
            for index in [
                list(holidays.US(years=x['PredictionTargetDateBOM'].year).values()).index(item)
                for item in rocket_holiday_including_observed
                if item in list(holidays.US(years=x['PredictionTargetDateBOM'].year).values())
            ]
        ],
    ),
    axis=1,
)  # Count number of business days of the target month

I want to add this column:

predicted_df['business_days_rocket'] = predicted_df.apply(lambda x: np.busday_count(x['PredictionTargetDateBOM'].date(), x['DayAfterTargetDateEOM'].date(), holidays=[list({k: v for k, v in holidays.US(years=x['PredictionTargetDateBOM'].year).items() if v in my_set})]), axis = 1)

Based on this as my_set:

my_list = [
    "New Year's Day",
    "Martin Luther King Jr. Day",
    "Memorial Day",
    "Independence Day",
    "Labor Day",
    "Thanksgiving",
    "Christmas Day",
    "New Year's Day (Observed)",
    "Martin Luther King Jr. Day (Observed)",
    "Memorial Day (Observed)",
    "Independence Day (Observed)",
    "Labor Day (Observed)",
    "Thanksgiving (Observed)",
    "Christmas Day (Observed)",
]

my_set = set(my_list)

But I get ValueError: holidays must be a provided as a one-dimensional array.

I don't understand the error because my list is a one-dimensional array. The output of list({k: v for k, v in holidays.US(years=x['PredictionTargetDateBOM'].year).items() if v in my_set}) is:

[datetime.date(2022, 1, 1),
 datetime.date(2022, 1, 17),
 datetime.date(2022, 5, 30),
 datetime.date(2022, 7, 4),
 datetime.date(2022, 9, 5),
 datetime.date(2022, 11, 24),
 datetime.date(2022, 12, 25),
 datetime.date(2022, 12, 26)]

Which is the same output format-wise as:

holidays=[list(holidays.US(years=x['PredictionTargetDateBOM'].year).keys())
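For reference, a minimal sketch of how the extra pair of square brackets around `list(...)` in the `business_days_rocket` line changes the shape NumPy sees. It hard-codes the 2022 dates printed above instead of calling the `holidays` package, and the names `holiday_dates`, `nested`, `nested_error`, and `flat_count` are made up for the illustration:

```python
import datetime as dt

import numpy as np

# Dates copied from the printed 2022 output in the question (not fetched from `holidays`).
holiday_dates = [
    dt.date(2022, 1, 1),
    dt.date(2022, 1, 17),
    dt.date(2022, 5, 30),
    dt.date(2022, 7, 4),
    dt.date(2022, 9, 5),
    dt.date(2022, 11, 24),
    dt.date(2022, 12, 25),
    dt.date(2022, 12, 26),
]

begin = dt.date(2022, 1, 1)  # first day of the target month
end = dt.date(2022, 2, 1)    # first day of the following month (excluded from the count)

# Wrapping the list in another pair of brackets makes a list that contains one list.
nested = [holiday_dates]
print(np.array(nested, dtype="datetime64[D]").shape)  # (1, 8)

# With the nested list, busday_count rejects the holidays argument.
try:
    np.busday_count(begin, end, holidays=nested)
    nested_error = None
except ValueError as exc:
    nested_error = str(exc)
print(nested_error)

# Passing the flat list directly is accepted.
flat_count = int(np.busday_count(begin, end, holidays=holiday_dates))
print(flat_count)  # 20
```

If the same thing is happening in the real DataFrame, passing `list(...)` (or a list comprehension) directly to `holidays=` without the outer brackets should be enough for `np.busday_count` to accept it.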
Hefe
  • Why not just `list(k for k, v in holidays.US(years=x['PredictionTargetDateBOM'].year).items() if v in my_set)`? You don't need to build a dictionary at all. – Tim Roberts Aug 08 '22 at 18:08
  • I think I have to do that because the `holidays` package outputs a dictionary, otherwise I get a Syntax Error if I try to use normal parentheses. See this question: https://stackoverflow.com/q/73279980/15975987 – Hefe Aug 08 '22 at 18:11
  • Nonsense. What YOUR code does is create a temporary dictionary (`{k:v for ...`) and then convert it to a list, which just extracts the keys. If all you need is the keys, then just keep the key. – Tim Roberts Aug 08 '22 at 18:16
  • I mean `holidays = [k for k,v in holidays.US(years=x['PredictionTargetDateBOM'].year).items() if v in my_set]` – Tim Roberts Aug 08 '22 at 18:32
  • Nice one. Sorry, I made a mistake in my test code and that was why I was getting the attribute error. However, the output of that for me is still the same as what I listed above. A 1-D array of dates, so I believe it would still throw the same Value Error. – Hefe Aug 08 '22 at 18:35
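
The two comprehensions discussed in these comments can be compared on a small stand-in. This is only a sketch: `us_2022` is a hypothetical plain dict playing the role of `holidays.US(years=2022)`, and `my_set` is shortened to three names.

```python
import datetime as dt

# Hypothetical stand-in for holidays.US(years=2022): a plain date -> name mapping.
us_2022 = {
    dt.date(2022, 1, 1): "New Year's Day",
    dt.date(2022, 1, 17): "Martin Luther King Jr. Day",
    dt.date(2022, 2, 21): "Washington's Birthday",
    dt.date(2022, 5, 30): "Memorial Day",
}

# Shortened version of the question's my_set, for illustration only.
my_set = {"New Year's Day", "Martin Luther King Jr. Day", "Memorial Day"}

# Question's form: build a temporary dict, then list() keeps only its keys.
via_dict = list({k: v for k, v in us_2022.items() if v in my_set})

# Form suggested in the comments: keep the keys directly.
via_list = [k for k, v in us_2022.items() if v in my_set]

print(via_dict == via_list)  # True
print(via_list)
```

Both forms give the same flat list of dates, so on this stand-in the dictionary detour affects readability rather than the result.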

0 Answers