I wrote my own MLK Day Holiday definition that is based not on when the holiday was first observed in general, but on when the NYSE first observed it, which was January 1998.
When I ask the Holiday for the dates on which the holiday occurred within a date range, it mostly works: it returns an empty set when no MLK Day falls in the requested range, and the appropriate date when one does. For date ranges that precede the holiday's start_date, it correctly returns the empty set until the range reaches 1995, and then it fails. I cannot figure out why it fails there and not in the other cases where the empty set is the correct answer.
Note: I am still stuck on pandas 0.22.0, running Python 3.
import pandas as pd
from datetime import datetime
from dateutil.relativedelta import MO
from pandas.tseries.holiday import Holiday

__author__ = 'eb'

mlk_rule = Holiday('MLK Day (NYSE Observed)',
                   start_date=datetime(1998, 1, 1), month=1, day=1,
                   offset=pd.DateOffset(weekday=MO(3)))

start = pd.to_datetime('1999-01-17')
end = pd.to_datetime('1999-05-01')
finish = pd.to_datetime('1980-01-01')

while start > finish:
    print(f"{start} - {end}:")
    try:
        dates = mlk_rule.dates(start, end, return_name=True)
    except Exception as e:
        print("\t****** Fail *******")
        print(f"\t{e}")
        break
    print(f"\t{dates}")
    start = start - pd.DateOffset(years=1)
    end = end - pd.DateOffset(years=1)
When run, this results in:
1999-01-17 00:00:00 - 1999-05-01 00:00:00:
1999-01-18 MLK Day (NYSE Observed)
Freq: 52W-MON, dtype: object
1998-01-17 00:00:00 - 1998-05-01 00:00:00:
1998-01-19 MLK Day (NYSE Observed)
Freq: 52W-MON, dtype: object
1997-01-17 00:00:00 - 1997-05-01 00:00:00:
Series([], dtype: object)
1996-01-17 00:00:00 - 1996-05-01 00:00:00:
Series([], dtype: object)
1995-01-17 00:00:00 - 1995-05-01 00:00:00:
****** Fail *******
Must provide freq argument if no data is supplied
What happens with the 1995 range that causes the failure, but does not happen with the same period in 1996 and 1997?
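For reference, here is a minimal pandas-free sketch of the results I expect for each range, using only the standard library. The names `FIRST_NYSE_YEAR`, `third_monday_of_january` and `expected_mlk_dates` are my own illustrative helpers, not part of pandas, and the sketch assumes the rule is simply "third Monday of January, from 1998 onward".

```python
from datetime import date, timedelta

# Assumption taken from the post: the NYSE first observed MLK Day in January 1998.
FIRST_NYSE_YEAR = 1998


def third_monday_of_january(year):
    """Return the third Monday of January for the given year."""
    jan1 = date(year, 1, 1)
    # date.weekday() returns 0 for Monday, so this is the gap to the first Monday.
    days_to_first_monday = (7 - jan1.weekday()) % 7
    return jan1 + timedelta(days=days_to_first_monday + 14)


def expected_mlk_dates(start, end):
    """List the NYSE-observed MLK Days that fall within [start, end]."""
    years = range(max(start.year, FIRST_NYSE_YEAR), end.year + 1)
    candidates = (third_monday_of_january(y) for y in years)
    return [d for d in candidates if start <= d <= end]


if __name__ == "__main__":
    for year in range(1999, 1993, -1):
        print(year, expected_mlk_dates(date(year, 1, 17), date(year, 5, 1)))
```

By this check, the 1999 and 1998 ranges should each contain one date, and every earlier range, including 1995, should simply be empty rather than raise.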