I have two columns in pandas dataframe, start date and end date.
I want to know if any holiday is included in the time period of each row.
I want to create a new column to show yes or no.
id Start Date End Date
0 2019-09-27 2019-10-06
1 2019-10-09 2019-10-22
2 2019-05-04 2019-05-15
3 2019-09-18 2019-09-29
I know how to check if a specific date is holiday or not
But how can I check the duration of each row?
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar as calendar
df = pd.DataFrame({'Start Date':['2019-09-27', '2019-10-09', '2019-05-04', '2019-09-18'],
'End Date':['2019-10-06', '2019-10-22', '2019-05-15', '2019-09-29']})
# To check if a specific date is a holiday or not
holidays = calendar().holidays(start = df['Start Date'].min(), end = df['Start Date'].max())
df['Holiday'] = df['Start Date'].isin(holidays)
# This can only check if the start date is a holiday
id Start Date Holiday
0 2019-09-27 False
1 2019-10-09 False
2 2019-05-04 False
3 2019-09-18 False
# But how can I check the duration between df['Start Date'] and df['End Date'] of each row?
I expect that there is another Boolean column to check if each row (id) is including any holiday between the duration of start date and end date
id Start Date End Date Holiday
0 2019-09-27 2019-10-06 True
1 2019-10-09 2019-10-22 False
2 2019-05-04 2019-05-15 True
3 2019-09-18 2019-09-29 False