1

I would like to mark the days in my time series (data from China) in an extra column as holiday (boolean True) or non-holiday (boolean False).

I am new to this topic and am currently trying to figure out how to approach the problem.

I have the following days as official Chinese holidays for 2020:

Chinese Holidays 2020

As far as I know, there is no out-of-the-box calendar for China, so I will have to create a custom calendar as follows:

from pandas.tseries.holiday import Holiday, AbstractHolidayCalendar


class ChineseHolidays(AbstractHolidayCalendar):
    rules = [
        Holiday('Chinese New Year', month=1, day=25),
        # Question: how to add more than one day?
        # etc. ...
    ]


cal = ChineseHolidays()

The next step would be to create the Holidays column as follows:

holidays = cal.holidays(start=X['timestamp'].min(), end=X['timestamp'].max())

X = X.assign(Holidays=X['timestamp'].isin(holidays).astype(int))

My questions here are:

1) Is this a proper approach in general?

2) How can I define, in the line Holiday('Chinese New Year', month=1, day=25), that the days off start on the 24th of January and end on the 30th of January? Is there a way to define a range of days off instead of just one day?

Thanks for your help.

Best,

B.

Bab
  • https://github.com/quantopian/trading_calendars – Pygirl Apr 13 '20 at 01:13
  • I suggest you add them to [python-holidays](https://github.com/dr-prodigy/python-holidays). The code is pretty easy to understand, and it's the same library used by [fbprophet](https://facebook.github.io/prophet/docs/seasonality,_holiday_effects,_and_regressors.html). If you need any help, open an issue on GitHub. – rpanai Apr 13 '20 at 02:28
  • Thanks everyone. But for the moment, does anyone know how to add a holiday to the rules with several days off (e.g. below, from the 24th until the 30th instead of only the 25th)? `rules = [Holiday('Chinese New Year', month=1, day=25)]` – Bab Apr 13 '20 at 11:31

2 Answers

0

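Traditional Chinese holidays such as Chinese New Year follow the lunar calendar, so you can use a Python library such as LunarCalendar to convert lunar dates to solar dates:

pip install LunarCalendar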

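from lunarcalendar import Converter, Lunar

# First day of the first lunar month of 2020, i.e. Chinese New Year
l = Lunar(year=2020, month=1, day=1, isleap=False)
# Convert to a solar (Gregorian) date and print it as a datetime.date
print(Converter.Lunar2Solar(l).to_date())

This prints the solar date 2020-01-25.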
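Once the solar date is known, one possible way to cover several days off is to add one fixed-date `Holiday` rule per day. The following is only a sketch: the 24 to 30 January span is taken from the question, while the class name `ChineseHolidays2020` and the constant `SPRING_FESTIVAL_2020` are made up for illustration. Passing `year=` pins each rule to 2020, so other years would need their own dates.

```python
import pandas as pd
from pandas.tseries.holiday import AbstractHolidayCalendar, Holiday

# Assumption: the 2020 break runs from 24 to 30 January, as stated in the question.
SPRING_FESTIVAL_2020 = pd.date_range("2020-01-24", "2020-01-30", freq="D")


class ChineseHolidays2020(AbstractHolidayCalendar):
    # One fixed-date rule per day off; year= restricts each rule to 2020 only.
    rules = [
        Holiday(f"Chinese New Year (day {i + 1})", year=d.year, month=d.month, day=d.day)
        for i, d in enumerate(SPRING_FESTIVAL_2020)
    ]


cal = ChineseHolidays2020()
# DatetimeIndex of all holiday dates that fall inside the requested window.
holidays = cal.holidays(start="2020-01-01", end="2020-12-31")
print(holidays)
```

The resulting `holidays` index can then be passed to `isin` in the same way as in the question.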

MaxxxZ
0

Looks to me like Pandas has a number of different date methods that support periods and repeating dates.

https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html

They also mention using this for holidays, so I suspect this might be what you're looking for.

Example

In [86]: pd.date_range('2018-01-01', '2018-01-05', periods=5)
Out[86]: 
DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05'],
              dtype='datetime64[ns]', freq=None)
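
Applied to the question, a range like this could be used directly to flag the days off. This is a minimal sketch rather than a complete solution: the frame `X` below is hypothetical sample data, and the 24 to 30 January span is the one mentioned in the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's frame X.
X = pd.DataFrame({"timestamp": pd.date_range("2020-01-22", periods=10, freq="D")})

# Assumption: days off run from 24 to 30 January 2020, as in the question.
days_off = pd.date_range("2020-01-24", "2020-01-30", freq="D")

# normalize() sets the time of day to midnight so intraday timestamps still match.
X["holiday"] = X["timestamp"].dt.normalize().isin(days_off)
print(X)
```

The new column is boolean, which matches the True/False marking asked for; several holiday periods could be combined by concatenating their ranges before calling `isin`.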
Petriborg