I am trying to get a one-step-ahead forecast from an ARIMA model (using the SARIMAX object) for daily stock market data.
This is my code for the model:
df_train.index = pd.DatetimeIndex(df_train.index).to_period('D')
training_mod = sm.tsa.SARIMAX(df_train, order=model.order)
training_res = training_mod.fit()
The index of df_train is:
PeriodIndex(['2018-01-02', '2018-01-03', '2018-01-04', '2018-01-05',
'2018-01-08', '2018-01-09', '2018-01-10', '2018-01-11',
'2018-01-12', '2018-01-16',
...
'2021-09-14', '2021-09-15', '2021-09-16', '2021-09-17',
'2021-09-20', '2021-09-21', '2021-09-22', '2021-09-23',
'2021-09-24', '2021-09-27'],
dtype='period[D]', name='Date', length=941)
Since the model is fitted to df_train, whose last observation is '2021-09-27', I expect the forecast method with its default arguments to return a forecast for '2021-09-28', given that the data is daily.
The problem is that when I run this line:
training_res.forecast()
it returns a forecast for '2020-07-31' instead:
2020-07-31 0.022581
Freq: D, dtype: float64
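One thing I noticed while checking the numbers: the returned date seems to equal the first index entry plus the number of observations (941), counted in calendar days. The index skips weekends and holidays, so it has far fewer entries than there are calendar days in the span. Below is a minimal sketch of that arithmetic. It uses only the values printed above and the standard library; the variable names are mine, and I am only assuming this is how the label is derived, since I have not confirmed it in the statsmodels source.

```python
from datetime import date, timedelta

# Values taken from the printed index above.
first_obs = date(2018, 1, 2)   # first entry of df_train.index
last_obs = date(2021, 9, 27)   # last entry of df_train.index
n_obs = 941                    # length reported for the PeriodIndex

# Assumption: the forecast label is "first date + n_obs calendar days".
label_if_counted_from_start = first_obs + timedelta(days=n_obs)

# Calendar days actually spanned by the index (inclusive of both ends).
calendar_days_spanned = (last_obs - first_obs).days + 1

print(label_if_counted_from_start)
print(calendar_days_spanned, "calendar days vs", n_obs, "observations")
```

If this is right, the forecast label comes from counting 941 daily periods from the start of the sample rather than from the last date in the index.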
I have tried specifying the number of steps in the forecast method, but the forecasts still start at '2020-07-31':
training_res.forecast(1)
Output:
2020-07-31 0.022581
Freq: D, dtype: float64
training_res.forecast(10)
Output:
2020-07-31 0.022581
2020-08-01 -0.258066
2020-08-02 0.031083
2020-08-03 0.231221
2020-08-04 -0.075070
2020-08-05 -0.197679
2020-08-06 0.108804
2020-08-07 0.160034
2020-08-08 -0.132281
2020-08-09 -0.120677
Freq: D, Name: predicted_mean, dtype: float64
Finally, I have also tried specifying a start date and an end date instead of a forecast horizon, but that raises a different error:
start_date = pd.to_datetime('2021-09-28')
end_date = pd.to_datetime('2021-10-05')
training_res.forecast(start= start_date, end= end_date)
Output:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[172], line 3
1 start_date = pd.to_datetime('2021-09-28')
2 end_date = pd.to_datetime('2021-10-05')
----> 3 training_res.forecast(start= start_date, end= end_date)
File d:\miniconda3\envs\stocks\lib\site-packages\statsmodels\base\wrapper.py:113, in make_wrapper.<locals>.wrapper(self, *args, **kwargs)
111 obj = data.wrap_output(func(results, *args, **kwargs), how[0], how[1:])
112 elif how:
--> 113 obj = data.wrap_output(func(results, *args, **kwargs), how)
114 return obj
File d:\miniconda3\envs\stocks\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py:3442, in MLEResults.forecast(self, steps, **kwargs)
3440 else:
3441 end = steps
-> 3442 return self.predict(start=self.nobs, end=end, **kwargs)
TypeError: statsmodels.tsa.statespace.mlemodel.MLEResults.predict() got multiple values for keyword argument 'start'
I don't see where I am giving multiple values for the start argument.
The same thing happens when I pass in a Period object:
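The only candidate I can see is the last frame of the traceback, where forecast calls self.predict(start=self.nobs, end=end, **kwargs): my own start would arrive through **kwargs while start=self.nobs is also passed explicitly. Here is a toy sketch of that pattern, which reproduces the same message. Both functions are hypothetical stand-ins that I wrote for illustration, not the statsmodels implementation, and the way end is computed here is an assumption.

```python
def predict(start=None, end=None):
    # Hypothetical stand-in for MLEResults.predict.
    return start, end


def forecast(steps=1, **kwargs):
    # Hypothetical stand-in for MLEResults.forecast.
    nobs = 941                 # number of observations in my sample
    end = nobs + steps - 1     # assumption: how the end index is derived
    # Same call shape as the line shown in the traceback.
    return predict(start=nobs, end=end, **kwargs)


try:
    forecast(start="2021-09-28")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If that reading is correct, forecast only accepts a number of steps (or an end point) and always starts right after the sample, which would explain the error but not the wrong date.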
start_date = pd.Period('2021-09-28', freq='D')
end_date = pd.Period('2021-10-05', freq='D')
training_res.forecast(start=start_date, end=end_date)
Output:
TypeError: statsmodels.tsa.statespace.mlemodel.MLEResults.predict() got multiple values for keyword argument 'start'
The same thing also happens when I pass in a string:
training_res.forecast(start= '2021-09-29', end= '2021-10-05')
Output:
TypeError: statsmodels.tsa.statespace.mlemodel.MLEResults.predict() got multiple values for keyword argument 'start'