Model Fits but the Predictions Fail

Fitting an ARIMA(4,0,13) model to the data shown in the first picture below yields flat predictions (shown in the second picture below). I am not sure why the model can fit the data in the training set but then predict essentially nothing afterwards. I found another question here which said I needed to add a seasonal component; I detail my experience with that below.

The Time Series (zoomed in)

The Predictions*

* The predictions plot shows all the training data, followed by the validation data after the orange vertical line. The training fit is rounded to integers (non-integer values are not possible in this dataset). Note that the prediction is just flat and then dies.

Problem Definition

I have 15-minute interval data and want to fit a SARIMA model to it. It has a daily seasonality, with each day defined as 7am-9pm, so the seasonal period is 4 * 15 = 60 (four 15-minute periods per hour * 15 hours). I first tested for stationarity with the Augmented Dickey-Fuller test. The series passed (the test indicated stationarity), so I went on to analyze the ACF and PACF to determine the SARIMA parameters.
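The period arithmetic above can be checked with a small sketch. The function name and the "HH:MM" window strings are my own assumptions for illustration, not part of the dataset. Note that a window from 07:00 to 21:00 holds 14 hours (56 intervals), while a period of 60 corresponds to 15 hours, for example if the last interval ends at 22:00.

```python
from datetime import datetime, timedelta


def periods_per_day(start, end, step_minutes=15):
    """Count whole intervals of `step_minutes` between two 'HH:MM' times."""
    t0 = datetime.strptime(start, "%H:%M")
    t1 = datetime.strptime(end, "%H:%M")
    # timedelta // timedelta yields an integer number of whole steps
    return (t1 - t0) // timedelta(minutes=step_minutes)


# Hypothetical window boundaries, used only to compare the two readings
fourteen_hours = periods_per_day("07:00", "21:00")
fifteen_hours = periods_per_day("07:00", "22:00")
print(fourteen_hours, fifteen_hours)
```

Whichever count matches the number of rows actually recorded per day is the value to use as the seasonal period.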

Parameter Determination

(p,d,q)

ACF & PACF on Original Data

From this, I see there is no unit root (the ACF and PACF values do not sum to 1), and that we need to difference the series since there is no sharp cutoff in the ACF.

ACF & PACF on Differenced Data

From this, I see it is slightly overdifferenced, so I may want to try no integrated term and add an AR term at 15 (the point where the ACF in the original plot enters the bands). I also add an MA term here.
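For reference, here is a minimal sketch of what "entering the bands" means numerically. It computes the sample autocorrelation by hand and compares it with the usual approximate 95% white-noise band of 1.96 / sqrt(n). The function names are mine, and the plotting library may compute its bands slightly differently.

```python
import math


def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)  # lag-0 sum of squares
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def white_noise_band(n, z=1.96):
    """Approximate 95% band for the ACF of white noise with n observations."""
    return z / math.sqrt(n)


def first_lag_inside_band(x, max_lag):
    """First lag >= 1 whose autocorrelation falls inside the band, else None."""
    band = white_noise_band(len(x))
    acf = sample_acf(x, max_lag)
    for k in range(1, max_lag + 1):
        if abs(acf[k]) < band:
            return k
    return None
```

Calling `first_lag_inside_band(series, 100)` on the undifferenced series would give the lag I am reading off the plot by eye.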

(P,D,Q)s

I now look for the seasonal component. I do a seasonal difference of period 60 since that's where the spike is in the plots.
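To be explicit about what I mean by a seasonal difference, here is a small sketch on a toy series (not my data; the pattern and period of 4 are made up for illustration). A seasonal difference with period s subtracts the value s steps earlier, and a regular first difference can be applied on top of it.

```python
def difference(x, lag=1):
    """Return x[t] - x[t - lag] for every t where both values exist."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


# Toy series: linear trend plus a repeating pattern of period 4 (illustrative only)
pattern = [0, 5, 2, 7]
toy = [t + pattern[t % 4] for t in range(12)]

seasonal = difference(toy, lag=4)     # removes the repeating pattern
both = difference(seasonal, lag=1)    # then removes what is left of the trend
print(seasonal)
print(both)
```

For my series the same call would use `lag=60`, which also means the first 60 observations are lost to the differencing.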

Seasonal difference

Seeing this, I should add 2 MA terms to the seasonal component (Rules 13 and 7 from here). But the site also says not to use more than 1 seasonal MA term usually, so I leave it at 1.

Model

This leaves me with a SARIMA(0,1,1)(0,1,1,60) model. However, I run out of memory trying to fit this model (Python, using the statsmodels SARIMA function).
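A rough back-of-the-envelope sketch of why memory might blow up. It assumes the state-space layout that, as far as I understand, statsmodels uses by default: the differencing is kept inside the state vector, and a state covariance matrix is stored for every time step. The formulas and the observation count below are my assumptions, not something I have verified against the library.

```python
def sarima_state_dim(p, d, q, P, D, Q, s):
    """Assumed state dimension when differencing is part of the state vector."""
    k_diff = d + s * D                       # states carrying the differencing
    k_order = max(p + s * P, q + s * Q + 1)  # states carrying the ARMA part
    return k_diff + k_order


def covariance_bytes(n_obs, k_states, bytes_per_float=8):
    """Bytes for one k_states x k_states float matrix per observation."""
    return n_obs * k_states * k_states * bytes_per_float


k = sarima_state_dim(0, 1, 1, 0, 1, 1, 60)
n_obs = 10_000  # hypothetical series length, not my actual row count
print(k, covariance_bytes(n_obs, k) / 1e9, "GB per stored covariance array")
```

If this reasoning holds, the cost grows with the square of the seasonal period, so a period of 60 is far more expensive than the usual 4 or 12.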

Question

Did I choose the parameters correctly? Is this data even fittable by ARIMA/SARIMA? And lastly, would the 60 period SARIMA actually work and I just need to find a way to run it on a different machine?

I guess the tl;dr question is: what am I doing wrong?

Feel free to go into detail. I want to become well informed with time series and so more information is better!

SgtRevan

1 Answer

To select the best-fitting model, compare candidate models by AIC/BIC and pick the one with the lowest value. Try different combinations of P and Q.
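A minimal sketch of that comparison, using the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The helper names are mine, and the log-likelihood values in the usage example are made up for illustration. In practice they would come from each fitted model.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik


def best_model(scores):
    """Name of the candidate with the lowest criterion value."""
    return min(scores, key=scores.get)


# Hypothetical (log-likelihood, parameter count) pairs, illustrative only
fits = {"P=0,Q=1": (-1502.3, 3), "P=1,Q=1": (-1501.9, 4)}
scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
print(best_model(scores))
```

Both criteria penalize extra parameters, with BIC penalizing them more heavily once n is moderately large.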

Further, the model normally follows the rule: p + d + q + P + D + Q < 6
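One way to apply this rule is to list the candidate orders up front and only fit those. The sketch below does that. The upper bounds on each term are arbitrary defaults I picked for illustration, and the function name is hypothetical.

```python
from itertools import product


def candidate_orders(max_pq=2, max_d=1, max_PQ=1, max_D=1, limit=6):
    """List ((p, d, q), (P, D, Q)) pairs with p + d + q + P + D + Q < limit."""
    grid = product(
        range(max_pq + 1),  # p
        range(max_d + 1),   # d
        range(max_pq + 1),  # q
        range(max_PQ + 1),  # P
        range(max_D + 1),   # D
        range(max_PQ + 1),  # Q
    )
    return [
        ((p, d, q), (P, D, Q))
        for p, d, q, P, D, Q in grid
        if p + d + q + P + D + Q < limit
    ]


orders = candidate_orders()
print(len(orders), "candidates to compare by AIC/BIC")
```

Each pair can then be fitted in turn and scored, keeping the one with the lowest criterion value.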

BR A.

Andy