
I am trying to create a seasonal ARIMA (SARIMA) model using pmdarima's AutoARIMA. New data will become available over the lifetime of the project, so I need code that automatically finds the best time series model. Unfortunately, my current code seems to be producing garbage:

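import pmdarima as pm
import pandas as pd

# Single-column CSV without a header row; the column is named "Value" on load
train_data = pd.read_csv("test.csv", header=None, names=["Value"])["Value"]

# m=168 is the weekly period for hourly data (24 * 7); trace=True prints every candidate
model = pm.AutoARIMA(seasonal=True, m=168, trace=True)

# Missing values are replaced with 0 before fitting
model.fit(train_data.fillna(0))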

Data file: test.csv

Output so far, after quite some time on a large server:

Performing stepwise search to minimize aic
 ARIMA(2,1,2)(1,0,1)[168] intercept   : AIC=inf, Time=4041.19 sec
 ARIMA(0,1,0)(0,0,0)[168] intercept   : AIC=-35451.160, Time=1.07 sec
 ARIMA(1,1,0)(1,0,0)[168] intercept   : AIC=inf, Time=15118.06 sec
 ARIMA(0,1,1)(0,0,1)[168] intercept   : AIC=-35951.886, Time=3805.77 sec
 ARIMA(0,1,0)(0,0,0)[168]             : AIC=-35453.123, Time=0.56 sec
 ARIMA(0,1,1)(0,0,0)[168] intercept   : AIC=-35723.198, Time=2.69 sec
 ARIMA(0,1,1)(1,0,1)[168] intercept   : AIC=inf, Time=61326.67 sec
 ARIMA(0,1,1)(0,0,2)[168] intercept   : AIC=inf, Time=39971.60 sec
 ARIMA(0,1,1)(1,0,0)[168] intercept   : AIC=-36054.745, Time=4211.60 sec
 ARIMA(0,1,1)(2,0,0)[168] intercept   : AIC=-36344.782, Time=30668.84 sec

The data has two seasonal patterns, one daily and one weekly. Modelling only the daily pattern (m=24) gives sensible results, but the weekly pattern (m=168) tends to produce AIC=inf, as in the output above.
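As a rough way to check the two periods before fitting anything, the sample autocorrelation can be compared at lags 24 and 168. The sketch below is only an illustration: the helper autocorr_at_lag and the synthetic series demo are assumptions made up for this example, not the data from test.csv.

```python
import numpy as np


def autocorr_at_lag(values, lag):
    """Sample autocorrelation of a 1-D series at one positive lag."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()
    denom = float(np.dot(x, x))
    if lag <= 0 or lag >= len(x):
        raise ValueError("lag must be between 1 and len(values) - 1")
    if denom == 0.0:
        raise ValueError("series is constant, autocorrelation is undefined")
    # Lagged cross product divided by the lag-0 sum of squares
    return float(np.dot(x[:-lag], x[lag:]) / denom)


# Synthetic hourly series (assumption): eight weeks with a daily and a weekly cycle
rng = np.random.default_rng(0)
t = np.arange(24 * 7 * 8)
demo = (
    np.sin(2 * np.pi * t / 24)
    + 0.5 * np.sin(2 * np.pi * t / 168)
    + 0.1 * rng.standard_normal(t.size)
)

# Lags 24 and 168 are the candidate seasonal periods; lag 12 is a comparison point
for lag in (12, 24, 168):
    print(lag, round(autocorr_at_lag(demo, lag), 3))
```

With real data, train_data.fillna(0) could be passed in place of demo. Clear peaks at both 24 and 168 would support the assumption of two seasonal periods.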

C Hecht
  • Is it normal that it takes that long? – Tobitor Feb 17 '21 at 12:07
  • The dataset is relatively large, so the long operating time was to be expected. But it could be very different in your case. – C Hecht Feb 18 '21 at 12:50
  • Which machine did you use? I have a very similar dataset that also has a large seasonality. Because of the long computation time I switched to Facebook Prophet, which is far faster. – Tobitor Feb 18 '21 at 13:07
  • The machine has 32 cores with about 64 threads and about 250GB RAM. I will probably also go for fb prophet, but wanted to have SARIMA as a backup option – C Hecht Feb 19 '21 at 12:01

1 Answer


The issue seems to have been that pmdarima times out on a candidate model after some time and reports an AIC of inf in place of the value it never computed. I ended up doing a conventional analysis and choosing a slightly oversized SARIMA model, which takes longer to fit but definitely includes all relevant effects.

C Hecht