I posted this question on CrossValidated some time ago, but no one has been able to answer it yet, so I've decided to post it here as well:

I'm using the auto_arima() function from the Python pmdarima library to determine the best ARIMA model.

The results of one of my models are:

SARIMAX Results                                     
=========================================================================================
Dep. Variable:                                 y   No. Observations:                   96
Model:             SARIMAX(2, 1, 1)x(1, 1, 1, 4)   Log Likelihood                -205.932
Date:                           Mon, 27 Jun 2022   AIC                            423.863
Time:                                   15:29:13   BIC                            438.928
Sample:                                        0   HQIC                           429.941
                                            - 96                                         
Covariance Type:                             opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ar.L1         -0.3863      0.167     -2.316      0.021      -0.713      -0.059
ar.L2          0.4234      0.071      5.957      0.000       0.284       0.563
ma.L1          0.4638      0.181      2.562      0.010       0.109       0.819
ar.S.L4        0.6404      0.176      3.644      0.000       0.296       0.985
ma.S.L4       -0.8840      0.139     -6.352      0.000      -1.157      -0.611
sigma2         5.3147      0.620      8.572      0.000       4.100       6.530
===================================================================================
Ljung-Box (L1) (Q):                   0.01   Jarque-Bera (JB):                82.63
Prob(Q):                              0.92   Prob(JB):                         0.00
Heteroskedasticity (H):               3.56   Skew:                            -1.23
Prob(H) (two-sided):                  0.00   Kurtosis:                         6.97
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

I'm familiar with the Ljung-Box and Jarque-Bera tests shown here, and I know how to interpret the heteroskedasticity test results (null hypothesis: homoskedasticity). However, I don't know which specific test that heteroskedasticity test is.

I didn't find this information on the pmdarima website.

Any idea which specific heteroskedasticity test is reported in the results of pmdarima's auto_arima()?

Thanks!

AlejandroDGR

1 Answer

I stumbled upon this question while looking for an answer to the same thing.

I realize this may not fully answer your specific question (i.e. which test the summary() method reports), but in the example above, Prob(H) (two-sided) appears to match what you get from the corresponding SARIMAX model in statsmodels, specifically from statsmodels.tsa.statespace.sarimax.SARIMAXResults.test_heteroskedasticity() with the parameters 'breakvar' and 'two-sided'.

From: https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAXResults.test_heteroskedasticity.html

Analogous to a Goldfeld-Quandt test. The null hypothesis is of no heteroskedasticity.
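
As far as I can tell from that documentation page, the 'breakvar' statistic is the sum of squared standardized residuals over the last third of the sample divided by the same sum over the first third. Here is a minimal pure-Python sketch of that ratio. The function name `breakvar_statistic` and the `burn` argument are hypothetical names of mine, the input is assumed to be already-standardized residuals, and the p-value step (a comparison against an F distribution) is left out:

```python
def breakvar_statistic(residuals, burn=0):
    """Ratio of sums of squares: last third over first third (after `burn`)."""
    usable = len(residuals) - burn
    # Size of each sub-sample: roughly one third of the usable residuals.
    h = int(round(usable / 3))
    if h < 1:
        raise ValueError("need at least two usable residuals")
    early = residuals[burn:burn + h]   # first third, after the burn-in
    late = residuals[-h:]              # last third
    return sum(e * e for e in late) / sum(e * e for e in early)


# Toy residuals whose spread doubles in the last third of the sample.
resid = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 2.0, -2.0, 2.0]
print(breakvar_statistic(resid))  # 4.0
```

A value above 1 means the late sub-sample is more variable than the early one, which is the direction of the H = 3.56 in the summary from the question.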

An alternative approach, which I am using, is to "manually" create the corresponding SARIMAX model with statsmodels, fit it on the entire dataset, and run the heteroskedasticity test on the fitted results object (the test is computed from that model's residuals; it is a method of the results object, not of the .resid array), like this:

<sarimax-model-results>.test_heteroskedasticity('breakvar', 'two-sided')

In my tests I get the same values as in the pmdarima summary (pmdarima uses statsmodels behind the scenes anyway).

Hope that helps.