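The truth is that sm.tsa.statespace.SARIMAX and pm.auto_arima do not operate identically, which is why their model summaries differ: SARIMAX fits exactly the order you give it, while auto_arima runs its own order search and fits the result with its own defaults.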
I have an amateur solution that lets you forget about SARIMAX and use auto_arima instead, and this time it will respect the min and max orders you set. (auto_arima not respecting those bounds is probably why people prefer to loop through SARIMAX, find the best model themselves and ditch auto_arima.) Follow my answer at this link: 'start_p' parameter not taking effect in pmd autoarima
The other "solution" applies if you just want to loop through a few specific orders without running the entire grid search through auto_arima.
In that case, take inspiration from my code below, but be ready to ditch stepwise selection:
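import numpy as np
import pmdarima as pm

# model_df is assumed to be your own DataFrame with a 'Temperature' column
predictions_length = int(input("Please input the number of predictions you want to make: "))
seasonal_period = int(input("Please input the seasonal period: "))

# Placeholder (assumption): set the trend to None, 'c', 't' or 'ct' as your data requires
trend_ = None

# Prepare your specific orders here: ((p, d, q), (P, D, Q, m))
orders_list = [((3, 0, 4), (2, 0, 1, seasonal_period)),
               ((4, 0, 5), (2, 0, 1, seasonal_period)),
               ((3, 0, 1), (2, 0, 4, seasonal_period))]

# Initial setup
best_aic = np.inf
best_order = None
best_mae = np.inf
best_model = None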

# Loop over the orders
for order in orders_list:
    try:
        # Create the auto_arima model object with every order pinned (start == max)
        model = pm.auto_arima(
            y=model_df['Temperature'],
            error_action='ignore',
            X=None,
            start_p=order[0][0],  # Initial value for the autoregressive (AR) order
            max_p=order[0][0],  # Maximum value for the AR order
            d=order[0][1],  # Differencing order for the non-seasonal component
            max_d=order[0][1],  # Maximum value for the differencing order
            start_q=order[0][2],  # Initial value for the moving average (MA) order
            max_q=order[0][2],  # Maximum value for the MA order
            start_P=order[1][0],  # Initial value for the seasonal autoregressive (SAR) order
            max_P=order[1][0],  # Maximum value for the SAR order
            D=order[1][1],  # Differencing order for the seasonal component
            max_D=order[1][1],  # Maximum value for the seasonal differencing order
            start_Q=order[1][2],  # Initial value for the seasonal moving average (SMA) order
            max_Q=order[1][2],  # Maximum value for the SMA order
            max_order=20,  # Maximum total order (p + q + P + Q) of the ARIMA model
            m=seasonal_period,  # Seasonal periodicity (number of periods in each season)
            seasonal=True,  # Whether to consider seasonality in the model
            stationary=True,  # Whether the data is already stationary (d is then set to zero)
            information_criterion='aic',  # Criterion used for model selection
            alpha=0.05,  # Significance level for hypothesis tests
            test='kpss',  # Statistical test used to check for stationarity
            seasonal_test='ocsb',  # Statistical test used to check for seasonality
            # Number of parallel jobs to run during model fitting. If > 1 it will not
            # print status and, in my experience, it can consume more than 85 GB of RAM;
            # if you don't have at least 85 GB of RAM, leave it at 1.
            n_jobs=1,
            start_params=None,  # Starting parameters for model fitting
            trend=trend_,  # Trend component of the time series (trend_ must be defined before the loop)
            method='lbfgs',  # Optimization method used during model fitting, like 'lbfgs' or 'powell'
            maxiter=200,  # Maximum number of iterations for the optimization method
            offset_test_args=None,  # Additional arguments for the differencing (d) test
            seasonal_test_args=None,  # Additional arguments for the seasonal test
            suppress_warnings=True,  # Whether to suppress warnings during model fitting
            trace=False,  # Whether to print status updates during model fitting
            random=False,  # Whether to run a random search over the orders instead of the full grid
            # Whether to use the stepwise approach by Hyndman and Khandakar for parameter
            # selection. NOTE: if stepwise=True then random needs to be set to False.
            stepwise=False,
            random_state=None,  # Random seed used when random is True
            n_fits=1256,  # If random=True, the number of ARIMA models to fit in the random search
            return_valid_fits=False,  # Whether to return all valid fits during selection
            out_of_sample_size=predictions_length,  # Number of observations to hold out for out-of-sample scoring
            scoring='mae',  # Metric used for the out-of-sample score
            scoring_args=None,  # Additional arguments for the scoring metric
            with_intercept=False,  # Whether to include an intercept term in the model
            # Additional keyword arguments passed to the SARIMAX model constructor
            sarimax_kwargs={
                'enforce_stationarity': True,
                'enforce_invertibility': True,
                'concentrate_scale': False,
                'hamilton_representation': False
            })
        aic = model.aic()
        # Forecasts for the periods after the end of the series
        predictions = model.predict(n_periods=predictions_length)
        # MAE on the held-out last `predictions_length` observations, as scored by
        # auto_arima itself because out_of_sample_size > 0 and scoring='mae'
        mae = model.oob()
        print(f'Fitted model with order {order}, AIC: {aic}, MAE: {mae}')
        if aic < best_aic and mae < best_mae:
            best_aic = aic
            best_order = order
            best_mae = mae
            best_model = model
    except Exception as exc:
        print(f"Unable to fit model with order {order}: {exc}")
        continue

if best_model is None:
    print('No model could be fitted; check the orders and the input data.')
else:
    print('Best model:', best_model)
    print('Best order:', best_order)
    print('Best AIC:', best_aic)
    print('Best MAE:', best_mae)
    # Continue with your best model
    best_model.plot_diagnostics()
    print(best_model.summary())
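
One thing to keep in mind: the loop above only replaces the best model when a candidate improves both AIC and MAE, so the winner can depend on the position of each entry in orders_list. Here is a rough sketch of an alternative, assuming you first collect one record per fitted candidate and then rank them on one primary criterion, with the other as a tie-breaker. The rank_candidates helper, the record layout, the seasonal period of 12 and all the scores are hypothetical and only there for illustration:

```python
import math


def rank_candidates(records, primary="aic", secondary="mae"):
    """Sort candidate records by `primary`, breaking ties with `secondary`.

    Each record is a dict with 'order', 'aic' and 'mae' keys (a hypothetical
    layout). Records with a NaN or infinite score are dropped.
    """
    usable = [
        r for r in records
        if math.isfinite(r[primary]) and math.isfinite(r[secondary])
    ]
    return sorted(usable, key=lambda r: (r[primary], r[secondary]))


# Made-up scores for the three orders used above (seasonal period assumed to be 12).
records = [
    {"order": ((3, 0, 4), (2, 0, 1, 12)), "aic": 512.3, "mae": 1.9},
    {"order": ((4, 0, 5), (2, 0, 1, 12)), "aic": 508.7, "mae": 2.4},
    {"order": ((3, 0, 1), (2, 0, 4, 12)), "aic": float("nan"), "mae": 2.0},
]

ranked = rank_candidates(records)
best = ranked[0]
print("Best by AIC:", best["order"], best["aic"], best["mae"])

best_by_mae = rank_candidates(records, primary="mae", secondary="aic")[0]
print("Best by MAE:", best_by_mae["order"])
```

With these made-up numbers the two criteria disagree, so you have to decide which one drives the choice. In the real loop you would append a record (plus the fitted model, if you want to keep it) after each successful fit and rank once at the end.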