I'm using auto_arima via pmdarima to fit multiple time series via a groupby. That is to say, I have a pd.DataFrame of stacked time-indexed data, grouped by the column variable, and have successfully applied transform(pm.auto_arima) to each group. The reproducible example finds boring best ARIMA models, but the idea seems to work. I now want to apply .predict() similarly, but cannot get it to play nicely with apply / lambda(x) / their combinations.
The code below works until the # Forecasting - help! section. I'm apparently having trouble catching the correct object in the apply. How might I adapt one of test1, test2, or test3 to get what I want? Or is there some other best-practice construct to consider? Is it better to work across columns (without a melt)? Or via a loop?
Ultimately, I hope that test1, say, is a stacked pd.DataFrame (or at least a pd.Series) with 8 rows: 4 forecasted values for each of the 2 time series in this example, with an identifier column variable (possibly tacked on after the fact).
import pandas as pd
import pmdarima as pm
import itertools
# Get data - this is OK.
url = 'https://raw.githubusercontent.com/nickdcox/learn-airline-delays/main/delays_2018.csv'
keep = ['arr_flights', 'arr_cancelled']
# Setup data - this is OK.
df = pd.read_csv(url, index_col=0)
df.index = pd.to_datetime(df.index, format = "%Y-%m")
df = df[keep]
df = df.sort_index()
df = df.loc['2018']
df = df.groupby(df.index).sum()
df.reset_index(inplace = True)
df = df.melt(id_vars = 'date', value_vars = df.columns.to_list()[1:])
# Fit auto.arima for each time series - this is OK.
fit = df.groupby('variable')['value'].transform(pm.auto_arima).drop_duplicates()
fit = fit.to_frame(name = 'model')
fit['variable'] = keep
fit.reset_index(drop = True, inplace = True)
# Setup forecasts - this is OK.
max_date = df.date.max()
dr = pd.to_datetime(pd.date_range(max_date, periods = 4 + 1, freq = 'MS').tolist()[1:])
yhat = pd.DataFrame(list(itertools.product(keep, dr)), columns = ['variable', 'date'])
yhat.set_index('date', inplace = True)
# Forecasting - help! - Can't get any of these to work.
def predict_fn(obj):
    return(obj.loc[0].predict(4))
predict_fn(fit.loc[fit['variable'] == 'arr_flights']['model']) # Appears to work!
test1 = fit.groupby('variable')['model'].apply(lambda x: x.predict(n_periods = 4)) # Try 1: 'Series' object has no attribute 'predict'.
test2 = fit.groupby('variable')['model'].apply(lambda x: x.loc[0].predict(n_periods = 4)) # Try 2: KeyError
test3 = fit.groupby('variable')['model'].apply(predict_fn) # Try 3: KeyError
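
One possible direction, sketched under assumptions rather than offered as the definitive fix: after the groupby, each group is a one-element Series that keeps its original index label (0 or 1), so .loc[0] can only succeed for the group whose row is labeled 0, while .iloc[0] is positional and works for any group. Since fit holds just one model per row, a plain loop may be the simplest route to the stacked 8-row frame. In the sketch below, StubModel is a hypothetical stand-in for a fitted pmdarima model (it only mimics predict(n_periods)), the column name yhat is made up for illustration, and the last observed date is assumed to be 2018-12-01.

```python
import numpy as np
import pandas as pd


class StubModel:
    """Hypothetical stand-in for a fitted pmdarima model (not part of pmdarima)."""

    def __init__(self, level):
        self.level = level

    def predict(self, n_periods):
        # Flat forecast, only so the example runs without pmdarima or network access.
        return np.full(n_periods, self.level, dtype=float)


# Same layout as `fit` in the question: one model per row, default 0..n-1 index.
fit = pd.DataFrame({
    'model': [StubModel(10.0), StubModel(20.0)],
    'variable': ['arr_flights', 'arr_cancelled'],
})

# Four future month starts, mirroring `dr` in the question.
# Assumption: the last observed date is 2018-12-01.
dr = pd.date_range('2018-12-01', periods=4 + 1, freq='MS')[1:]

# Loop over (variable, model) pairs and stack the forecasts.
pieces = []
for variable, model in zip(fit['variable'], fit['model']):
    pieces.append(pd.DataFrame({
        'variable': variable,  # scalar is broadcast to all 4 rows
        'date': dr,
        # np.asarray copes with predict() returning either an array or a Series
        'yhat': np.asarray(model.predict(n_periods=4)),
    }))

test1 = pd.concat(pieces, ignore_index=True)
print(test1)
```

With real pmdarima models in fit['model'], the loop body should not need to change; only the StubModel rows would be replaced by the fitted objects. If the groupby form is preferred, swapping x.loc[0] for x.iloc[0] in test2 is the analogous change.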