Please help me understand. I'm reading the sktime documentation's example of WindowSummarizer with exogenous features and y lags:
import pandas as pd
from sktime.transformations.series.summarize import WindowSummarizer
from sktime.datasets import load_airline, load_longley
from sktime.forecasting.naive import NaiveForecaster
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.compose import ForecastingPipeline
from sktime.forecasting.model_selection import temporal_train_test_split
# load the Longley data and build the train/test objects used below
# (these definitions come from the docs example but were missing from my paste)
y, X = load_longley()
y_train, y_test, X_train, X_test = temporal_train_test_split(y, X)
fh = ForecastingHorizon(X_test.index, is_relative=False)
kwargs = {
    "lag_feature": {
        "lag": [1],
        "mean": [[1, 3], [3, 6]],
        "std": [[1, 4]],
    }
}
Z_train = pd.concat([X_train, y_train], axis=1)
Z_test = pd.concat([X_test, y_test], axis=1)
pipe = ForecastingPipeline(
    steps=[
        ("a", WindowSummarizer(n_jobs=1, target_cols=["POP", "TOTEMP"])),
        ("b", WindowSummarizer(**kwargs, n_jobs=1, target_cols=["GNP"])),
        ("forecaster", NaiveForecaster(strategy="drift")),
    ]
)
pipe_return = pipe.fit(y_train, Z_train)
y_pred2 = pipe_return.predict(fh=fh, X=Z_test)
I have a few questions:
It works with the test data because the X features are available there. However, for a genuinely future dataset, do I need to supply future values of X myself? A minimal sketch of what I mean is below.
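Here is a hypothetical sketch of the situation I am asking about (refitting on the full data and repeating the last observed row as a placeholder are my own choices, not from the docs):

# refit the same pipeline on all available data, then forecast past its end
Z_full = pd.concat([X, y], axis=1)
pipe_full = pipe.fit(y, Z_full)

# build a "future" exogenous frame -- here I just repeat the last known row,
# since the real future values of POP, GNP, TOTEMP, ... are unknown
future_index = pd.period_range(start=y.index[-1] + 1, periods=3, freq=y.index.freq)
Z_future = pd.concat([Z_full.iloc[[-1]]] * 3)
Z_future.index = future_index

fh_future = ForecastingHorizon(future_index, is_relative=False)
y_future = pipe_full.predict(fh=fh_future, X=Z_future)

Is constructing (or separately forecasting) Z_future like this what is expected?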
I also tried RecursiveTabularRegressionForecaster, feeding in different constant values for X, but the y_pred values did not change. Does this methodology recursively take lags of X as well? Roughly what I tried is sketched below.
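A rough reconstruction of what I tried (the regressor, window_length, and the constant value are illustrative placeholders, not my exact setup):

from sklearn.linear_model import LinearRegression
from sktime.forecasting.compose import make_reduction

# make_reduction with strategy="recursive" builds a RecursiveTabularRegressionForecaster
reducer = make_reduction(LinearRegression(), window_length=3, strategy="recursive")
reducer.fit(y_train, X=X_train)
y_pred_a = reducer.predict(fh=fh, X=X_test)

# same forecast with X replaced by a constant -- in my runs the predictions
# came out identical, which is what prompted this question
X_const = X_test.copy()
X_const[:] = 100.0
y_pred_b = reducer.predict(fh=fh, X=X_const)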