I have a dataset of 25544 observations and 7 explanatory variables, which I split into a training set and a test set. I then fit a GLMGam model with BSplines on the training set.
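import statsmodels.api as sm
from sklearn.model_selection import train_test_split
from statsmodels.gam.api import GLMGam, BSplines
y = dfop[['RATIO_OPENING']]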
X = dfop.loc[:, ~dfop.columns.isin(['MED_RATIO_OPENING','RATIO_OPENING','OD_UNDIR_CITY_PAIR','MONTH'])]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
x_spline = X_train[['DISTANCE', 'CITY_POP_A','CITY_POP_B','A_GDP_PPP_1990_2015_5arcmin','A_HDI_1990_2015','B_GDP_PPP_1990_2015_5arcmin','B_HDI_1990_2015']]
bs = BSplines(x_spline, df=[3,3,3,3,3,3,3], degree=[2,2,2,2,2,2,2])
poisson = GLMGam(y_train, x_spline, smoother=bs, family=sm.families.Poisson())
poisson_fit = poisson.fit()
I want to predict the dependent variable on the test set.
X_test = X_test[['DISTANCE', 'CITY_POP_A','CITY_POP_B','A_GDP_PPP_1990_2015_5arcmin','A_HDI_1990_2015','B_GDP_PPP_1990_2015_5arcmin','B_HDI_1990_2015']]
results = poisson_fit.predict(exog=X_test, transform=True)
The last line raises the following error:
ValueError: shapes (6386,7) and (21,) not aligned: 7 (dim 1) != 21 (dim 0)
What is the correct syntax for the prediction?