I'm trying to predict temperature at 12 UTC tomorrow in 1 location. To forecast, I use a basic linear regression model with the statmodels module. My code is hereafter:
x = ds_main
X = sm.add_constant(x)
y = ds_target_t
model = sm.OLS(y,X,missing='drop')
results = model.fit()
The summary shows that the fit is "good":
But the problem appears when I try to predict values with a new dataset that I consider to be my testset. The latter has the same columns number and the same variables names, but the .predict() function returns an array of NaN, although my testset has values ...
xnew = ts_main
Xnew = sm.add_constant(xnew)
ynewpred = results.predict(Xnew)
I really don't understand where the problem is ...
UPDATE : I think I have an explanation: my Xnew dataframe contains NaN values. Statmodels function .fit() allows to drop missing values (NaN) but not .predict() function. Thus, it returns a NaN values array ...
But this is the "why", but I still don't get the "how" reason to fix it...