I am using nearest-neighbor regression (KNeighborsRegressor) from scikit-learn in Python, with n_neighbors set to 20. I trained the model and then saved it with this code:
import pickle
from sklearn import neighbors

knn = neighbors.KNeighborsRegressor(n_neighbors, weights='uniform')
knn.fit(trainInputs, trainOutputs)
filename = "KNN_model_%d_%d.sav" % (n_neighbors, windowSize)
with open(filename, 'wb') as f:
    pickle.dump(knn, f)
Now I am trying to load the model and predict the output value for a new input using this method:
filename = 'KNN_model_20_720.sav'
loaded_knn_model = pickle.load(open(filename, 'rb'))
nextPrediction = loaded_knn_model.predict(data_pred_input_window)
However, when I do this, I get this error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-bc1f744a44b3> in <module>()
26 filename = 'KNN_model_20_720_Solar11months.sav'
27 loaded_knn_model = pickle.load(open(filename, 'rb'))
---> 28 nextPrediction = loaded_knn_model.predict(data_pred_input_window)
29
30 print(nextPrediction)
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\neighbors\regression.py in predict(self, X)
142 X = check_array(X, accept_sparse='csr')
143
--> 144 neigh_dist, neigh_ind = self.kneighbors(X)
145
146 weights = _get_weights(neigh_dist, self.weights)
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\neighbors\base.py in kneighbors(self, X, n_neighbors, return_distance)
341 "Expected n_neighbors <= n_samples, "
342 " but n_samples = %d, n_neighbors = %d" %
--> 343 (train_size, n_neighbors)
344 )
345 n_samples, _ = X.shape
ValueError: Expected n_neighbors <= n_samples, but n_samples = 1, n_neighbors = 20
I don't understand why this happens. I know that I am passing only one input sample to predict, but I assumed that would be fine, because the saved model should contain the historical (training) data that the nearest-neighbor search runs against. How can I resolve this issue?
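To narrow this down, here is a minimal sketch with synthetic data. The random arrays, the window_size value, and the fit_and_roundtrip helper are all made up for illustration and are not my real data or code. It assumes that n_samples in the message counts the rows the model was fitted on (the kneighbors frame in the traceback formats the message with train_size) rather than the rows passed to predict. It also pickles in memory instead of writing a .sav file. If that assumption holds, a model fitted on many rows should predict a single new window without complaint, while a model fitted on a single row should raise a ValueError like the one above:

```python
import pickle

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

n_neighbors = 20
window_size = 720  # hypothetical: taken from the "720" in the file name

rng = np.random.default_rng(0)  # synthetic data only, not the real inputs


def fit_and_roundtrip(train_inputs, train_outputs):
    """Fit a KNN regressor, then pickle and unpickle it in memory."""
    knn = KNeighborsRegressor(n_neighbors=n_neighbors, weights="uniform")
    knn.fit(train_inputs, train_outputs)
    return pickle.loads(pickle.dumps(knn))


# One new input window, shaped (1, n_features) as predict expects.
new_window = rng.random((1, window_size))

# Case A: 100 training rows, each with window_size features.
good = fit_and_roundtrip(rng.random((100, window_size)), rng.random(100))
good_prediction = good.predict(new_window)
print("prediction shape:", good_prediction.shape)

# Case B: the training array holds only ONE row, so there are fewer
# fitted samples than n_neighbors.
bad = fit_and_roundtrip(rng.random((1, window_size)), rng.random(1))
try:
    bad.predict(new_window)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print("error with one training row:", error_message)
```

If Case B matches what I am seeing, then the pickled model did keep its training data, and the thing to check would be trainInputs.shape just before knn.fit: it should be (number_of_samples, number_of_features), with at least 20 rows.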