I want to optimize the hyperparameters of an SVM with GridSearchCV. But the score of the best estimator is very different from the score I get when I run the SVM directly with the best parameters.
#### Hyperparameter search with GridSearchCV ####
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# X, y are my training data; X_test, y_test the held-out test data.
# c_range is my list of candidate values for C.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("svm", LinearSVC(loss='hinge'))])
param_grid = [{'svm__C': c_range}]
clf = GridSearchCV(pipeline, param_grid=param_grid, cv=5, scoring='accuracy')
clf.fit(X, y)
print('\nBest score: ', clf.best_score_)
#### scale train and test data ####
sc = StandardScaler()
sc.fit(X)
X = sc.transform(X)
X_test = sc.transform(X_test)
#### test best estimator with test data ####
print("Best estimator score: ", clf.best_estimator_.score(X_test, y_test))
#### run SVM with the best found parameter ####
svc = LinearSVC(C=clf.best_params_['svm__C'])
svc.fit(X, y)
print("score with best parameter: ", svc.score(X_test,y_test))
The results are as follows:
Best score: 0.784
Best estimator score: 0.6991
score with best parameter: 0.7968
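In case it helps with diagnosing this: the cross-validation accuracy for every value of C that was tried can be read out of cv_results_. A minimal sketch (the 'svm__C' key follows from my param_grid above):

# mean 5-fold CV accuracy for each candidate C
for c, acc in zip(clf.cv_results_['param_svm__C'],
                  clf.cv_results_['mean_test_score']):
    print('C =', c, '-> CV accuracy:', acc)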
I don't understand why the scores of the best estimator and the SVM are different. Which of these results is the correct test accuracy? Why is the best estimator's score of 0.6991 so much worse? Have I done something wrong?
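One thing that may matter: as far as I understand, clf.best_estimator_ is the whole refitted pipeline, not just the LinearSVC, so it carries its own StandardScaler. A small sketch of how I inspect it (step names taken from my pipeline above):

# best_estimator_ is the full pipeline, refitted on X, y with the best C
best_pipe = clf.best_estimator_
print(best_pipe.named_steps['scaler'])  # the pipeline's internal StandardScaler
print(best_pipe.named_steps['svm'])     # the LinearSVC with the winning C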