I want to optimize the hyperparameters of an SVM with GridSearchCV. But the score of the best estimator is very different from the score I get when I run the SVM directly with the best parameters.
#### Hyperparameter search with GridSearchCV ####
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# X, y are my training data; X_test, y_test the held-out test data.
# c_range is my list of candidate values for C.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("svm", LinearSVC(loss='hinge'))])
param_grid = [{'svm__C': c_range}]
clf = GridSearchCV(pipeline, param_grid=param_grid, cv=5, scoring='accuracy')
clf.fit(X, y)
print('\nBest score: ', clf.best_score_)
#### scale train and test data ####
sc = StandardScaler()
sc.fit(X)
X = sc.transform(X)
X_test = sc.transform(X_test)
#### test best estimator with test data ####
print("Best estimator score: ", clf.best_estimator_.score(X_test, y_test))
#### run SVM with the best found parameter ####
svc = LinearSVC(C=clf.best_params_['svm__C'])
svc.fit(X, y)
print("score with best parameter: ", svc.score(X_test,y_test))
The results are as follows:
Best score: 0.784
Best estimator score: 0.6991
score with best parameter: 0.7968
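In case it helps with diagnosing this: the cross-validation accuracy for every value of C that was tried can be read out of cv_results_. A minimal sketch (the 'svm__C' key follows from my param_grid above):

# mean 5-fold CV accuracy for each candidate C
for c, acc in zip(clf.cv_results_['param_svm__C'],
                  clf.cv_results_['mean_test_score']):
    print('C =', c, '-> CV accuracy:', acc)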
I don't understand why the scores of the best estimator and the SVM are different. Which of these results is the correct test accuracy? Why is the best estimator's score of 0.6991 so much worse? Have I done something wrong?
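One thing that may matter: as far as I understand, clf.best_estimator_ is the whole refitted pipeline, not just the LinearSVC, so it carries its own StandardScaler. A small sketch of how I inspect it (step names taken from my pipeline above):

# best_estimator_ is the full pipeline, refitted on X, y with the best C
best_pipe = clf.best_estimator_
print(best_pipe.named_steps['scaler'])  # the pipeline's internal StandardScaler
print(best_pipe.named_steps['svm'])     # the LinearSVC with the winning C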