This is more a question about theory than about a problem in the code itself. I have the following Pipeline, which is then passed to a GridSearchCV:

my_model = Pipeline([('scaler', MinMaxScaler()), ('model', model())])

cv = GridSearchCV(my_model, parameters, cv=5).fit(X_train, Y_train)

Then, I will use the trained cv with the best hyperparameters to predict on the test set:

cv.predict(X_test)

My questions are as follows:

  1. Will GridSearchCV automatically fit the scaler only on the training part of each fold? That is, does it follow this logic for each fold:

fit the scaler on train_set_fold and transform it (using only the data of that fold's training part) -> train the model -> apply the fitted scaler's transform to test_set_fold

  2. When calling cv.predict, will GridSearchCV automatically apply scaler.transform (learned previously on one of the folds with just the training set) to X_test before making the prediction itself?
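
To make both questions concrete, here is a minimal sketch of how I would check this empirically. The synthetic data, the LogisticRegression estimator, the parameter grid and the explicit KFold splitter are all placeholders I made up for illustration and are not part of my real problem. The sketch compares GridSearchCV's per-fold scores with a hand-written loop that fits the scaler on the training part of each fold only, and then inspects which data the scaler used by cv.predict was fitted on.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Placeholder data and estimator (assumptions for illustration only).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test = X[:150], X[150:]
Y_train, Y_test = y[:150], y[150:]

my_model = Pipeline([("scaler", MinMaxScaler()),
                     ("model", LogisticRegression(max_iter=1000))])
parameters = {"model__C": [0.1, 1.0, 10.0]}  # hypothetical grid

# An explicit, unshuffled splitter so the manual loop sees the same folds.
folds = KFold(n_splits=5)
cv = GridSearchCV(my_model, parameters, cv=folds).fit(X_train, Y_train)

# Question 1: redo the per-fold logic by hand for the first candidate (C=0.1).
manual_scores = []
for train_idx, val_idx in folds.split(X_train):
    scaler = MinMaxScaler().fit(X_train[train_idx])  # fit on the fold's training part only
    clf = LogisticRegression(max_iter=1000, C=0.1)
    clf.fit(scaler.transform(X_train[train_idx]), Y_train[train_idx])
    manual_scores.append(
        clf.score(scaler.transform(X_train[val_idx]), Y_train[val_idx])
    )

grid_scores = [cv.cv_results_[f"split{i}_test_score"][0] for i in range(5)]
fold_scores_match = np.allclose(manual_scores, grid_scores)

# Question 2: look at the scaler inside the estimator that cv.predict uses.
fitted_scaler = cv.best_estimator_.named_steps["scaler"]
scaler_uses_full_train = np.allclose(fitted_scaler.data_min_, X_train.min(axis=0))

manual_pred = cv.best_estimator_.named_steps["model"].predict(
    fitted_scaler.transform(X_test)
)
predict_matches = np.array_equal(cv.predict(X_test), manual_pred)

print(fold_scores_match, scaler_uses_full_train, predict_matches)
```

If fold_scores_match is true, the scaler is being refit inside each fold as described in question 1. If scaler_uses_full_train is true, the scaler applied by cv.predict was fitted on all of X_train (GridSearchCV's default is refit=True) rather than on a single fold, which would mean the assumption in question 2 needs adjusting.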
Nilon