I read the following in a post:

For feature scaling, you learn the means and standard deviations of the training set, and then:

  • Standardize the training set using the training set means and standard deviations.
  • Standardize any test set using the training set means and standard deviations.
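The two steps above can be sketched by hand with NumPy. This is only an illustration: the toy arrays and the variable names (`train_mean`, `train_std`, and so on) are assumptions, not something from the quoted post.

```python
import numpy as np

# Hypothetical toy data: rows are samples, columns are features.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test = np.array([[2.0, 40.0]])

# Learn the statistics from the training set only.
train_mean = X_train.mean(axis=0)
train_std = X_train.std(axis=0)

# Standardize both sets with the same training statistics.
X_train_scaled = (X_train - train_mean) / train_std
X_test_scaled = (X_test - train_mean) / train_std
```

Note that the test set is never used to compute a mean or standard deviation; it is only transformed.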

My question is: after fitting a model on the scaled training data, should I apply the fitted model to scaled or unscaled test data? Thanks!

molbdnilo
  • I think this is off topic for Stack Overflow. See: [help/on-topic], [ask], [tour]. You can find a discussion on relevant Stack Exchange sites [here](https://meta.stackexchange.com/questions/130524/which-stack-exchange-website-for-machine-learning-and-computational-algorithms). – AMC Feb 16 '20 at 23:25

1 Answer


You should apply the model to scaled test data. If you scaled your training data and fitted a model to it, the test set must go through the same preprocessing, using the means and standard deviations learned from the training set. This is standard practice: the model learned its parameters on standardized features, so it expects inputs in that same form at prediction time.

In Python, the process might look as follows:

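from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # learn mean/std from the training set, then scale it
X_test = scaler.transform(X_test)        # reuse the training mean/std; do not refit on the test set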
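To cover the prediction step as well, here is a minimal end-to-end sketch. The toy data and the choice of `LogisticRegression` are assumptions made purely for illustration; any estimator would work the same way. Wrapping the scaler and the model in a scikit-learn pipeline means the test data is scaled with the training statistics automatically whenever you predict.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data, only for illustration.
X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0], [4.0, 400.0]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[1.5, 150.0], [3.5, 350.0]])

# fit() learns the scaler statistics from X_train and trains the model on the scaled data.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)

# predict() first scales X_test with the training statistics, then calls the classifier.
predictions = model.predict(X_test)
```

With a pipeline you pass the raw, unscaled test data to `predict`, and the scaling happens inside. If you scale manually instead, you must call `scaler.transform` on the test data yourself before predicting.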

There is a detailed write-up on this topic in another thread that might be of interest to you.

Jake Tae