I have a trained regression model that predicts house prices. It was trained on a standardized dataset (StandardScaler from sklearn). How do I scale the model's input (a single example) in a different Python program? I can't fit a StandardScaler on the input, because all features would be reduced to 0 (MinMaxScaler doesn't work either; I also tried saving and loading the scaler from the training script, which didn't work). So, how can I scale my input so that the features won't all be 0, allowing the model to predict the price correctly?
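For reference, the all-zeros symptom can be reproduced with a minimal sketch (the feature values are made up): fitting a fresh scaler on a single example means each feature's mean is the value itself, so everything maps to 0.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

x = np.array([[1300.0, 3.0]])  # one house: area, bedrooms (hypothetical values)

# Re-fitting on the single example centers each feature on itself,
# so every feature comes out as 0 -- the symptom described above.
print(StandardScaler().fit_transform(x))  # → [[0. 0.]]
```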
-
Simple answer: you can't. At least not without the rest of your data. The scaling statistics come from the training data, so you would have to apply the same `StandardScaler` that was fitted on your training data to the single example as well – Quastiat Oct 10 '19 at 16:44
-
Yeah, so the answer to my problem is to use `fit` and then `transform` in StandardScaler, rather than `fit_transform` right away – Kojimba Oct 13 '19 at 12:46
1 Answer
What you've described is a contradiction in terms. Scaling refers to a range of data; a single datum does not have a "range"; it's a point.
What you seem to be asking is how to scale the input data with the same transformation you applied when you trained. The answer here is straightforward again: you have to reuse the exact transformation fitted during training. Standard practice is to persist the fitted scaler alongside the model; if you didn't do that, and you didn't make any note of the transformation's coefficients (the per-feature mean and standard deviation), then you do not have the information needed to apply the same transformation to future input -- in short, your trained model isn't particularly useful.
You could try to recover the coefficients by re-fitting the scaler on the original training set and saving the resulting fitted scaler (or its mean and scale parameters). Then you could apply that same transformation to your input examples.
