I am working on the "House Prices - Advanced Regression Techniques" machine learning problem. The competition provides training data and test data, and I have to build a model that predicts the house prices for the test set.
Many features in my train and test sets are categorical. I used pd.get_dummies on my train set to make them all numerical. I also dropped some features, cleaned the data, and imputed missing values in the training set.
Once I train my model on this cleaned training data, can I use the same model to predict on the test data? Keep in mind that I did not clean the test set at all: no one-hot encoding, no dropped columns, and none of the cleaning I did on the training set. So I am assuming the model will not be able to evaluate the test data, right?
So do I have to perform the same operations that I did on my training set on my test set as well?
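
To make the question concrete, here is a minimal sketch of the situation as I understand it. The tiny DataFrames, their column names, and their values are made up for illustration and are not the actual competition files. It one-hot encodes the train and test frames separately with pd.get_dummies, shows that the resulting columns differ, and then uses DataFrame.reindex as one possible way to force the test frame onto the training columns.

```python
import pandas as pd

# Toy stand-ins for the competition data; names and values are hypothetical.
train = pd.DataFrame({
    "LotArea": [8450, 9600, 11250],
    "Street": ["Pave", "Grvl", "Pave"],
    "SalePrice": [208500, 181500, 223500],
})
test = pd.DataFrame({
    "LotArea": [11622, 14267],
    "Street": ["Pave", "Pave"],  # "Grvl" never appears in these test rows
})

# One-hot encode the features of each set independently.
X_train = pd.get_dummies(train.drop(columns="SalePrice"))
X_test = pd.get_dummies(test)

# Encoded separately, the two frames do not share the same columns.
print(list(X_train.columns))
print(list(X_test.columns))

# Align the test frame to the training columns: dummy columns missing from
# the test set are added and filled with 0, extra columns would be dropped.
X_test_aligned = X_test.reindex(columns=X_train.columns, fill_value=0)
print(list(X_test_aligned.columns))
```

If this is the right idea, I assume the dropped columns and the imputation would need the same treatment, using values computed from the training set rather than recomputed on the test set.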