I am getting slightly confused. I am taking part in a data science competition (my first one, on a website similar to Kaggle). The task is a classification, and I have a training set and a test set. Very classic.
I analyzed the data and created some new features on the training set (around 4 additional columns). Then I split the training set 70/30 to get a "new" training set (70% of the original training set) and a "new" test set (the remaining 30%). I trained my model on the "new" training set (using XGBoost), tested it on the "new" test set, and got 71% accuracy.
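To make the setup concrete, here is a minimal sketch of that workflow in base R. It is not my actual code: the toy data frame, the column names, and the `add_features` helper are all made up for illustration, and it adds only two columns instead of my real ones.

```r
# Hypothetical stand-in for the competition training data (not the real data)
set.seed(42)
train_set_values <- data.frame(
  x1 = rnorm(100),
  x2 = runif(100),
  label = factor(sample(c("a", "b"), 100, replace = TRUE))
)

# Hypothetical feature step: adds new columns derived from existing ones
add_features <- function(df) {
  df$x1_squared <- df$x1^2
  df$x1_times_x2 <- df$x1 * df$x2
  df
}

# The new features are created on the training set only
train_fe <- add_features(train_set_values)

# 70/30 split of the original training set
idx <- sample(seq_len(nrow(train_fe)), size = round(0.7 * nrow(train_fe)))
new_train <- train_fe[idx, ]   # used to fit the model
new_test <- train_fe[-idx, ]   # held out to estimate accuracy
```

Both `new_train` and `new_test` come from `train_fe`, so both contain the extra columns.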
Now my problem is that I would like to run my model on the initial test set provided for the competition. But when I try my usual:
prediction <- predict(xgboost_3_cv_3, test_set_values)
It gives me an error like: Error in eval(predvars, data, env) : object etc. which basically tells me that the new features are not recognized because they are not in the initial test set "test_set_values". So I am not able to submit my prediction... What am I missing? Thank you.
ML_Enthousiast