
I am slightly confused. I am attempting my first data science competition (on a website similar to Kaggle). The task is classification, and I have a training set and a test set: a very classic setup.

I analyzed the data and created some new features from the training set (around 4 additional columns). I then split the training set 70/30 to obtain a "new" training set (70% of the original training set) and a "new" test set (the remaining 30%). I trained my model on the "new" training set (using XGBoost), tested it on the "new" test set, and obtained 71% accuracy.
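To make the setup concrete, here is a minimal sketch of the 70/30 split and accuracy check. The data frame `full_train`, its columns, and the use of `glm` in place of XGBoost are all assumptions for illustration, not my actual data or model.

```r
set.seed(1)

# Hypothetical stand-in for the competition training set.
full_train <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
full_train$label <- as.integer(full_train$x1 - full_train$x2 + rnorm(200) > 0)

# Draw 70% of the row indices at random for the "new" training set;
# the remaining 30% become the "new" test set.
n <- nrow(full_train)
train_idx <- sample(n, size = floor(0.7 * n))
new_train <- full_train[train_idx, ]
new_test  <- full_train[-train_idx, ]

# glm is only a stand-in for the XGBoost model here.
fit <- glm(label ~ x1 + x2, data = new_train, family = binomial())

# Turn predicted probabilities into class labels and compute accuracy.
predicted <- as.integer(predict(fit, newdata = new_test, type = "response") > 0.5)
accuracy <- mean(predicted == new_test$label)
```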

Now my problem is that I would like to run my model on the initial test set provided for the competition. But when I try my usual:

prediction <- predict(xgboost_3_cv_3, test_set_values)

It gives me an error like: Error in eval(predvars, data, env) : object etc., which basically tells me that the new features are not recognized because they are not in the initial test set "test_set_values". So I am not able to submit my predictions. What am I missing? Thank you.
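A minimal sketch of the situation, under stated assumptions: the toy data, the column names, and the helper `add_features` are hypothetical, and `glm` stands in for the XGBoost model because its formula interface raises the same kind of error when a column is missing from the new data. It suggests that the engineered columns would need to be built on the competition test set with the same code before calling `predict`.

```r
set.seed(42)

# Hypothetical toy data standing in for the competition files.
train_set <- data.frame(a = rnorm(100), b = rnorm(100))
train_set$label <- as.integer(train_set$a + train_set$b + rnorm(100) > 0)
test_set_values <- data.frame(a = rnorm(20), b = rnorm(20))

# Hypothetical helper: all engineered columns are created in one place,
# so the same transformation can be applied to any data frame with a and b.
add_features <- function(df) {
  df$a_times_b <- df$a * df$b
  df$a_squared <- df$a^2
  df
}

train_fe <- add_features(train_set)

# glm is a stand-in for the XGBoost model (assumption).
model <- glm(label ~ a + b + a_times_b + a_squared,
             data = train_fe, family = binomial())

# Predicting on the raw test set fails: the engineered columns are missing.
raw_attempt <- tryCatch(
  predict(model, newdata = test_set_values, type = "response"),
  error = function(e) conditionMessage(e)
)

# Applying the same helper to the test set first makes the columns available.
test_fe <- add_features(test_set_values)
prediction <- predict(model, newdata = test_fe, type = "response")
```

Keeping the feature engineering in a single function is a design choice that avoids the training and test columns drifting apart; features that depend on statistics of the training set (means, encodings) would need those statistics stored and reused rather than recomputed on the test set.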

ML_Enthousiast
