I am given two datasets:
First, the training data with a known class (target).
Second, the test data with no class (no target).
I split the training data into a train set and a validation set. I oversample only the train set, fit the model on it, and evaluate it on the validation set, which is not oversampled.
It is an imbalanced dataset.
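The split-then-oversample step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses plain random oversampling (duplicating minority rows) as a dependency-free stand-in for SMOTE, the helper names `stratified_split` and `random_oversample` are hypothetical, and the toy data is made up. The point it shows is that only the train part is resampled, while the validation part keeps its original class balance.

```python
import random
from collections import Counter


def stratified_split(rows, labels, validation_fraction=0.25, seed=0):
    """Split per class so both parts keep roughly the original class ratio."""
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, label in enumerate(labels) if label == cls]
        rng.shuffle(idx)
        n_val = int(len(idx) * validation_fraction)
        val_idx.extend(idx[:n_val])
        train_idx.extend(idx[n_val:])
    X_train = [rows[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    X_val = [rows[i] for i in val_idx]
    y_val = [labels[i] for i in val_idx]
    return X_train, y_train, X_val, y_val


def random_oversample(rows, labels, seed=0):
    """Duplicate random rows of the smaller classes until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, label in zip(rows, labels) if label == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Made-up imbalanced toy data: 90 rows of class 0, 10 rows of class 1.
X = [[i, i % 7] for i in range(100)]
Y = [1 if i % 10 == 0 else 0 for i in range(100)]

X_train, Y_train, X_val, Y_val = stratified_split(X, Y)

# Oversample the train part only; the validation part is left untouched.
X_train_over, Y_train_over = random_oversample(X_train, Y_train)

print("train before:", Counter(Y_train))
print("train after: ", Counter(Y_train_over))
print("validation:  ", Counter(Y_val))
```

With a real library, the `random_oversample` call would be replaced by whatever resampler is in use; the model is then fitted on the oversampled train part and scored on `X_val`, `Y_val`.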
After picking the best model, should I refit it on my entire labelled dataset (train plus validation, without oversampling) before making the final prediction on the test (unseen) data?
from lightgbm import LGBMClassifier

Model = LGBMClassifier()
Model.fit(X, Y)  # X, Y: the entire labelled data, not oversampled
Model.predict(test)
Or should I fit it only on the oversampled train set?
Model = LGBMClassifier()
Model.fit(X_train_smote, Y_train_smote)  # train set after SMOTE oversampling
Model.predict(test)