I'm still confused about the data-validation workflow. As I understand it, when I get a dataset, I first split it into a training set and a test set using train_test_split. Then I run cross_val_score or cross_val_predict on the training set for model selection and hyperparameter tuning. Finally, I evaluate the selected model on the test set to measure its performance. Is my understanding correct, or can I run cross_val_score and cross_val_predict on the entire dataset without using train_test_split?
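The workflow described above can be sketched roughly as follows. The dataset and model here (Iris, logistic regression) are illustrative stand-ins, not part of the question:

```python
# Sketch of the described workflow: hold out a test set, cross-validate
# on the training set only, then evaluate once on the test set.
# Iris and LogisticRegression are assumed examples for illustration.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)

# 1. Split off a test set that is never touched during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# 2. Cross-validate candidate models on the training set only.
model = LogisticRegression(max_iter=1000)
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print("CV accuracy: %.3f +/- %.3f" % (cv_scores.mean(), cv_scores.std()))

# 3. Refit the chosen model on the full training set and report
#    performance on the held-out test set once.
model.fit(X_train, y_train)
print("Test accuracy: %.3f" % model.score(X_test, y_test))
```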

indyspace
Please share as many details as possible when asking a question. What searching have you done before asking this question? Can you share any links you have tried? Which framework are you working with? What is your dev environment? – Manoj Aug 27 '20 at 04:35
1 Answer
Yes, you can use cross_val_score/cross_val_predict for model selection and hyperparameter tuning, and they also let you choose the metric you want to judge the model on (via the scoring parameter). So you basically choose your model and parameters based on the cross-validation results, then see whether the chosen model generalizes well to the test data and, ultimately, real-world data.
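A minimal sketch of that point: compare candidate models by a cross-validated metric on the training set, then check the winner once on the test set. The dataset, the two candidate models, and the F1 metric are all assumed here for illustration:

```python
# Compare candidates with cross_val_score, choosing the metric via
# `scoring`, then evaluate the best one on the held-out test set.
# Breast-cancer data and these two models are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

candidates = {
    "logreg": LogisticRegression(max_iter=5000),
    "tree": DecisionTreeClassifier(random_state=0),
}

# Judge each candidate by cross-validated F1 on the training set only.
cv_f1 = {
    name: cross_val_score(est, X_train, y_train, cv=5, scoring="f1").mean()
    for name, est in candidates.items()
}
for name, score in cv_f1.items():
    print(name, round(score, 3))

# Refit the best candidate on the full training set and evaluate once
# on the test set to estimate real-world performance.
best = candidates[max(cv_f1, key=cv_f1.get)]
best.fit(X_train, y_train)
print("Test accuracy:", round(best.score(X_test, y_test), 3))
```

Note that the test set is used exactly once, after all model selection is done, so the reported score is not biased by the tuning process.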

war_wick