I am trying to implement an ML algorithm using 10-fold cross-validation, and I would like confirmation that my procedure is correct.
I am doing binary classification and have about 50 samples of each class in each of the 10 folders that I created, called fold 1, fold 2, and so on.
My sklearn command is:
x_train, x_test, y_train, y_test = train_test_split(X, yy, test_size=0.3, random_state=1000)
Am I totally wrong here, and is this procedure actually just doing a single 30% test / 70% train split? For 10-fold cross-validation, should I instead be using:
from sklearn.model_selection import KFold
kf = KFold(n_splits=2, random_state=42, shuffle=True)
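To make the comparison concrete, here is a minimal sketch of how I understand the two approaches to differ. The synthetic data from make_classification and the LogisticRegression model are placeholders I am assuming for illustration, not my real data or model. I have also set n_splits=10 here, since as far as I can tell n_splits=2 would only give 2 folds.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split

# Placeholder data (assumption): 1000 samples standing in for my real X and yy
X, yy = make_classification(n_samples=1000, n_features=20, random_state=0)

# Single hold-out split: one 70% train set and one 30% test set, no folds
x_train, x_test, y_train, y_test = train_test_split(
    X, yy, test_size=0.3, random_state=1000
)
holdout_sizes = (len(x_train), len(x_test))

# 10-fold cross-validation: every sample is used for testing exactly once
kf = KFold(n_splits=10, shuffle=True, random_state=42)
fold_sizes = [(len(train_idx), len(test_idx)) for train_idx, test_idx in kf.split(X)]

# Placeholder model (assumption); cross_val_score fits and scores it once per fold
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, yy, cv=kf)

print("hold-out sizes:", holdout_sizes)
print("fold sizes:", fold_sizes)
print("mean accuracy over 10 folds:", scores.mean())
```

If this is right, train_test_split gives one split, while KFold yields ten train/test index pairs and cross_val_score returns one score per fold. For binary classification, StratifiedKFold might be the better choice, since it keeps the class ratio roughly equal in each fold.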
Thanks!