I am trying to divide my dataset into three equal parts using scikit-learn. But when I use StratifiedKFold to do it, printing the result only shows the call I made to partition the dataset, rather than the partitions themselves:

from sklearn.model_selection import StratifiedKFold
partition = StratifiedKFold(n_splits = 3, shuffle = True, random_state = None)
print(partition)

I am still new to Python libraries, so I am not sure how to do it.

desertnaut
    Please do not paste code snippets as screenshots; write them out in the question. You only initialized the object, but you didn't call the method that produces the splits. Look at the example section [here](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.StratifiedKFold.html). – null Apr 14 '21 at 08:29

1 Answer

The second line of your code only creates a StratifiedKFold object; it does not partition your data by itself. You have to call that object's split method on your data to get the partitions, as in this example:

from sklearn.model_selection import StratifiedKFold

partition = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)

# x and y are assumed to be NumPy arrays (features and class labels)
for train_index, test_index in partition.split(x, y):
    x_train_f, x_test_f = x[train_index], x[test_index]
    y_train_f, y_test_f = y[train_index], y[test_index]
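
For a self-contained illustration, here is a minimal sketch on synthetic data. The arrays x and y and the helper name stratified_parts are made up for this example and are not from the question. The test indices of the three folds never overlap and together cover the whole dataset, so they can serve as the three roughly equal, class-balanced parts.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def stratified_parts(x, y, n_parts=3, seed=1):
    """Return a list of index arrays, one per stratified part (hypothetical helper)."""
    skf = StratifiedKFold(n_splits=n_parts, shuffle=True, random_state=seed)
    # Each fold's test indices form one part; the folds never overlap.
    return [test_index for _, test_index in skf.split(x, y)]


# Synthetic data: 12 samples, two classes with 6 samples each.
x = np.arange(24).reshape(12, 2)
y = np.array([0] * 6 + [1] * 6)

parts = stratified_parts(x, y)
for i, idx in enumerate(parts):
    print(f"part {i}: indices={idx}, labels={y[idx]}")
```

With this toy data each part should hold 4 samples, 2 from each class. Indexing x[idx] and y[idx] with one of the returned arrays gives the corresponding part of the data.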

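Splitting data into three parts has already been answered elsewhere. Note that the snippet below slices the data in order, without shuffling or stratification, into 70% / 10% / 20% portions rather than equal thirds: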

import numpy as np

X_train, X_test, X_validate = np.split(X, [int(.7*len(X)), int(.8*len(X))])
y_train, y_test, y_validate = np.split(y, [int(.7*len(y)), int(.8*len(y))])
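
If three equal parts without stratification are enough, one possible sketch is to shuffle the indices once and cut them with np.array_split, which also copes with lengths that are not divisible by three. The data and the seed here are invented for illustration.

```python
import numpy as np

# Synthetic data, invented for this example.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

rng = np.random.default_rng(0)
order = rng.permutation(len(X))  # shuffle once, reuse the same order for X and y

# array_split allows uneven sizes: 10 samples become parts of 4, 3 and 3.
index_parts = np.array_split(order, 3)
X_parts = [X[idx] for idx in index_parts]
y_parts = [y[idx] for idx in index_parts]

print([len(p) for p in X_parts])  # [4, 3, 3]
```

Unlike StratifiedKFold, this does not look at the class labels, so the class proportions can differ between the parts.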
Rina