I want to split my dataset in a non-IID way. Can anyone help? Is there a Python solution? I wrote the code below, but I don't know how to make the split non-IID.
import openml

DNA_openml = openml.datasets.get_dataset(40670)
Xy, _, _, _ = DNA_openml.get_data(dataset_format="array")
X = Xy[:, :-1] # the last column contains labels
y = Xy[:, -1]
# The first 3000 samples form the training set; the rest form the test set
x_train, y_train = X[:3000], y[:3000]
x_test, y_test = X[3000:], y[3000:]
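
One possible direction, as a minimal sketch only: it assumes "non-IID" here means label skew across several clients (as in federated learning), which the question does not state. For each class, the samples are divided among the clients using proportions drawn from a Dirichlet distribution, so each client ends up with a different class mix. The function name dirichlet_label_split and the parameters n_clients, alpha, and seed are hypothetical, and the labels below are random stand-ins so the snippet runs without downloading the dataset.

```python
import numpy as np


def dirichlet_label_split(y, n_clients=5, alpha=0.5, seed=0):
    """Partition sample indices into n_clients label-skewed (non-IID) shards.

    Hypothetical helper: returns a list of index arrays, one per client.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    client_indices = [[] for _ in range(n_clients)]

    for label in np.unique(y):
        # All samples of this class, in random order.
        idx = np.flatnonzero(y == label)
        rng.shuffle(idx)
        # Share of this class given to each client; smaller alpha -> more skew.
        proportions = rng.dirichlet(alpha * np.ones(n_clients))
        # Convert the shares into cut points inside idx.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())

    return [np.array(sorted(ix), dtype=int) for ix in client_indices]


# Stand-in labels (3 classes, 3000 samples) so the sketch is self-contained;
# with the code above, y_train would be passed instead.
demo_y = np.random.default_rng(1).integers(0, 3, size=3000)
parts = dirichlet_label_split(demo_y, n_clients=5, alpha=0.5)

for k, ix in enumerate(parts):
    # Per-client size and class counts, to inspect how skewed the split is.
    print(k, len(ix), np.bincount(demo_y[ix], minlength=3))
```

With the variables from the question, client k's data would then be x_train[parts[k]] and y_train[parts[k]]. A large alpha (say 100) gives nearly IID shards, while a small alpha (say 0.1) concentrates each class on a few clients.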