I'm using a custom TensorFlow model for an imbalanced classification problem. For this I need to split the data into a train set and a test set, and then split the train set into batches. However, the batches need to be stratified because of the class imbalance. For now I'm doing it like this:
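    import sklearn.model_selection as skmodel  # assumed alias behind "skmodel"
    import tensorflow as tf

    # Stratified train/test split: both parts keep the class ratio of y_new.
    X_train, X_test, y_train, y_test = skmodel.train_test_split(
        Xscaled, y_new, test_size=0.2, stratify=y_new)
    # Shuffle over the whole train set, then cut it into consecutive batches.
    dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train)).shuffle(
        X_train.shape[0]).batch(batch_size)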
But I am not sure whether the batches in dataset are stratified. If they are not, how can I make sure that they are?
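
To make the concern concrete, here is a minimal sketch in plain Python (no TensorFlow involved) of what I mean. It assumes that shuffling with a buffer as large as the train set and then batching behaves like one uniform shuffle cut into consecutive chunks, so the class ratio per batch would only match on average. The function names, the toy labels and the round-robin scheme are all hypothetical illustrations, not something taken from TensorFlow or scikit-learn:

```python
import random
from collections import defaultdict


def shuffled_batches(labels, batch_size, seed=0):
    """Mimic a full shuffle followed by batching: shuffle once, cut into chunks."""
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    return [idx[i:i + batch_size] for i in range(0, len(idx), batch_size)]


def stratified_batches(labels, batch_size, seed=0):
    """Deal each class's shuffled indices round-robin over the batches.

    Every batch then receives either floor or ceil of (class count / number
    of batches) samples of each class. Batch sizes come out balanced and
    never exceed batch_size, instead of "all full except the last one".
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    n_batches = -(-len(labels) // batch_size)  # ceiling division
    batches = [[] for _ in range(n_batches)]
    pos = 0  # keeps counting across classes so batch sizes stay balanced
    for y in sorted(by_class):
        members = by_class[y]
        rng.shuffle(members)
        for i in members:
            batches[pos % n_batches].append(i)
            pos += 1
    for b in batches:
        rng.shuffle(b)  # avoid a fixed class order inside each batch
    return batches


def minority_counts(labels, batches, minority=1):
    """Number of minority-class samples in each batch."""
    return [sum(1 for i in b if labels[i] == minority) for b in batches]


# Toy labels: 900 samples of class 0 and 100 of class 1 (10% minority).
labels = [0] * 900 + [1] * 100

# Plain shuffle + batch: the minority count per batch varies from batch to batch.
print(minority_counts(labels, shuffled_batches(labels, 50)))

# Round-robin dealing: every batch of 50 holds exactly 5 minority samples.
print(minority_counts(labels, stratified_batches(labels, 50)))
```

If something like this is the right direction, I assume the index batches could be fed to the model by slicing X_train and y_train with them, for example through tf.data.Dataset.from_generator, but I have not verified that this is the recommended way.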