I'm currently working on a 2D CNN in Keras for MRI classification. The class ratio is about 60/40. I have 155 patients, each with one MRI consisting of around 180 slices. The input to the CNN is a single MRI slice (256×256 px), so the total input is ~27,900 images of 256×256 pixels each.
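As a quick sanity check on those numbers, here is a rough sketch of the sizes involved per fold. It assumes exactly 180 slices per patient (the real count is only approximately 180) and uses the 10 folds and the batch size of 200 from the code further down. The variable names are just for illustration.

```python
import math

n_patients = 155
slices_per_patient = 180   # assumption: the real number only averages around 180
n_folds = 10
batch_size = 200

n_images = n_patients * slices_per_patient      # total number of slices
n_test = n_images // n_folds                    # slices held out in one fold
n_train = n_images - n_test                     # slices used for training in that fold
train_steps = math.ceil(n_train / batch_size)   # batches per training epoch
test_steps = math.ceil(n_test / batch_size)     # batches per validation pass

print(n_images, n_train, n_test, train_steps, test_steps)
# prints: 27900 25110 2790 126 14
```

So each held-out fold contains roughly 2,790 slices, which corresponds to about 15 patients' worth of slices.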
I tested different models and always evaluated them with shuffled, stratified 10-fold cross-validation and an EarlyStopping monitor. They all performed very well, with around 95% to 98% validation accuracy. But every time, one or two folds perform a lot worse than the others (70% to 80% validation accuracy). Since the folds are randomized, I would expect all folds to perform equally well.
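To put a number on "a lot worse", the per-fold results could be summarized roughly like this. This is only a sketch: `summarize_folds` and the threshold `n_std` are names I made up for illustration, and the accuracies in the example list are placeholder values in the ranges described above, not my measured results.

```python
from statistics import mean, stdev

def summarize_folds(fold_accuracies, n_std=1.5):
    """Return the mean, the sample standard deviation, and the indices of
    folds whose accuracy lies more than n_std standard deviations below the mean."""
    avg = mean(fold_accuracies)
    spread = stdev(fold_accuracies)
    low_folds = [i for i, acc in enumerate(fold_accuracies)
                 if acc < avg - n_std * spread]
    return avg, spread, low_folds

# Placeholder values for illustration only (NOT measured results)
example_accuracies = [0.97, 0.96, 0.98, 0.95, 0.75, 0.97, 0.96, 0.98, 0.97, 0.96]
avg, spread, low_folds = summarize_folds(example_accuracies)
print(f"mean={avg:.3f} std={spread:.3f} low folds={low_folds}")
```

With these placeholder values, only the fold at index 4 is flagged, which matches the pattern of one or two outlier folds.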
Can somebody explain how this could happen and how to prevent it?
Plots for accuracy and loss:
[Plot: train accuracy and validation accuracy]
[Plot: train loss and validation loss]
This is part of one of the models:
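import numpy as np
from sklearn.model_selection import StratifiedKFold
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense

num_classes = 2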
img_size = 256
batch_size = 200
# Because of EarlyStopping monitor, the number of epochs doesn't really matter
num_epochs = 1000
kfold_splits = 10
skf = StratifiedKFold(n_splits=kfold_splits, shuffle=True)
# Here the data is split
for index, (train_index, test_index) in enumerate(skf.split(x_data_paths, y_data_paths)):
    x_train, x_test = np.array(x_data_paths)[train_index.astype(int)], np.array(x_data_paths)[test_index.astype(int)]
    y_train, y_test = np.array(y_data_paths)[train_index.astype(int)], np.array(y_data_paths)[test_index.astype(int)]

    # y_train_one_hot / y_test_one_hot are the one-hot encoded versions of
    # y_train / y_test (the encoding step is not shown here).
    # BcMRISequence is the batch generator class (definition not shown here).
    training_batch_generator = BcMRISequence(x_train, y_train_one_hot, batch_size)
    test_batch_generator = BcMRISequence(x_test, y_test_one_hot, batch_size)

    # region Create model (using the functional API)
    inputs = Input(shape=(img_size, img_size, 1))
    conv1 = Conv2D(64, kernel_size=5, strides=1, activation='relu')(inputs)
    pool1 = MaxPooling2D(pool_size=3, strides=(2, 2), padding='valid')(conv1)
    conv2 = Conv2D(32, kernel_size=3, activation='relu')(pool1)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(16, kernel_size=3, activation='relu')(pool2)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    flat = Flatten()(pool3)
    hidden1 = Dense(10, activation='relu')(flat)
    output = Dense(num_classes, activation='softmax')(hidden1)
    model = Model(inputs=inputs, outputs=output)
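One thing that might be worth checking, given that the split above is made over individual slices rather than patients, is how many patients end up with slices in both the training and the test part of a fold. The sketch below is only an illustration: it assumes `x_data_paths` has one entry per slice, and `patient_ids` is a hypothetical array (not part of my code above) holding the patient ID for each slice in the same order. The toy split uses a plain shuffled split instead of `StratifiedKFold` to keep the example self-contained.

```python
import numpy as np

def shared_patients(patient_ids, train_index, test_index):
    """Return the sorted patient IDs that have slices in both the
    training part and the test part of one split."""
    patient_ids = np.asarray(patient_ids)
    return np.intersect1d(patient_ids[train_index], patient_ids[test_index])

# Toy data: 5 hypothetical patients with 4 slices each
patient_ids = np.repeat(np.arange(5), 4)

# Slice-level shuffled split (stand-in for one fold of the loop above)
rng = np.random.default_rng(0)
order = rng.permutation(len(patient_ids))
test_index, train_index = order[:4], order[4:]

overlap = shared_patients(patient_ids, train_index, test_index)
n_test_patients = len(np.unique(patient_ids[test_index]))
print(f"{len(overlap)} of {n_test_patients} test patients also have slices in training")
```

Inside the cross-validation loop, the same function could be called with the `train_index` and `test_index` returned by `skf.split(...)`, and the result compared between the well-performing and the poorly performing folds.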