Any idea why our training loss is smooth while our validation loss is that noisy (see the link) across epochs? We are implementing a deep learning model for diabetic retinopathy detection (binary classification) using the dataset of fundus photographs provided by this Kaggle competition. We are using Keras 2.0 with the TensorFlow backend.
As the dataset is too big to fit in memory, we are using fit_generator, with an ImageDataGenerator randomly taking images from the training and validation folders:
# TRAIN THE MODEL
model.fit_generator(
    train_generator,
    steps_per_epoch=train_generator.samples // training_batch_size,
    epochs=int(config['training']['epochs']),
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // validation_batch_size,
    class_weight=None)
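For context, the generators are built roughly as in the sketch below with flow_from_directory (the directory paths, image size, and batch sizes shown here are placeholders, not our exact configuration):

from keras.preprocessing.image import ImageDataGenerator

# Placeholder values -- not our exact configuration
image_size = (224, 224)
training_batch_size = 32
validation_batch_size = 32

# Augmentation (horizontal/vertical flips) plus normalization with the
# training-set mean and std; the featurewise statistics are fitted on a
# sample of training images beforehand via datagen.fit(...)
train_datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    horizontal_flip=True,
    vertical_flip=True)

# Validation images get the same normalization but no augmentation
validation_datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True)

train_generator = train_datagen.flow_from_directory(
    'data/train',                      # placeholder path
    target_size=image_size,
    batch_size=training_batch_size,
    class_mode='binary')

validation_generator = validation_datagen.flow_from_directory(
    'data/validation',                 # placeholder path
    target_size=image_size,
    batch_size=validation_batch_size,
    class_mode='binary')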
Our CNN architecture is VGG16 with dropout = 0.5 in the last two fully connected layers, batch normalization only before the first fully connected layer, and data augmentation (consisting of flipping the images horizontally and vertically). Our training and validation samples are normalized using the training set mean and standard deviation. The batch size is 32. Our output activation is a sigmoid and the loss function is binary_crossentropy. You can find our implementation on GitHub.
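For completeness, here is a rough sketch of the model as described (the input shape, the number of units in the fully connected layers, and the optimizer are assumptions for illustration, not our exact settings):

from keras.applications import VGG16
from keras.models import Model
from keras.layers import Flatten, Dense, Dropout, BatchNormalization

# VGG16 convolutional base (ImageNet weights and 224x224 RGB input assumed)
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

x = Flatten()(base.output)
x = BatchNormalization()(x)              # batch norm only before the first FC layer
x = Dense(4096, activation='relu')(x)    # FC layer sizes are assumptions
x = Dropout(0.5)(x)
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(1, activation='sigmoid')(x)  # binary output

model = Model(inputs=base.input, outputs=predictions)
model.compile(optimizer='adam',          # optimizer is an assumption
              loss='binary_crossentropy',
              metrics=['accuracy'])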
It definitely has nothing to do with overfitting, as we tried a highly regularized model and the behavior was much the same. Is it related to the sampling from the validation set? Have any of you had a similar problem before?
Thanks!!