I'm trying to build a CNN that can detect COVID-19 from chest X-rays. I'm using this Kaggle dataset, which has roughly 27k images; I'm only using the COVID and NORMAL classes.
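In case it's relevant, this is roughly how I prepare data_dir. The COVID and NORMAL folder names come from the dataset, but the paths here are placeholders for my local setup:

import shutil
from pathlib import Path

# Copy only the two class folders I use into data_dir
# (src is a placeholder for wherever the Kaggle archive was extracted)
src = Path("covid19-radiography-database")
data_dir = Path("data")
data_dir.mkdir(exist_ok=True)
for class_name in ["COVID", "NORMAL"]:
    shutil.copytree(src / class_name, data_dir / class_name, dirs_exist_ok=True)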
I started by following the Keras image classification tutorial, and after some tweaks I ended up with something like this:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

batch_size = 16
img_height = 160
img_width = 160
img_size = (img_height, img_width)
seed_train_validation = 1  # same seed for both subsets so the split lines up
shuffle_value = True
validation_split = 0.3
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    image_size=img_size,
    batch_size=batch_size,
    validation_split=validation_split,
    subset="training",
    seed=seed_train_validation,
    color_mode="grayscale",
    shuffle=shuffle_value
)
val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    image_size=img_size,
    batch_size=batch_size,
    validation_split=validation_split,
    subset="validation",
    seed=seed_train_validation,
    color_mode="grayscale",
    shuffle=shuffle_value
)
class_names = train_ds.class_names  # grab this before the datasets are transformed below

# Split two thirds of the validation batches off as a held-out test set
val_batches = tf.data.experimental.cardinality(val_ds)
test_ds = val_ds.take((2 * val_batches) // 3)
val_ds = val_ds.skip((2 * val_batches) // 3)
# Cache and prefetch for input-pipeline performance
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
resize_and_rescale = tf.keras.Sequential([
    layers.Resizing(img_height, img_width),
    layers.Rescaling(1./255)  # scale pixel values from [0, 255] to [0, 1]
])
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal_and_vertical"),
    layers.RandomRotation(0.2),
    layers.RandomZoom(0.1)
])
num_classes = len(class_names)

model_1 = Sequential([
    resize_and_rescale,
    data_augmentation,  # augmentation layers are only active during training
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Dropout(0.2),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes)  # outputs logits; the loss applies softmax via from_logits=True
])
model_1.compile(optimizer="adam",
                loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])
epochs = 75
history = model_1.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)
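My snippet never actually touches test_ds; if it matters, a minimal way to check it after training would be a plain evaluate call:

# Evaluate on the test batches split off from the validation set earlier
test_loss, test_acc = model_1.evaluate(test_ds)
print(f"test accuracy: {test_acc:.3f}")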
If I train for fewer epochs, say 10, the accuracy and loss graphs show healthy learning curves. However, if I increase the number of epochs, I get some weird graphs like the ones below:
Results after training for 75 epochs
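For reference, this is roughly how I make those plots from the history object (standard matplotlib):

import matplotlib.pyplot as plt

# Plot training vs. validation accuracy and loss from the History object
epochs_range = range(len(history.history['accuracy']))

plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, history.history['accuracy'], label='train accuracy')
plt.plot(epochs_range, history.history['val_accuracy'], label='val accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(epochs_range, history.history['loss'], label='train loss')
plt.plot(epochs_range, history.history['val_loss'], label='val loss')
plt.legend()
plt.show()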
It looks like my model is overfitting, but I don't have enough experience to conclude that for sure. I read that adding data augmentation and a dropout layer works for most people, and as shown above I have already introduced both, yet I don't get better results no matter what. Any tips?