
I'm trying to train a deep learning model to classify different ASL hand signs using MobileNet v2 and Inception.

Here is the code that creates an ImageDataGenerator for the training and validation sets.

# Reformat Images and Create Batches
import tensorflow as tf

IMAGE_RES = 224
BATCH_SIZE = 32

# Rescale pixel values to [0, 1] and reserve 40% of the images for validation
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    validation_split=0.4
)

train_generator = datagen.flow_from_directory(
    base_dir,
    target_size=(IMAGE_RES, IMAGE_RES),
    batch_size=BATCH_SIZE,
    subset='training'
)

val_generator = datagen.flow_from_directory(
    base_dir,
    target_size=(IMAGE_RES, IMAGE_RES),
    batch_size=BATCH_SIZE,
    subset='validation'
)

Here is the code to train the model:

# Do transfer learning with TensorFlow Hub
import tensorflow_hub as hub
from tensorflow.keras import layers

URL = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"
feature_extractor = hub.KerasLayer(URL,
                                   input_shape=(IMAGE_RES, IMAGE_RES, 3))
# Freeze the pre-trained feature extractor
feature_extractor.trainable = False

# Attach a classification head
model = tf.keras.Sequential([
  feature_extractor,
  layers.Dense(5, activation='softmax')
])

model.summary()

# Train the model
model.compile(
  optimizer='adam',
  loss='categorical_crossentropy',
  metrics=['accuracy'])

EPOCHS = 5

history = model.fit(train_generator,
                    steps_per_epoch=len(train_generator),
                    epochs=EPOCHS,
                    validation_data=val_generator,
                    validation_steps=len(val_generator))
Epoch 1/5
94/94 [==============================] - 19s 199ms/step - loss: 0.7333 - accuracy: 0.7730 - val_loss: 0.6276 - val_accuracy: 0.7705

Epoch 2/5
94/94 [==============================] - 18s 190ms/step - loss: 0.1574 - accuracy: 0.9893 - val_loss: 0.5118 - val_accuracy: 0.8145

Epoch 3/5
94/94 [==============================] - 18s 191ms/step - loss: 0.0783 - accuracy: 0.9980 - val_loss: 0.4850 - val_accuracy: 0.8235

Epoch 4/5
94/94 [==============================] - 18s 196ms/step - loss: 0.0492 - accuracy: 0.9997 - val_loss: 0.4541 - val_accuracy: 0.8395

Epoch 5/5
94/94 [==============================] - 18s 193ms/step - loss: 0.0349 - accuracy: 0.9997 - val_loss: 0.4590 - val_accuracy: 0.8365

I've tried using data augmentation, but the model still overfits, so I'm wondering if I've done something wrong in my code.
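
For reference, the augmentation I tried looks roughly like this (a sketch; only the training generator is augmented, the validation generator keeps just the rescaling, and the flow_from_directory calls stay the same apart from each using its own datagen):

# Augmented generator for the training subset
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
    zoom_range=0.2,
    validation_split=0.4
)

# Validation images are only rescaled, never augmented
val_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    validation_split=0.4
)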

  • What methods of data augmentation are you using? What are the sizes of your train/validation/test sets? – gmiles Aug 15 '20 at 08:03
  • Hi, I'm using 70% of my data for training and the other 30% for validation. I've tried using the following for data augmentation: rotation_range=15, width_shift_range=.1, height_shift_range=.1, horizontal_flip = True, zoom_range=0.2. – junsiong2008 Aug 16 '20 at 09:08
  • After doing data augmentation and training for 10 epochs, my training accuracy is 0.9997 and val_accuracy is 0.8365. – junsiong2008 Aug 16 '20 at 09:12

2 Answers


Your dataset is very small. Try splitting it with different random seeds and check whether the problem persists.
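
For example (a sketch, assuming the same base_dir layout with one sub-folder per class), tf.keras.preprocessing.image_dataset_from_directory shuffles the file list with a seed before splitting, so changing the seed gives a different train/validation split (in newer TensorFlow versions this helper lives under tf.keras.utils):

# Re-split with a different random seed (change `seed` and re-train to compare)
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    base_dir,
    validation_split=0.4,
    subset='training',
    seed=123,
    label_mode='categorical',          # one-hot labels, matching categorical_crossentropy
    image_size=(IMAGE_RES, IMAGE_RES),
    batch_size=BATCH_SIZE)

val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    base_dir,
    validation_split=0.4,
    subset='validation',
    seed=123,                          # must match the training seed
    label_mode='categorical',
    image_size=(IMAGE_RES, IMAGE_RES),
    batch_size=BATCH_SIZE)

# This utility does not rescale pixels, so divide by 255 manually
train_ds = train_ds.map(lambda x, y: (x / 255.0, y))
val_ds = val_ds.map(lambda x, y: (x / 255.0, y))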

If it does, add regularization and decrease the complexity of the neural network.
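
With the classification head from the question, that could look something like this (a sketch; the dropout rate and L2 factor are just starting points to tune):

import tensorflow as tf
from tensorflow.keras import layers, regularizers

model = tf.keras.Sequential([
    feature_extractor,               # the frozen TF Hub feature extractor from the question
    layers.Dropout(0.3),             # randomly drop 30% of activations during training
    layers.Dense(5, activation='softmax',
                 kernel_regularizer=regularizers.l2(1e-4))  # small L2 penalty on the head weights
])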

Also experiment with different optimizers and a smaller learning rate (try a learning-rate scheduler).
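
A minimal sketch with Adam and an exponentially decaying learning rate (the initial rate and decay settings are assumptions to tune):

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-4,      # lower than Adam's default of 1e-3
    decay_steps=500,                 # decay every 500 training steps
    decay_rate=0.9)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=lr_schedule),
    loss='categorical_crossentropy',
    metrics=['accuracy'])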

– Vijeth Rai

It seems your dataset is very small, with some true outputs separated by only a small distance in the input space on the input-output curve, which is why the model fits those points so easily.

– Granth