MNIST and transfer learning with VGG16 in Keras- low validation accuracy

Question

I recently started taking advantage of Keras's flow_from_dataframe() feature for a project, and decided to test it with the MNIST dataset. I have a directory full of the MNIST samples in png format, and a dataframe with the absolute directory for each in one column and the label in the other.

I'm also using transfer learning, importing VGG16 as a base, and adding my own 512 node relu dense layer and 0.5 drop-out before a softmax layer of 10. (For digits 0-9). I'm using rmsprop (lr=1e-4) as the optimizer.

When I launch my environment, it calls the latest version of keras_preprocessing from Git, which has support for absolute directories and capitalized file extensions.

My problem is that I have a very high training accuracy, and a terribly low validation accuracy. By my final epoch (10), I had a training accuracy of 0.94 and a validation accuracy of 0.01.

I'm wondering if there's something fundamentally wrong with my script? With another dataset, I'm even getting NaNs for both my training and validation loss values after epoch 4. (I checked the relevant columns, there aren't any null values!)

Here's my code. I'd be deeply appreciative is someone could glance through it and see if anything jumped out at them.

import pandas as pd
import numpy as np

import keras
from keras_preprocessing.image import ImageDataGenerator

from keras import applications
from keras import optimizers
from keras.models import Model 
from keras.layers import Dropout, Flatten, Dense, GlobalAveragePooling2D
from keras import backend as k 
from keras.callbacks import ModelCheckpoint, CSVLogger

from keras.applications.vgg16 import VGG16, preprocess_input

# INITIALIZE MODEL

img_width, img_height = 32, 32
model = VGG16(weights = 'imagenet', include_top=False, input_shape = (img_width, img_height, 3))

# freeze all layers
for layer in model.layers:
    layer.trainable = False

# Adding custom Layers 
x = model.output
x = Flatten()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(10, activation="softmax")(x)

# creating the final model 
model_final = Model(input = model.input, output = predictions)

# compile the model 
rms = optimizers.RMSprop(lr=1e-4)
#adadelta = optimizers.Adadelta(lr=0.001, rho=0.5, epsilon=None, decay=0.0)

model_final.compile(loss = "categorical_crossentropy", optimizer = rms, metrics=["accuracy"])

# LOAD AND DEFINE SOURCE DATA

train = pd.read_csv('MNIST_train.csv', index_col=0)
val = pd.read_csv('MNIST_test.csv', index_col=0)

nb_train_samples = 60000
nb_validation_samples = 10000
batch_size = 60
epochs = 10

# Initiate the train and test generators
train_datagen = ImageDataGenerator()
test_datagen = ImageDataGenerator()

train_generator = train_datagen.flow_from_dataframe(dataframe=train,
                                                    directory=None,
                                                    x_col='train_samples',
                                                    y_col='train_labels',
                                                    has_ext=True,
                                                    target_size = (img_height,
                                                                   img_width),
                                                    batch_size = batch_size, 
                                                    class_mode = 'categorical',
                                                    color_mode = 'rgb')

validation_generator = test_datagen.flow_from_dataframe(dataframe=val,
                                                        directory=None,
                                                        x_col='test_samples',
                                                        y_col='test_labels',
                                                        has_ext=True,
                                                        target_size = (img_height, 
                                                                       img_width),
                                                        batch_size = batch_size, 
                                                        class_mode = 'categorical',
                                                        color_mode = 'rgb')

# GET CLASS INDICES
print('****************')
for cls, idx in train_generator.class_indices.items():
    print('Class #{} = {}'.format(idx, cls))
print('****************')

# DEFINE CALLBACKS

path = './chk/epoch_{epoch:02d}-valLoss_{val_loss:.2f}-valAcc_{val_acc:.2f}.hdf5'

chk = ModelCheckpoint(path, monitor = 'val_acc', verbose = 1, save_best_only = True, mode = 'max')

logger = CSVLogger('./chk/training_log.csv', separator = ',', append=False)

nPlus = 1
samples_per_epoch = nb_train_samples * nPlus

# Train the model 
model_final.fit_generator(train_generator,
                          steps_per_epoch = int(samples_per_epoch/batch_size),
                          epochs = epochs,
                          validation_data = validation_generator,
                          validation_steps = int(nb_validation_samples/batch_size),
                          callbacks = [chk, logger])

score 2 · Answer 1 · edited Feb 05 '19 at 04:38

2

Have you tried explicitly defining the classes of the images? as such:

train_generator=image.ImageDataGenerator().flow_from_dataframe(classes=[0,1,2,3,4,5,6,7,8,9])

in both the train and validation generators.

I have found that sometimes the train and validation generators create different correspondence dictionaries.

edited Feb 05 '19 at 04:38

SivolcC

3,258
2
14
32

answered Feb 05 '19 at 01:54

Jair Garza

31
5

2

Good suggestion. This is related to the fact that the internal class mapping seems to depend on the order in which the classes occur in the data. See this question for instance: https://stackoverflow.com/questions/54377389/keras-imagedatagenerator-why-are-the-outputs-of-my-cnn-reversed/54387405#54387405 – sdcbr Feb 05 '19 at 07:09
1

That ended up be the solution I found. Thank you, Jair! When using flow_from_dataframe(), classes MUST be explicitly defined like you showed. I haven't checked the documentation lately, but that should be more explicit. – MahiMai Feb 11 '19 at 16:44

MNIST and transfer learning with VGG16 in Keras- low validation accuracy

1 Answers1

Linked