Categorical image classification always predicts one class, though calculated accuracy reaches 100%

Question

I followed the Keras cat/dog image classification tutorial (Keras Image Classification tutorial) and got results similar to the reported values. I then took the code from the first example in that tutorial (Tutorial Example 1 code), altered a few lines, and trained the model on a dataset of grayscale images (~150 thousand images across 7 classes).

This gave me great initial results (~84% accuracy), which I am happy with.

Next I tried implementing the image batch generator myself, which is where I am having trouble. Briefly, the code seems to run well, except that the reported accuracy of the model quickly shoots to >= 99% within two epochs. Given the noise in the dataset, this level of accuracy is not believable. After using the trained model to predict a new batch of data (images outside of the training or validation dataset), I find the model always predicts the first class (i.e. [1.,0.,0.,0.,0.,0.,0.]). It looks as if the loss function is forcing the model to predict a single class 100% of the time, even though the labels I pass in are distributed across all the classes.

After 28 epochs of training, I see the following output:

320/320 [==============================] - 1114s - loss: 1.5820e-07 - categorical_accuracy: 1.0000 - sparse_categorical_accuracy: 0.0000e+00 - val_loss: 16.1181 - val_categorical_accuracy: 0.0000e+00 - val_sparse_categorical_accuracy: 0.0000e+00

When I compare the batch generator output from the tutorial code with the output of my own batch generator, the shape, datatype, and range of values are identical. I would like to emphasize that my generator passes y labels from every category, not just array([ 1., 0., 0., 0., 0., 0., 0.], dtype=float32). Therefore, I am lost as to what I am doing incorrectly.
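
A comparison of this kind can be sketched as follows. This is a minimal sketch, not the code actually used for the comparison: `summarize_batch` and `class_counts_over` are hypothetical helper names, and the labels are assumed to be one-hot rows like those in `label_decode`.

```python
import numpy as np


def summarize_batch(x, y):
    """Report shape, dtype, value range and per-class label counts for one batch."""
    x = np.asarray(x)
    y = np.asarray(y)
    # argmax turns each one-hot row back into a class index
    counts = np.bincount(y.argmax(axis=1), minlength=y.shape[1])
    return {
        "x_shape": x.shape,
        "x_dtype": str(x.dtype),
        "x_range": (float(x.min()), float(x.max())),
        "class_counts": counts.tolist(),
    }


def class_counts_over(batch_gen, n_batches):
    """Sum per-class label counts over n_batches (>= 1) draws from a generator."""
    total = None
    for _ in range(n_batches):
        _, y = next(batch_gen)
        y = np.asarray(y)
        counts = np.bincount(y.argmax(axis=1), minlength=y.shape[1])
        total = counts if total is None else total + counts
    return total.tolist()
```

Counting over many batches rather than one is deliberate: a generator that keeps state between iterations can look healthy on its first `next()` call and still behave differently later.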

Since I posted this code several days ago, I have used the default Keras image generator, and successfully trained the network on the same dataset and same network architecture. Therefore, something about how I load and pass the data in the generator must be incorrect.

Here is the code I implemented:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.optimizers import SGD
from keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
import imgaug as ia
from imgaug import augmenters as iaa
import numpy as np
import numpy.random as nprand
import imageio
import os, re, random, sys, csv
import scipy

img_width, img_height = 112, 112
input_shape = (img_width,img_height,1)
batch_size = 200
epochs = 2

train_image_directory = '/PATH/To/Directory/train/'
valid_image_directory = '/PATH/To/Directory/validate/'
video_info_file = '/PATH/To/Directory/train_labels.csv'
train_image_paths = [train_image_directory + m.group(1) for m in [re.match(r"(\d+_\d+\.png)", fname) for fname in os.listdir(train_image_directory)] if m is not None]
valid_image_paths = [valid_image_directory + m.group(1) for m in [re.match(r"(\d+_\d+\.png)", fname) for fname in os.listdir(valid_image_directory)] if m is not None]

num_train_images = len(train_image_paths)
num_val_images = len(valid_image_paths)
label_map = {}
label_decode = {
        '0': [1.,0.,0.,0.,0.,0.,0.],
        '1': [0.,1.,0.,0.,0.,0.,0.],
        '2': [0.,0.,1.,0.,0.,0.,0.],
        '3': [0.,0.,0.,1.,0.,0.,0.],
        '4': [0.,0.,0.,0.,1.,0.,0.],
        '5': [0.,0.,0.,0.,0.,1.,0.],
        '6': [0.,0.,0.,0.,0.,0.,1.]
        }

with open(video_info_file) as f:
        reader = csv.reader(f)
        for row in reader:
                key = row[0]
                if key in label_map:
                        pass
                label_map[key] = label_decode[row[1]]

sometimes = lambda aug: iaa.Sometimes(0.5,aug)

seq = iaa.Sequential(
        [
        iaa.Fliplr(0.5),
        iaa.Flipud(0.2),
        sometimes(iaa.Crop(percent=(0, 0.1))),
        sometimes(iaa.Affine(
                scale={"x": (0.8, 1.2), "y": (0.8, 1.2)},
                translate_percent={"x": (-0.2, 0.2), "y": (-0.2, 0.2)},
                rotate=(-5, 5),
                shear=(-16, 16),
                order=[0, 1],
                cval=(0, 1),
                mode=ia.ALL
                )),
        iaa.SomeOf((0, 3),
                    [
                        sometimes(iaa.Superpixels(p_replace=(0, 0.40), n_segments=(20, 100))),

                        iaa.Sharpen(alpha=(0, 1.0), lightness=(0.75, 1.5)),
                        iaa.Emboss(alpha=(0, 1.0), strength=(0, 1.0)),
                        iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05*255)),
                        iaa.OneOf([
                            iaa.Dropout((0.01, 0.1)),
                            iaa.CoarseDropout((0.03, 0.15), size_percent=(0.02, 0.05)),
                        ]),
                        iaa.Invert(0.05),
                        iaa.Add((-10, 10)),
                        iaa.Multiply((0.5, 1.5), per_channel=0.5),
                        iaa.ContrastNormalization((0.5, 2.0)),
                        sometimes(iaa.ElasticTransformation(alpha=(0.5, 1.5), sigma=0.2)),
                        sometimes(iaa.PiecewiseAffine(scale=(0.01, 0.03))) # sometimes move parts of the image around
                    ],
                    random_order=True
                )
        ],
        random_order=True)


def image_data_generator(image_paths, labels, batch_size, training):
        while(1):
                image_paths = nprand.choice(image_paths, batch_size)
                X0 = np.asarray([imageio.imread(x) for x in image_paths])
                Y = np.asarray([labels[x] for x in image_paths],dtype=np.float32)
                if(training):
                        X = np.divide(np.expand_dims(seq.augment_images(X0)[:,:,:,0],axis=3),255.)
                else:
                        X = np.expand_dims(np.divide(X0[:,:,:,0],255.),axis=3)
                X = np.asarray(X,dtype=np.float32)
                yield X,Y

def predict_videos(model,video_paths):
        i=0
        predictions=[]
        while(i < len(video_paths)):
                video_reader = imageio.get_reader(video_paths[i])
                X0 = np.expand_dims([ im[:,:,0] for x,im in enumerate(video_reader) ],axis=3)
                prediction = model.predict(X0)
                i=i+1
                predictions.append(prediction)
        return predictions

train_gen = image_data_generator(train_image_paths,label_map,batch_size,True)
val_gen = image_data_generator(valid_image_paths,label_map,batch_size,False)

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.4))
model.add(Dense(7))
model.add(Activation('softmax'))

model.load_weights('/PATH/To_pretrained_weights/pretrained_model.h5')

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['categorical_accuracy','sparse_categorical_accuracy'])

checkpointer = ModelCheckpoint('/PATH/To_pretrained_weights/pretrained_model.h5', monitor='val_loss', verbose=0, save_best_only=True, save_weights_only=False, mode='auto', period=1)
reduceLR = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=20, verbose=0, mode='auto', cooldown=0, min_lr=0)
early_stop = EarlyStopping(monitor='val_loss', patience=20, verbose=1)
callbacks_list = [checkpointer, early_stop, reduceLR]

model.fit_generator(
        train_gen,
        steps_per_epoch = -(-num_train_images // batch_size),
        epochs=epochs,
        validation_data=val_gen,
        validation_steps = -(-num_val_images // batch_size),
        callbacks=callbacks_list)

Well, if you see a suspiciously high accuracy during training and low performance during validation, that's usually a sign of overfitting. Check the dataset that goes in and make sure it's all valid. If your dataset is being generated and labeled as expected, I'd start looking at overfitting corrections: increasing Dropout, adding some L1/L2 regularization, adjusting batch size/epochs/learning rate, and/or grooming the data a bit better. — Araymer, Jul 10 '17 at 19:37
The dataset is not overfitted; the model is being trained to predict only one class. I think this is an error in my code, not a training problem. I was able to train successfully on the same dataset, with the same parameters, using the default Keras ImageDataGenerator. — JG_Maine, Jul 10 '17 at 19:45
Like I said, verify your dataset first. You might only be assigning one label, or handing it in bad data. If those aren't true, it's probably overfitting. If you jump to near-100% training accuracy, it's entirely possible it might always predict one class on validation data. Not likely, but possible. — Araymer, Jul 10 '17 at 19:54
I called `x,y = next(train_gen)` and examined the x and y variables. y contains all the labels at roughly equal frequency, and x contains correctly augmented images. When I try to predict a batch of training images, the model gives the wrong classification, even though its reported accuracy is "100%". — JG_Maine, Jul 10 '17 at 19:59

score 0 · Answer 1 · answered Aug 07 '17 at 20:42

For some reason that I cannot fully determine, if you do not give the fit_generator function accurate numbers for steps per epoch or steps for validation, the result is inaccurate reporting of the accuracy metric and strange gradient descent steps.
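
As a rough sketch of what "accurate" step counts means here, assuming each step consumes one batch of `batch_size` samples and the last batch may be partial, the count is a ceiling division. `steps_for` is a hypothetical helper name, not a Keras function.

```python
def steps_for(num_samples, batch_size):
    """Batches needed to see num_samples once; the last batch may be partial."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Integer ceiling division, avoids float rounding on large counts
    return (num_samples + batch_size - 1) // batch_size


# Hypothetical usage with the names from the question:
# steps_per_epoch = steps_for(num_train_images, batch_size)
# validation_steps = steps_for(num_val_images, batch_size)
```

For non-negative counts this gives the same value as the `-(-num_train_images // batch_size)` expression in the question's code.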

You can fix this problem by using the `train_on_batch` function in Keras instead of `fit_generator`, or by accurately reporting these step numbers.
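
A minimal sketch of the `train_on_batch` route, under stated assumptions: `batch_path_generator`, `train_one_epoch` and `load_batch` are hypothetical names (`load_batch` stands for whatever turns a list of paths into `(X, Y)` arrays), and `model` is any object exposing Keras's `train_on_batch(x, y)`. Note that the question's generator reassigns `image_paths` on every pass of its loop, so it may be worth checking whether later batches are still drawn from the full list; the sketch below keeps the full list intact and walks through it once per epoch.

```python
import random


def batch_path_generator(all_paths, batch_size, seed=None):
    """Yield lists of paths forever, one shuffled pass over the data per epoch."""
    rng = random.Random(seed)
    paths = list(all_paths)  # private copy; never replaced by a sampled batch
    while True:
        rng.shuffle(paths)
        for start in range(0, len(paths), batch_size):
            yield paths[start:start + batch_size]


def train_one_epoch(model, path_batches, load_batch, steps):
    """Run `steps` manual updates and return what train_on_batch reported."""
    results = []
    for _ in range(steps):
        x, y = load_batch(next(path_batches))  # load_batch is hypothetical
        results.append(model.train_on_batch(x, y))
    return results
```

With `steps` set to the ceiling of the dataset size divided by `batch_size`, each call to `train_one_epoch` sees every path exactly once, which makes the per-epoch numbers easier to interpret than random sampling with replacement.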
