Keras CNN dog/cat classification when an image contains both a dog and a cat

Question

I modified a dog/cat binary classifier into a multi-label classifier by using a sigmoid activation function in the output layer to get a prediction for each individual class, but it does not give the expected results.

I created an image that contains both a dog and a cat.

Expected result: Dog: 70% or more, Cat: 70% or more

Actual result: Dog: 70%, Cat: 25%

Why is it not predicting each individual class with a high score?

import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten
from keras.utils.np_utils import to_categorical
from keras import optimizers

classifier = Sequential()

classifier.add(Conv2D(32, (3, 3), input_shape=(64, 64, 3), activation='relu'))
classifier.add(MaxPooling2D((2, 2)))

classifier.add(Conv2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D((2, 2)))

classifier.add(Conv2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D((2, 2)))

classifier.add(Flatten())

classifier.add(Dense(100, activation='sigmoid'))
classifier.add(Dense(2, activation='sigmoid'))

classifier.compile(optimizer="adam", loss='categorical_crossentropy', metrics=['accuracy'])

from keras.preprocessing.image import ImageDataGenerator

trainingDataOptions = ImageDataGenerator(rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
testingDataOptions = ImageDataGenerator(rescale=1./255)

trainingData = trainingDataOptions.flow_from_directory('dataset/training', target_size=(64, 64), batch_size=32)
testingData = testingDataOptions.flow_from_directory('dataset/testing', target_size=(64, 64), batch_size=32)

classifier.fit_generator(trainingData, samples_per_epoch=1757, nb_epoch=10, validation_data=testingData, nb_val_samples=308)

classifier.save('model.h5')

# Output
from keras.preprocessing import image
test_image = image.load_img('samples/319b5fa.jpg', target_size=(64, 64))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis=0)

res = classifier.predict(test_image)
label_map = (trainingData.class_indices)

print(res)

i = 0
for label in label_map:
    score = res[0][i]
    score = score*100
    score = "{0:.0f}".format(score)
    print(label, "====>", score, '%')
    i = i+1

I did not use softmax in the output layer, so why does the sum of the individual predictions never go above 100%? The outputs always add up to less than 1.0, which I would have expected only with softmax, since it distributes probability across the classes.

What kind of accuracy do you get on the original test set (with just one cat or dog per image)? Is it close to your original softmax-based classifier? – dgumo, Jul 15 '18 at 06:59
I achieved 87% val_accuracy at the last training epoch, which is quite satisfactory for testing with sample dog/cat images. – Inderjit Singh Sidhu, Jul 15 '18 at 07:21
Your last two dense layers have sigmoid activations. Only the last layer should have it (it scales the output to [0, 1], which is why it never gets past that); the previous layer should use a linear/relu activation. – Imanol Luengo, Jul 15 '18 at 07:34
I tried using relu before the last sigmoid layer, but then in most cases the result is [0,0]. – Inderjit Singh Sidhu, Jul 15 '18 at 08:03
When I add a relu activation before the sigmoid, it starts returning results like [[0. 1.]], in true binary form. – Inderjit Singh Sidhu, Jul 15 '18 at 08:32

score 0 · Accepted Answer · answered Jul 15 '18 at 07:17

To be precise, what you want is multi-label classification (sometimes called multi-label multi-class classification): each image may belong to zero, one, or multiple classes; for example, one image may contain both a cat and a dog, as you mentioned. So the choice of sigmoid activation for the output Dense layer is correct, since each class is independent of the others and should get a value from 0 to 1 (which can be read as a probability).
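
As a rough illustration of that independence, here is a minimal sketch in plain Python (no Keras) that compares sigmoid and softmax on the same pair of logits. The logit values are hypothetical, chosen only to represent an image that strongly shows both classes.

```python
import math

def sigmoid(z):
    """Squash one logit into (0, 1), independently of any other logit."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for an image that strongly shows both classes: [cat, dog].
logits = [2.0, 1.5]

sigmoid_scores = [sigmoid(z) for z in logits]
softmax_scores = softmax(logits)

print("sigmoid:", sigmoid_scores, "sum =", sum(sigmoid_scores))
print("softmax:", softmax_scores, "sum =", sum(softmax_scores))
```

With sigmoid, both scores can be high at the same time and their sum can exceed 1. With softmax, the two classes compete for a total of exactly 1.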

However, you must also change the loss from 'categorical_crossentropy' to 'binary_crossentropy', since you are no longer doing single-label multi-class classification (i.e. which one of dog or cat is in this image?). Instead, you are performing a set of binary classifications (i.e. is there a cat or not? is there a dog or not?), and the appropriate loss for this scenario is 'binary_crossentropy'.
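
To make the "set of binary classifications" idea concrete, here is a small plain-Python sketch of a binary cross-entropy averaged over labels. It is a hand-written illustration, not Keras's own implementation. It assumes a multi-hot target of `[1, 1]` (cat and dog both present) and the `[cat, dog]` output order.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Average the per-label binary cross-entropies of one sample."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Multi-hot target: both cat and dog are present in the image.
target = [1, 1]

loss_actual = binary_crossentropy(target, [0.25, 0.70])  # scores reported in the question
loss_wanted = binary_crossentropy(target, [0.80, 0.75])  # scores the question hopes for

print("loss for reported scores:", loss_actual)
print("loss for hoped-for scores:", loss_wanted)
```

Each label contributes its own term, so a low cat score is penalized regardless of the dog score. This per-label pressure only appears during training if the targets contain samples labeled with both classes.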

answered Jul 15 '18 at 07:17 by today

As you suggested, I changed to loss='binary_crossentropy'. Now the results are [[0.00503693 0.9966628 ]], i.e. cat ====> 1 %, dog ====> 100 %. So the two predictions still add up to about 1 combined, rather than each class individually getting a high percentage as in the expected results. – Inderjit Singh Sidhu Jul 15 '18 at 08:16
@InderjitSinghSidhu Have you trained the network from scratch after this change? You must train it again to adjust the weights appropriately, especially the weights of the last two Dense layers. – today Jul 15 '18 at 08:20
@InderjitSinghSidhu Plus, if you don't train on samples where both a dog and a cat exist, then you might not get high accuracy on test samples of the same kind. – today Jul 15 '18 at 08:24
Yes, I retrained after changing to loss='binary_crossentropy'. Do you think the model needs to be trained on pictures containing both a dog and a cat? Then what does multi-class classification mean? I have seen clarifai.com predict a percentage for each object in a single image, and this is what I am trying to achieve: if an image contains both a dog and a cat, it should give a prediction like 80% dog, 75% cat. – Inderjit Singh Sidhu Jul 15 '18 at 08:37
@InderjitSinghSidhu If you retrained the model, you should at least get better accuracy (even if you have not trained it on images that contain both a cat and a dog). One more thing: in your code you have not rescaled your test image. You must rescale it before prediction: `test_image = test_image.astype('float32') / 255.0`. – today Jul 15 '18 at 08:45
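
The rescaling point can be sketched as a small helper that applies the same 1/255 scaling used by `ImageDataGenerator(rescale=1./255)` during training. The helper name `prepare_for_prediction` and the random stand-in image are assumptions for illustration; in the question's code the array would come from `image.img_to_array(...)`.

```python
import numpy as np

def prepare_for_prediction(pixels):
    """Turn an (H, W, 3) array of 0-255 pixel values into a rescaled batch of one."""
    batch = np.expand_dims(np.asarray(pixels), axis=0)  # add the batch dimension
    return batch.astype("float32") / 255.0              # same scaling as rescale=1./255

# Stand-in for image.img_to_array(...): a random 64x64 RGB image (hypothetical data).
fake_image = np.random.randint(0, 256, size=(64, 64, 3))
batch = prepare_for_prediction(fake_image)

print(batch.shape, batch.dtype, batch.min(), batch.max())
```

Without this step the network sees inputs up to 255 times larger than anything it was trained on, which tends to push the sigmoid outputs towards 0 or 1.
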
Hey bro, love you! `test_image = test_image.astype('float32') / 255.0` really improved my prediction results; now I feel better placed to improve this model. – Inderjit Singh Sidhu Jul 15 '18 at 09:08
One more question: my last two layers are `classifier.add(Dense(output_dim=128, activation='sigmoid'))` and `classifier.add(Dense(output_dim=2, activation='sigmoid'))`. Many have suggested using "relu" before the last Dense layer, but when I do, it gives me binary results [0,1] instead of the expected sigmoid values in the 0-1 range. What could be the problem? – Inderjit Singh Sidhu Jul 15 '18 at 09:12
@InderjitSinghSidhu Yeah, use relu in this layer: `Dense(output_dim=128, activation = 'relu')` (and not in the last layer). Are you sure? It is highly unlikely that the output of the last layer is exactly zero or one with a sigmoid activation. Are the predictions correct, at least? For example, if it predicts `[1,1]`, does the image contain both a cat and a dog? I mean, what is the accuracy of the model? – today Jul 15 '18 at 09:38
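
One way to read a multi-label output such as `[1,1]` is to apply a threshold to each score separately. The sketch below assumes a threshold of 0.5 and a hypothetical helper name, `present_classes`; neither comes from the thread.

```python
def present_classes(scores, class_indices, threshold=0.5):
    """List the class names whose sigmoid score reaches the threshold."""
    # Order the names by their output index, e.g. {"cat": 0, "dog": 1}.
    names = sorted(class_indices, key=class_indices.get)
    return [name for name, score in zip(names, scores) if score >= threshold]

class_indices = {"cat": 0, "dog": 1}

# Scores reported earlier in this thread, plus the kind of output the question hopes for.
reported = present_classes([0.00503693, 0.9966628], class_indices)
hoped_for = present_classes([0.80, 0.75], class_indices)

print(reported)
print(hoped_for)
```

Because each class is thresholded independently, the result can contain no classes, one class, or both.
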
Yes, I added relu before the last layer, and the last layer is sigmoid. The results are better, but it still does not give more than 50% for each class when a dog and a cat are both in the same image. Any suggestions? – Inderjit Singh Sidhu Jul 15 '18 at 09:49
@InderjitSinghSidhu 1) increase the number of filters in the last conv layer to 64, and 2) train on images that have both a cat and a dog in them (otherwise the network may not have a correct idea of what an image with **both** a cat and a dog looks like). – today Jul 15 '18 at 09:52
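
Training on images that contain both animals also requires targets that can mark both classes at once. With its default `class_mode='categorical'`, `flow_from_directory` derives a one-hot label from the folder name, so a folder layout alone cannot express "cat and dog". Below is a minimal sketch of building multi-hot targets instead; the file names and the `annotations` mapping are hypothetical.

```python
def multi_hot(labels, class_indices):
    """Return a 0/1 vector with a 1 at the index of every label present."""
    vector = [0] * len(class_indices)
    for label in labels:
        vector[class_indices[label]] = 1
    return vector

# Same shape as trainingData.class_indices in the question.
class_indices = {"cat": 0, "dog": 1}

# Hypothetical annotations: file name -> labels present in that image.
annotations = {
    "cat_only.jpg": ["cat"],
    "dog_only.jpg": ["dog"],
    "cat_and_dog.jpg": ["cat", "dog"],
    "empty_room.jpg": [],
}

targets = {name: multi_hot(labels, class_indices) for name, labels in annotations.items()}
for name, target in targets.items():
    print(name, target)
```

Targets of this form pair naturally with a sigmoid output layer and 'binary_crossentropy', since every position is an independent yes/no answer.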
