
I have an issue with image classification in Caffe. I use the ImageNet model (from the Caffe tutorial) to classify data I created, but I always get the same classification result (the same class, i.e. class 3). This is how I proceed:

I use Caffe for Windows with Python as the interface.

(1) I gather the data. My sample images (training & testing) have a size of 5x5x3 (RGB), uint8, so their pixel values range from 0 to 255.
(2) I resize them to the size ImageNet requires: 256x256x3. For this I use the resize function in MATLAB (nearest-neighbor interpolation).
(3) I create a LevelDB and image_mean (see the sketch after this list).
(4) I train my network (3000 iterations). The only parameters I change in the ImageNet definition are the paths to the mean image and the LevelDBs.
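
For reference, a minimal sketch of how step (3) can be done with Caffe's convert_imageset and compute_image_mean command-line tools, called here from Python; the tools directory, image folder, list file, and database name are placeholders (not my actual paths), and --shuffle is the flag that comes up again in the answer below:

import subprocess

# placeholder locations -- adjust to the actual layout (add .exe on Windows if needed)
tools = 'C:/Caffe/caffe-windows-master/build/tools'        # Caffe command-line tools
root = 'Z:/DeepLearning/S1S2/Stockholm/images/'            # folder with the resized 256x256 images
listfile = 'Z:/DeepLearning/S1S2/Stockholm/train.txt'      # text file with "filename label" lines
db = 'Z:/DeepLearning/S1S2/Stockholm/S1S2train256_leveldb' # hypothetical LevelDB name
meanfile = 'Z:/DeepLearning/S1S2/Stockholm/S1S2train256.binaryproto'

# build the LevelDB; --shuffle randomizes the order of the training samples
subprocess.check_call([tools + '/convert_imageset', '--backend=leveldb', '--shuffle',
                       root, listfile, db])

# compute the per-pixel mean image over the database
subprocess.check_call([tools + '/compute_image_mean', '--backend=leveldb', db, meanfile])

The results I get: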

I0428 12:38:04.350100  3236 solver.cpp:245]     Train net output #0: loss = 1.91102 (* 1 = 1.91102 loss)
I0428 12:38:04.350100  3236 sgd_solver.cpp:106] Iteration 2900, lr = 0.0001
I0428 12:38:30.353361  3236 solver.cpp:229] Iteration 2920, loss = 2.18008
I0428 12:38:30.353361  3236 solver.cpp:245]     Train net output #0: loss = 2.18008 (* 1 = 2.18008 loss)
I0428 12:38:30.353361  3236 sgd_solver.cpp:106] Iteration 2920, lr = 0.0001
I0428 12:38:56.351630  3236 solver.cpp:229] Iteration 2940, loss = 1.90925
I0428 12:38:56.351630  3236 solver.cpp:245]     Train net output #0: loss = 1.90925 (* 1 = 1.90925 loss)
I0428 12:38:56.351630  3236 sgd_solver.cpp:106] Iteration 2940, lr = 0.0001
I0428 12:39:22.341891  3236 solver.cpp:229] Iteration 2960, loss = 1.98917
I0428 12:39:22.341891  3236 solver.cpp:245]     Train net output #0: loss = 1.98917 (* 1 = 1.98917 loss)
I0428 12:39:22.341891  3236 sgd_solver.cpp:106] Iteration 2960, lr = 0.0001
I0428 12:39:48.334151  3236 solver.cpp:229] Iteration 2980, loss = 2.45919
I0428 12:39:48.334151  3236 solver.cpp:245]     Train net output #0: loss = 2.45919 (* 1 = 2.45919 loss)
I0428 12:39:48.334151  3236 sgd_solver.cpp:106] Iteration 2980, lr = 0.0001
I0428 12:40:13.040398  3236 solver.cpp:456] Snapshotting to binary proto file Z:/DeepLearning/S1S2/Stockholm/models_iter_3000.caffemodel
I0428 12:40:15.080418  3236 sgd_solver.cpp:273] Snapshotting solver state to binary proto file Z:/DeepLearning/S1S2/Stockholm/models_iter_3000.solverstate
I0428 12:40:15.820426  3236 solver.cpp:318] Iteration 3000, loss = 2.08741
I0428 12:40:15.820426  3236 solver.cpp:338] Iteration 3000, Testing net (#0)
I0428 12:41:50.398375  3236 solver.cpp:406]     Test net output #0: accuracy = 0.11914
I0428 12:41:50.398375  3236 solver.cpp:406]     Test net output #1: loss = 2.71476 (* 1 = 2.71476 loss)
I0428 12:41:50.398375  3236 solver.cpp:323] Optimization Done.
I0428 12:41:50.398375  3236 caffe.cpp:222] Optimization Done.

(5) I run the following code in Python to classify a single image:

# set up Python environment: numpy for numerical routines, and matplotlib for plotting
import numpy as np
import matplotlib.pyplot as plt

# set display defaults
plt.rcParams['figure.figsize'] = (10, 10)        # large images
plt.rcParams['image.interpolation'] = 'nearest'  # don't interpolate: show square pixels
plt.rcParams['image.cmap'] = 'gray'  # use grayscale output rather than a (potentially misleading) color heatmap

# The caffe module needs to be on the Python path;
#  we'll add it here explicitly.
import sys
caffe_root = '../'  # this file should be run from {caffe_root}/examples (otherwise change this line)
sys.path.insert(0, caffe_root + 'python')

import caffe
# If you get "No module named _caffe", either you have not built pycaffe or you have the wrong path.


caffe.set_mode_cpu()

model_def = 'C:/Caffe/caffe-windows-master/models/bvlc_reference_caffenet/deploy.prototxt'
model_weights = 'Z:/DeepLearning/S1S2/Stockholm/models_iter_3000.caffemodel'

net = caffe.Net(model_def,      # defines the structure of the model
                model_weights,  # contains the trained weights
                caffe.TEST)     # use test mode (e.g., don't perform dropout)

# load the mean image (binaryproto) and convert it to a .npy file
blob = caffe.proto.caffe_pb2.BlobProto()
data = open('Z:/DeepLearning/S1S2/Stockholm/S1S2train256.binaryproto', "rb").read()
blob.ParseFromString(data)
nparray = caffe.io.blobproto_to_array(blob)
f = open('Z:/DeepLearning/PythonCalssification/imgmean.npy', "wb")
np.save(f, nparray)
f.close()


# load the dataset mean image (converted above) for subtraction
mu1 = np.load('Z:/DeepLearning/PythonCalssification/imgmean.npy')
mu1 = mu1.squeeze()
mu = mu1.mean(1).mean(1)  # average over pixels to obtain the mean (BGR) pixel values
print 'mean-subtracted values:', zip('BGR', mu)
print 'mean shape: ',mu1.shape
print 'data shape: ',net.blobs['data'].data.shape

# create transformer for the input called 'data'
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})

transformer.set_transpose('data', (2,0,1))  # move image channels to outermost dimension
transformer.set_mean('data', mu)            # subtract the dataset-mean value in each channel
transformer.set_raw_scale('data', 255)      # rescale from [0, 1] to [0, 255]
transformer.set_channel_swap('data', (2,1,0))  # swap channels from RGB to BGR

# set the size of the input (we can skip this if we're happy
#  with the default; we can also change it later, e.g., for different batch sizes)
net.blobs['data'].reshape(50,        # batch size
                          3,         # 3-channel (BGR) images
                          227, 227)  # image size is 227x227

#load image
image = caffe.io.load_image('Z:/DeepLearning/PythonCalssification/380.tiff')
transformed_image = transformer.preprocess('data', image)
#plt.imshow(image)

# copy the image data into the memory allocated for the net
net.blobs['data'].data[...] = transformed_image

### perform classification
output = net.forward()

output_prob = output['prob'][0]  # the output probability vector for the first image in the batch

print 'predicted class is:', output_prob.argmax()
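
For completeness, here is a small diagnostic sketch (not part of my original script; top_inds is just a helper name) that prints the top-5 probabilities instead of only the argmax:

# inspect the top-5 predictions instead of only the argmax
top_inds = output_prob.argsort()[::-1][:5]  # indices of the five highest probabilities
print 'top-5 (probability, class index) pairs:'
print zip(output_prob[top_inds], top_inds)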

It does not matter which input image I use; I always get class "3" as the classification result. Here is a sample image I train/classify:
[sample training image]
I would be very happy if someone has an idea what is wrong. Thanks in advance!

Mr M

1 Answer


If you always get the same class, it means that the NN was not properly trained.

  • Make sure that the training set is balanced. When a classifier always predicts the same class, it is often because one class is over-represented compared to the others. For example, say you have two classes, the first represented by 95 instances and the second by 5. If the classifier classifies everything as belonging to the first class, then it is already 95% accurate. (A quick balance check is sketched after this list.)
  • One obvious thing is that you should normalize the input images (image / 255.0 - 0.5); this centers the input and reduces the standard deviation.
  • Also, make sure that you have at least 4 times more samples in your training set than weights in your NN.
  • Last but not least, make sure that the training set is properly shuffled.
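
As a minimal sketch of the balance, oversampling, and shuffling points above (added for illustration; it assumes the training list is a plain text file of "filename label" lines as used by convert_imageset, and the paths are placeholders):

import random
from collections import Counter

# count how many samples each class has in the (hypothetical) training list
lines = open('Z:/DeepLearning/S1S2/Stockholm/train.txt').read().splitlines()
labels = [int(line.split()[-1]) for line in lines]
counts = Counter(labels)
print 'samples per class:', dict(counts)

# naive oversampling: duplicate lines of the small classes until every class
# roughly matches the biggest one
target = max(counts.values())
balanced = list(lines)
for label, n in counts.items():
    extra = [l for l in lines if int(l.split()[-1]) == label]
    while n < target:
        balanced.append(random.choice(extra))
        n += 1

random.shuffle(balanced)  # shuffle so no class appears in long runs
open('Z:/DeepLearning/S1S2/Stockholm/train_balanced.txt', 'w').write('\n'.join(balanced) + '\n')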
FiReTiTi
  • I will try to go through your suggestions step by step: 1) I have 8 classes, represented by the following sample sizes: class 1: 918, class 2: 897, class 3: 922, class 4: 799, class 5: 69, class 6: 277, class 7: 718, class 8: 691. – Mr M May 02 '16 at 08:48
  • 2) As far as I understood it, ImageNet requires an image normalization which uses the image/pixel mean. Therefore the following steps are performed in the Python code above: transformer.set_transpose('data', (2,0,1)) # move image channels to outermost dimension; transformer.set_mean('data', mu) # subtract the dataset-mean value in each channel; transformer.set_raw_scale('data', 255) # rescale from [0, 1] to [0, 255]; transformer.set_channel_swap('data', (2,1,0)) # swap channels from RGB to BGR. – Mr M May 02 '16 at 08:51
  • In these steps the image mean is subtracted, the image is scaled to 0-255, and the channels are swapped since they are loaded in the inverse order; finally, the transpose operation is performed (I am not 100% sure why this is needed, though). – Mr M May 02 '16 at 08:51
  • 3) I have 5291 training samples and 1351 validation samples. I guess that is enough? 4) I use the caffenet model (based on ImageNet). I do not see a shuffle incorporated; I simply assumed that the model definition would work for my purpose. What is the aim of shuffling? – Mr M May 02 '16 at 08:52
  • 1/ You have to do something at least for class 6. Check if Caffe handles instance weights, otherwise you have to do over-sampling. 2/ It is highly recommended to have centered data (average 0) with a small standard deviation. 3/ No, it depends on the number of weights that you have to train. You must check it. In my case, I train a CNN with 12K weights, so your dataset would be really too small. 4/ Then shuffle it yourself before training; it will keep the NN from diverging/converging to a single class. – FiReTiTi May 02 '16 at 09:15
  • I think that adding --shuffle to the convert_imageset command did the trick. The accuracy rises to ~50% (from 11%)... but after 4000 iterations the loss suddenly jumps from 1-2 to 87.3. Any idea why? – Mr M May 03 '16 at 08:32
  • That jump of the loss to 87.3 seems to appear randomly, also after only 140 or 1000 iterations. I tried to decrease the batch size from 256 to 128, but things do not seem to change... – Mr M May 03 '16 at 08:40
  • Do not forget the other steps, they are as important as the shuffling, particularly the 3rd point. I had the same issue of loss jumps with Theano, but I never got a clear explanation for such behavior. I modified the NN architecture until this behavior disappeared. – FiReTiTi May 03 '16 at 17:17
  • As weird as it sounds: I restarted my PC and now I do not get the loss of 87.3. My accuracy now goes up to 60% (loss ~0.9), but then drops to 17% again after iteration 5000. By oversampling, do you mean that I should repeatedly insert my samples from class 6 (and 5?) so that they have a number of samples closer to the other classes? Thank you for your help! – Mr M May 04 '16 at 08:01
  • Yes, you duplicate instances from small classes in order to match the number of instances in the biggest class. – FiReTiTi May 04 '16 at 08:13
  • If you are already augmenting the data, then it might be convenient to add rotated and scaled variations of your training image set. This will give the network rotation and scale invariance, making it more robust during classification. – Salvador Medina May 07 '16 at 10:11
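
A minimal sketch of that kind of rotation/scale augmentation, assuming scikit-image is available; the angles, scales, and file names are arbitrary placeholders:

import numpy as np
from skimage import io, transform

# hypothetical input image (one of the resized 256x256 training samples)
img = io.imread('Z:/DeepLearning/PythonCalssification/380.tiff')

augmented = []
for angle in (90, 180, 270):
    # rotate around the image center; preserve_range keeps the 0-255 scale
    rotated = transform.rotate(img, angle, preserve_range=True).astype(np.uint8)
    augmented.append(rotated)

for scale in (0.8, 1.2):
    # scale the image, then resize back to the 256x256 input size
    h, w = img.shape[0], img.shape[1]
    scaled = transform.resize(img, (int(h * scale), int(w * scale)), preserve_range=True)
    scaled = transform.resize(scaled, (256, 256), preserve_range=True).astype(np.uint8)
    augmented.append(scaled)

# write the variations next to the original (placeholder output names)
for i, aug in enumerate(augmented):
    io.imsave('Z:/DeepLearning/PythonCalssification/380_aug%d.tiff' % i, aug)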