How do I train the VGG network when multiple classes are present in one training example?

Question

I recently switched to TFlearn to measure my network's accuracy for classifying images and to get state-of-the-art results. I am using this exact file from TFlearn, except for the dataset. I wanted to reproduce the accuracy for the VOC2007 dataset, so I downloaded all images and ground truth and wrote a function that creates a 4D tensor containing all images and a 2D tensor containing all class indices. Their shapes are [?, 224, 224, 3] and [?, 20] respectively.

Now I noticed that the class indices are not one-hot labels: several classes can be present in one image. Since TFlearn allows more than one class to be present, the network performs really badly (accuracy ~30%, and yes, I changed the number of output classes).

I wonder how to get around this problem. Should I only allow one class per image? But then, if there are two classes in an image and the network predicts the second class, that would be a correct detection that I would wrongly count as an error. Is there any option that I am missing? I don't see a "one-hot" option or anything similar (like in the oxflowers dataset).

Thanks for your help!
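
To make the evaluation concern above concrete, here is a minimal pure-Python sketch. It assumes each label row is a multi-hot vector (a 1 for every class present in the image), which is one reading of the [?, 20] tensor described in the question. The helper names `to_multi_hot`, `top1_hit` and `per_label_accuracy` are hypothetical and not part of TFlearn, and the scores are made up for illustration.

```python
def to_multi_hot(class_indices, num_classes):
    """Build a multi-hot row: 1 for each class present, 0 elsewhere."""
    row = [0] * num_classes
    for idx in class_indices:
        row[idx] = 1
    return row


def top1_hit(scores, multi_hot):
    """Count a prediction as correct if the highest-scoring class is
    any one of the classes present in the image."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return multi_hot[best] == 1


def per_label_accuracy(scores, multi_hot, threshold=0.5):
    """Threshold every class score independently and report the
    fraction of classes whose present/absent decision is right."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, t in zip(preds, multi_hot) if p == t)
    return correct / len(multi_hot)


# Hypothetical image containing classes 1 and 3 out of 5 classes.
label = to_multi_hot([1, 3], num_classes=5)

# Hypothetical per-class scores: class 3 is found, class 1 is missed.
scores = [0.1, 0.4, 0.2, 0.9, 0.1]

hit = top1_hit(scores, label)
acc = per_label_accuracy(scores, label)
```

With these two measures, predicting either of the present classes as the top class counts as a hit, while the per-label accuracy still penalizes the class that was missed.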

What is the loss function you are using for your multi-label problem? — Vijay Mariappan, Jun 25 '17 at 19:25
I am using the standard `loss='categorical_crossentropy'` as the loss function. — Martin, Jun 26 '17 at 09:18
Are you using the 'sigmoid' activation for the last layer, since it's a multi-label problem? — Vijay Mariappan, Jun 26 '17 at 15:49
Yes, I did, but I also used softmax in order to get a distributed output. Either way I ended up with bad results. — Martin, Jul 05 '17 at 09:05
You can't use softmax for a `multilabel` problem: multiple outputs need to be able to output 1 at the same time. — Vijay Mariappan, Jul 05 '17 at 09:08
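
The point made in the comments can be illustrated with a small pure-Python sketch. It is not TFlearn code: the logits and targets are invented, and the functions are plain reimplementations for illustration. It shows that softmax outputs must sum to 1, so two present classes can each get at most about 0.5, while independent sigmoids can push both close to 1. In TFlearn this would presumably correspond to a sigmoid final layer combined with a binary cross-entropy loss instead of `categorical_crossentropy`; that mapping is an assumption here and should be checked against the TFlearn documentation.

```python
import math


def softmax(logits):
    """Normalize logits into a distribution that sums to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Squash each logit independently into (0, 1)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def binary_crossentropy(probs, targets, eps=1e-7):
    """Mean of the per-class binary cross-entropy terms."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


# Hypothetical logits for 4 classes; classes 0 and 2 are both present.
logits = [6.0, -6.0, 6.0, -6.0]
targets = [1, 0, 1, 0]

soft = softmax(logits)  # the two present classes split the probability mass
sig = sigmoid(logits)   # each present class can approach 1 on its own

loss_soft = binary_crossentropy(soft, targets)
loss_sig = binary_crossentropy(sig, targets)
```

For the same logits, the sigmoid outputs give a much lower per-class loss, because softmax cannot assign a high probability to both present classes at once.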

0 Answers