
The Inception v3 model is shown in this image:

[Image: Inception v3 Model]

The image is from this blog-post:

https://research.googleblog.com/2016/03/train-your-own-image-classifier-with.html

It seems that there are two Softmax classification outputs. Why is that?

Which one is used in the TensorFlow example as the output tensor with the name 'softmax:0' in this file?

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/models/image/imagenet/classify_image.py

The academic paper for the Inception v3 model doesn't seem to have this image of the Inception model:

http://arxiv.org/pdf/1512.00567v3.pdf

I'm trying to understand why the network has these two branches, each ending in what appears to be its own softmax output.

Thanks for any clarification!

OmG
questiondude

1 Answer


Section 4 of the paper you cite is about auxiliary classifiers. These are classifiers added to the lower levels of the network; they improve training by mitigating the vanishing-gradient problem and speeding up convergence. When running inference on a trained network, you should use the main classifier, called softmax:0 in the model, and NOT the auxiliary classifier, called auxiliary_softmax:0.
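To make the distinction concrete, here is a minimal, dependency-free sketch of the idea rather than the actual Inception v3 or TensorFlow code. It assumes a toy network that produces two sets of logits, one from the main head and one from the auxiliary head. The function names, the logit values, and the `aux_weight` of 0.4 are all made up for illustration. During training both heads contribute to the loss; at inference only the main head is consulted.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class index."""
    return -math.log(probs[label])


def training_loss(main_logits, aux_logits, label, aux_weight=0.4):
    """Training: both heads are penalized.

    aux_weight is an assumed, illustrative value, not taken from the model.
    """
    main = cross_entropy(softmax(main_logits), label)
    aux = cross_entropy(softmax(aux_logits), label)
    return main + aux_weight * aux


def predict(main_logits, aux_logits=None):
    """Inference: only the main head is used; aux_logits is ignored."""
    probs = softmax(main_logits)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical logits for a 3-class problem.
main_logits = [0.1, 2.0, -1.0]
aux_logits = [3.0, 0.0, 0.0]

loss = training_loss(main_logits, aux_logits, label=1)
predicted_class = predict(main_logits, aux_logits)
```

In this toy example the auxiliary head disagrees with the main head, yet `predict` returns the main head's choice. That mirrors reading softmax:0 rather than auxiliary_softmax:0 from the trained graph.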

keveman
  • I had read section 4 in that paper just prior to asking the question and did wonder if that was the explanation. But I got the impression from section 4 that it wasn't very useful so I assumed it wasn't included in the Inception model. Thanks for clearing it up! – questiondude Sep 07 '16 at 06:43