
I am a little confused about a few things, and I was wondering if I could get some help.

  1. The necessity of softmax layers: I thought that in classification models the softmax layer converts the outputs into a probability for each class, which is necessary for classification. But DenseNet and other pre-made architectures don't seem to have any softmax layer; they don't even end in a dense layer. What am I missing?

  2. Global average pooling: it must have the same number of channels as the output layer, right? If so, why does the model summary show 1024 channels in the GAP layer when I add it, but only 5 in the final Dense layer?

I know this is kinda long, but I would really appreciate some help :)

1 Answer

Models like DenseNet, MobileNet, etc. all have the parameters include_top and pooling. If you set include_top=False, the top layer, which is a softmax layer, is NOT used. To make the output of the model a vector, set the parameter pooling='max'. The output of the model is then a vector that you can feed directly into a Dense layer. That Dense layer can of course have a softmax activation function, and its number of nodes should be the same as the number of classes you have.
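To make the shapes concrete, here is a rough pure-Python sketch of what happens after the convolutional base. It is not Keras code and not how the library implements these layers. The 7x7 spatial size, the random values, and the helper names (global_pool, dense, softmax) are assumptions made up for illustration; the 1024 channels and 5 classes are the numbers from the question. It suggests why the pooled layer reports 1024 values while the final Dense layer reports 5: pooling keeps one value per channel, and the Dense layer then maps those values to one score per class.

```python
import math
import random

random.seed(0)

HEIGHT, WIDTH, CHANNELS = 7, 7, 1024  # assumed shape of the last feature map
NUM_CLASSES = 5                       # size of the final Dense layer in the question

# Random stand-in for the output of the convolutional base, indexed [y][x][channel].
feature_map = [[[random.random() for _ in range(CHANNELS)]
                for _ in range(WIDTH)]
               for _ in range(HEIGHT)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def global_pool(fmap, reduce_fn):
    """Collapse height and width, leaving one value per channel."""
    height, width, channels = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [reduce_fn(fmap[y][x][c] for y in range(height) for x in range(width))
            for c in range(channels)]


def dense(vector, weights, biases):
    """Fully connected layer: one output per row of weights."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Turn raw scores into positive values that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


pooled_max = global_pool(feature_map, max)   # analogous to pooling='max'
pooled_avg = global_pool(feature_map, mean)  # analogous to pooling='avg'

# Untrained, randomly initialised Dense layer: 1024 inputs -> 5 class scores.
weights = [[random.uniform(-0.05, 0.05) for _ in range(CHANNELS)]
           for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

probabilities = softmax(dense(pooled_max, weights, biases))

print(len(pooled_max), len(pooled_avg))  # 1024 1024
print(len(probabilities))                # 5
```

So the pooled vector does not need to match the number of classes. Its length follows the number of channels produced by the base network, and the Dense layer on top is what reduces it to the class count.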

Gerry P