The Caffe documentation for the softmax_loss_layer.hpp file seems to be targeted at classification tasks rather than semantic segmentation. However, I have seen this layer used for the latter.
- What would be the dimensions of the input blobs and output blob in the case where you're classifying each pixel (semantic segmentation)?
- More importantly, how are the equations for calculating the loss applied to these blobs? That is, how are the matrices/blobs arranged, and what is the equation for the final "loss value" that is output?
Thank you.
Edit: I have referred to this page to understand the concepts behind the loss equation, but I don't know how it is applied to the blobs (which axis, etc.): http://cs231n.github.io/linear-classify/
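
To make the second question concrete, here is a minimal pure-Python sketch of what I am guessing the layer does per pixel, based on the cross-entropy formula from the linked page. The blob shapes (scores as N x C x H x W, labels as N x H x W), the softmax being taken over the channel axis, and the averaging over all pixels are my assumptions, not something I have confirmed in the Caffe source. The function name `softmax_loss` is made up for illustration.

```python
import math

def softmax_loss(scores, labels):
    """Mean per-pixel softmax cross-entropy (assumed behaviour, not Caffe's code).

    scores: nested lists shaped [N][C][H][W] with raw class scores.
    labels: nested lists shaped [N][H][W] with integer class indices.
    """
    total, count = 0.0, 0
    for n, sample in enumerate(scores):
        num_classes = len(sample)
        height, width = len(sample[0]), len(sample[0][0])
        for h in range(height):
            for w in range(width):
                # Collect the C scores for this pixel (assumed: channel axis).
                logits = [sample[c][h][w] for c in range(num_classes)]
                # Numerically stable log-sum-exp over the classes.
                m = max(logits)
                log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
                # -log(softmax probability of the true class).
                total += log_sum - logits[labels[n][h][w]]
                count += 1
    # Assumed normalization: average over every pixel in the batch.
    return total / count

# Toy input: 1 image, 2 classes, 1 x 2 pixels.
scores = [[[[0.0, 2.0]], [[0.0, 0.0]]]]
labels = [[[0, 1]]]
loss = softmax_loss(scores, labels)
print(loss)
```

If this guess is right, the output blob would be a single scalar. What I would like confirmed is whether the axis and the normalization really work this way.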