Below is the last layer of my training net:
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "final"
  bottom: "label"
  top: "loss"
  loss_param {
    ignore_label: 255
    normalization: VALID
  }
}
Note that this is a SoftmaxWithLoss layer. Since its computation has the form -log(probability), and the probability of the ground-truth class lies in (0, 1], the loss should always be non-negative. It is therefore strange that the loss can become negative, as shown below at iteration 80.
I0404 23:32:49.400624 6903 solver.cpp:228] Iteration 79, loss = 0.167006
I0404 23:32:49.400806 6903 solver.cpp:244] Train net output #0: loss = 0.167008 (* 1 = 0.167008 loss)
I0404 23:32:49.400825 6903 sgd_solver.cpp:106] Iteration 79, lr = 0.0001
I0404 23:33:25.660655 6903 solver.cpp:228] Iteration 80, loss = -1.54972e-06
I0404 23:33:25.660845 6903 solver.cpp:244] Train net output #0: loss = 0 (* 1 = 0 loss)
I0404 23:33:25.660862 6903 sgd_solver.cpp:106] Iteration 80, lr = 0.0001
I0404 23:34:00.451464 6903 solver.cpp:228] Iteration 81, loss = 1.89034
I0404 23:34:00.451661 6903 solver.cpp:244] Train net output #0: loss = 1.89034 (* 1 = 1.89034 loss)
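For reference, here is my own numpy sanity check of the per-pixel loss formula (just a sketch of the math, not Caffe's actual code). Since the softmax probability of the ground-truth class lies in (0, 1], its negative log should never go below zero:

import numpy as np

# Minimal sketch of per-pixel softmax cross-entropy (my own check,
# not Caffe's implementation): the result should be >= 0 by construction.
def softmax_loss(scores, label):
    # scores: raw logits for one pixel; label: ground-truth class index
    scores = scores - scores.max()                 # shift for numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()  # softmax probabilities
    return -np.log(probs[label])                   # probs[label] in (0, 1] => loss >= 0

scores = np.random.randn(21)                       # 21 classes, as in my net
print(softmax_loss(scores, label=3))               # always prints a non-negative value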
Can anyone explain this? How can this happen? Thank you very much!
PS: The task here is semantic segmentation. There are 20 object classes plus background, so 21 classes in total, and the labels range from 0 to 20. The extra label 255 is ignored, as specified in the SoftmaxWithLoss definition at the beginning of this post.
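For what it's worth, here is a small numpy sketch of how I understand ignore_label together with normalization: VALID to behave (this is my assumption about the semantics, not Caffe's exact source): the per-pixel losses are summed over the non-ignored pixels only and divided by their count.

import numpy as np

# Assumed behavior of ignore_label + normalization: VALID (my sketch):
# average the per-pixel losses over pixels whose label is not ignored.
def normalized_loss(per_pixel_loss, labels, ignore_label=255):
    valid = labels != ignore_label
    count = valid.sum()
    if count == 0:
        return 0.0                           # no valid pixels => loss reported as 0
    return per_pixel_loss[valid].sum() / count

labels = np.array([0, 5, 255, 20, 255])      # pixels labeled 255 are ignored
losses = np.array([0.2, 1.1, 9.9, 0.4, 3.3])
print(normalized_loss(losses, labels))       # averages only over the 3 valid pixels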