
Below is my last layer in training net:

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "final"
  bottom: "label"
  top: "loss"
  loss_param {
    ignore_label: 255
    normalization: VALID
  }
}

Note that I use a softmax loss layer. Since it computes −log(probability), it seems strange that the loss can be negative, as shown below (iteration 80).

I0404 23:32:49.400624  6903 solver.cpp:228] Iteration 79, loss = 0.167006
I0404 23:32:49.400806  6903 solver.cpp:244]     Train net output #0: loss = 0.167008 (* 1 = 0.167008 loss)
I0404 23:32:49.400825  6903 sgd_solver.cpp:106] Iteration 79, lr = 0.0001
I0404 23:33:25.660655  6903 solver.cpp:228] Iteration 80, loss = -1.54972e-06
I0404 23:33:25.660845  6903 solver.cpp:244]     Train net output #0: loss = 0 (* 1 = 0 loss)
I0404 23:33:25.660862  6903 sgd_solver.cpp:106] Iteration 80, lr = 0.0001
I0404 23:34:00.451464  6903 solver.cpp:228] Iteration 81, loss = 1.89034
I0404 23:34:00.451661  6903 solver.cpp:244]     Train net output #0: loss = 1.89034 (* 1 = 1.89034 loss) 

Can anyone explain this to me? How can this happen? Thank you very much!

PS: The task here is semantic segmentation. There are 20 object classes plus background, so 21 classes in total. The labels range from 0 to 20. The extra label 255 is ignored, as can be seen in the SoftmaxWithLoss definition at the beginning of this post.

huangh12
  • Welcome to StackOverflow. Please read and follow the posting guidelines in the help documentation. [Minimal, complete, verifiable example](http://stackoverflow.com/help/mcve) applies here. We cannot effectively help you until you post your MCVE code and accurately describe the problem. – Prune Apr 04 '17 at 18:48
    That said, it's still hard to guess without seeing the values used, or enough output to see the variability in loss function. It's vaguely possible that this is actually a loss very close to 0, but pushed over the edge by round-off error. I suspect more that there's a computational error within the model that runs that probability value over the theoretical 1.0 boundary. – Prune Apr 04 '17 at 18:50

1 Answer


Is Caffe running on the GPU or the CPU? Print out the prob_data that you get after the softmax operation:

// find the next line in your cpu or gpu Forward function
softmax_layer_->Forward(softmax_bottom_vec_, softmax_top_vec_);
// make sure the blob's data is synced to the CPU before reading it
const Dtype* prob_data = prob_.cpu_data();

// dump every probability; any value outside [0, 1] points to a bug
for (int i = 0; i < prob_.count(); i++) {
    printf("%f ", prob_data[i]);
}
printf("\n");
sbond