
When building a classifier in PyTorch, I have two options:

  1. Using nn.CrossEntropyLoss without any modification to the model
  2. Using nn.NLLLoss with F.log_softmax added as the last layer of the model

So there are two possible setups.
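
For illustration, here is a rough sketch of what I mean by the two options (the layer sizes are made up):

    import torch.nn as nn
    import torch.nn.functional as F

    # Option 1: the model returns raw logits; nn.CrossEntropyLoss applies
    # log_softmax and the negative log-likelihood internally.
    model1 = nn.Linear(784, 10)          # example sizes
    criterion1 = nn.CrossEntropyLoss()

    # Option 2: the model ends with F.log_softmax; nn.NLLLoss expects
    # log-probabilities as input.
    class Model2(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(784, 10)  # example sizes

        def forward(self, x):
            return F.log_softmax(self.fc(x), dim=1)

    model2 = Model2()
    criterion2 = nn.NLLLoss()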

Which approach should one use, and why?

Ivan

2 Answers


They're the same.

If you check the implementation, you will find that it calls nll_loss after applying log_softmax to the input.

    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)

Edit: it seems the links are now broken; here is the C++ implementation, which shows the same information.
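
A quick numerical check (just a sketch with arbitrary shapes) confirms that the two routes give the same value:

    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    logits = torch.randn(4, 10)          # raw model outputs: batch of 4, 10 classes
    target = torch.randint(0, 10, (4,))  # ground-truth class indices

    loss_ce = F.cross_entropy(logits, target)                     # option 1
    loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target)   # option 2

    print(torch.allclose(loss_ce, loss_nll))  # True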

ndrwnaguib

Cross-entropy and log-likelihood are two different interpretations of the same formula. In the log-likelihood view, we maximize the probability (more precisely, the likelihood) of the correct class, which is the same as minimizing the cross-entropy. You're right that the two terms have created some ambiguity in the literature, and there are some subtleties and caveats, so I would highly suggest going through the thread below, where this topic has been discussed rigorously. You may find it useful.

Cross-Entropy or Log-Likelihood in Output layer
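
As a tiny illustration (with made-up numbers), the cross-entropy against a one-hot target reduces to the negative log-probability of the correct class, i.e. the negative log-likelihood:

    import torch

    probs = torch.tensor([0.1, 0.7, 0.2])   # predicted class probabilities
    one_hot = torch.tensor([0., 1., 0.])    # the correct class is index 1

    cross_entropy = -(one_hot * probs.log()).sum()  # -sum_k y_k * log p_k
    neg_log_lik = -probs[1].log()                   # -log p(correct class)

    print(torch.allclose(cross_entropy, neg_log_lik))  # True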

Khalid Saifullah