
When building a classifier in PyTorch, I have two options:

  1. Using nn.CrossEntropyLoss without any modification to the model
  2. Using nn.NLLLoss with F.log_softmax added as the last layer of the model

So there are two possible setups.
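
For illustration, here is a rough sketch of what I mean by the two options (the layer sizes are made up):

    import torch.nn as nn
    import torch.nn.functional as F

    # Option 1: the model returns raw logits; nn.CrossEntropyLoss applies
    # log_softmax and the negative log-likelihood internally.
    model1 = nn.Linear(784, 10)          # example sizes
    criterion1 = nn.CrossEntropyLoss()

    # Option 2: the model ends with F.log_softmax; nn.NLLLoss expects
    # log-probabilities as input.
    class Model2(nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(784, 10)  # example sizes

        def forward(self, x):
            return F.log_softmax(self.fc(x), dim=1)

    model2 = Model2()
    criterion2 = nn.NLLLoss()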

Which approach should one use, and why?

Ivan

2 Answers


They're the same.

If you check the implementation, you will find that it calls nll_loss after applying log_softmax to the input.

    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)

Edit: it seems the links are now broken; here is the C++ implementation, which shows the same information.
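
A quick numerical check (just a sketch with arbitrary shapes) confirms that the two routes give the same value:

    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    logits = torch.randn(4, 10)          # raw model outputs: batch of 4, 10 classes
    target = torch.randint(0, 10, (4,))  # ground-truth class indices

    loss_ce = F.cross_entropy(logits, target)                     # option 1
    loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target)   # option 2

    print(torch.allclose(loss_ce, loss_nll))  # True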

ndrwnaguib

Cross-entropy and log-likelihood are two different interpretations of the same formula. In the log-likelihood view, we maximize the probability (more precisely, the likelihood) of the correct class, which is the same as minimizing the cross-entropy. You're right that the two terms have created some ambiguity in the literature, and there are some subtleties and caveats, so I would highly suggest going through the thread below, where this topic has been discussed rigorously. You may find it useful.

Cross-Entropy or Log-Likelihood in Output layer
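
As a tiny illustration (with made-up numbers), the cross-entropy against a one-hot target reduces to the negative log-probability of the correct class, i.e. the negative log-likelihood:

    import torch

    probs = torch.tensor([0.1, 0.7, 0.2])   # predicted class probabilities
    one_hot = torch.tensor([0., 1., 0.])    # the correct class is index 1

    cross_entropy = -(one_hot * probs.log()).sum()  # -sum_k y_k * log p_k
    neg_log_lik = -probs[1].log()                   # -log p(correct class)

    print(torch.allclose(cross_entropy, neg_log_lik))  # True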

Khalid Saifullah