
When generating adversarial examples, one typically uses the logits as the output of the neural network and then trains the network with cross-entropy loss.

However, I found that the cleverhans tutorial uses log_softmax, then converts the PyTorch model to a TensorFlow model, and finally trains the model.

https://github.com/tensorflow/cleverhans/blob/master/cleverhans_tutorials/mnist_tutorial_pytorch.py#L65

I am wondering if anyone knows whether using logits instead of log_softmax makes any difference?
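For reference, a small check (my own sketch, not from the tutorial) shows that log_softmax only shifts each row of the logits by a per-row constant (the log-sum-exp), so the predicted classes are identical either way:

```python
import torch

torch.manual_seed(1)
logits = torch.randn(5, 10)  # hypothetical model outputs: 5 examples, 10 classes
log_probs = torch.log_softmax(logits, dim=1)

# log_softmax(x) = x - logsumexp(x) per row, a monotonic shift,
# so argmax over classes is unchanged.
print(torch.equal(logits.argmax(dim=1), log_probs.argmax(dim=1)))  # True
```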

Denny Law

1 Answer


As you said, when we get logits from a neural network, we train it using CrossEntropyLoss. The alternative is to compute the log_softmax and then train the network by minimizing the negative log-likelihood (NLLLoss).

Both approaches are equivalent when training a network for classification. However, if you have a different objective function, you may find one of the two formulations more convenient in your scenario.
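A minimal sketch of the equivalence (the tensors here are made up for illustration): PyTorch's CrossEntropyLoss is defined as log_softmax followed by NLLLoss, so the two losses agree up to floating-point error:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical raw model outputs for a batch of 4 examples, 3 classes.
logits = torch.randn(4, 3)
targets = torch.tensor([0, 2, 1, 2])

# Approach 1: CrossEntropyLoss applied directly to the logits.
ce = nn.CrossEntropyLoss()(logits, targets)

# Approach 2: log_softmax first, then the negative log-likelihood.
nll = nn.NLLLoss()(torch.log_softmax(logits, dim=1), targets)

# The two losses are numerically identical.
print(torch.allclose(ce, nll))  # True
```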


Wasi Ahmad
  • I agree with what you said, yet my question is whether using logits vs. softmax makes a difference for adversarial attack performance. For example, for the C&W attack it is highly recommended to use logits instead of softmax. I assume there should be no big difference for some basic attacks, yet I have no idea whether this could differ for other attacks. – Denny Law Dec 13 '19 at 18:36
  • @DennyLaw From the perspective of training there's no difference (other than numerical stability) between CrossEntropyLoss and log_softmax followed by NLLLoss. If you're talking about black-box attacks, then I could see why you might want to output softmaxed values, since softmax isn't invertible and thus "hides" some information about your activations. – jodag Dec 13 '19 at 20:37
  • I appreciate that you pointed out "numerical stability". To the best of my knowledge, log_softmax may suffer from it, so I am curious what other people think about it, and I also hope someone can confirm whether there is a numerical stability issue with the cleverhans tutorial. – Denny Law Dec 13 '19 at 21:24