
Since the source code of tf.nn.softmax_cross_entropy_with_logits in gen_nn_ops is hidden, could anyone explain how TensorFlow computes the cross entropy after the softmax? I mean, because of finite precision the softmax can output exact zeros, which would cause a NaN problem in the cross entropy. Does TensorFlow clip the softmax output to bound it away from 0?
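
To make the concern concrete, here is a rough TF 1.x sketch of the naive formula I have in mind (hand-rolled, not the library op):

import tensorflow as tf

logits = tf.constant([10.0, 50.0, 100.0, 200.0])
labels = tf.constant([1.0, 0.0, 0.0, 0.0])
probs = tf.nn.softmax(logits)                         # probs ~ [0., 0., 0., 1.]; the first entry is exactly 0
naive_xent = -tf.reduce_sum(labels * tf.log(probs))   # 1 * log(0) -> -inf, 0 * log(0) -> nan

with tf.Session() as sess:
    print(sess.run(naive_xent))                       # inf or nan instead of a finite loss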

Nan

1 Answer


The implementation of tf.nn.softmax_cross_entropy_with_logits goes down to native C++ code (there is also an XLA implementation). The logits are not bounded, and an output of exactly 0 is possible when one of the logits is much bigger than the others. Example:

>>> session.run(tf.nn.softmax([10.0, 50.0, 100.0, 200.0]))
array([ 0.,  0.,  0.,  1.], dtype=float32)
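
The fused op avoids the NaN not by clipping but by working on the logits directly: the cross entropy is computed from log-softmax in its log-sum-exp form, so the underflowed softmax value is never passed to a log. A rough sketch of that formulation (an illustration of the standard technique, not the actual kernel code):

import tensorflow as tf

logits = tf.constant([10.0, 50.0, 100.0, 200.0])
labels = tf.constant([1.0, 0.0, 0.0, 0.0])

# log-softmax = logits - logsumexp(logits); shifting by the max keeps every
# exp() argument <= 0, so nothing overflows and log() never sees a hard zero
shifted = logits - tf.reduce_max(logits)
log_softmax = shifted - tf.log(tf.reduce_sum(tf.exp(shifted)))
stable_xent = -tf.reduce_sum(labels * log_softmax)

with tf.Session() as sess:
    print(sess.run(stable_xent))  # 190.0 -- finite, even though softmax(logits)[0] == 0

The gradient of this fused form is also well behaved (softmax(logits) - labels), which is another reason to prefer it over composing tf.nn.softmax with a hand-written log.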

If you wish, you can clip the logits just before the softmax, but it's not recommended, because clipping kills the gradient once a logit hits the bound. A better option is to use batch normalization so the activations stay closer to normally distributed and the logits never become extreme in the first place.
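
For reference, a hedged sketch of both options on a toy graph (layer sizes and placeholder names are just illustrative):

import tensorflow as tf

num_classes = 4
x = tf.placeholder(tf.float32, [None, 128])            # hypothetical input features
labels = tf.placeholder(tf.float32, [None, num_classes])
is_training = tf.placeholder(tf.bool)

# Preferred: batch-normalize the activations feeding the logits layer so the
# logits stay in a moderate range (remember to run tf.GraphKeys.UPDATE_OPS
# alongside the train op so the moving statistics get updated).
hidden = tf.layers.dense(x, 64, activation=tf.nn.relu)
hidden = tf.layers.batch_normalization(hidden, training=is_training)
logits = tf.layers.dense(hidden, num_classes)

# Possible but not recommended: clip the logits; a logit stuck at the bound
# gets exactly zero gradient.
clipped_logits = tf.clip_by_value(logits, -50.0, 50.0)

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))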

Maxim