
Could someone explain why the following code generates the output array([ 0.59813887, 0.69314718], dtype=float32)? For example, -numpy.log(0.5) = 0.69314718, but where does the 0.59813887 come from?

import tensorflow as tf
res1 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=[1, 0], logits=[[0.4, 0.6], [0.5, 0.5]])
res2 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=[0, 1], logits=[[0.4, 0.6], [0.5, 0.5]])
res3 = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=[1, 0], logits=[[0.6, 0.4], [0.5, 0.5]])
sess = tf.Session()
sess.run(res1)
sunxd

2 Answers

The logits that you have provided are for classes 0 and 1 respectively (that is how TensorFlow interprets them).

So, for res1, the label of the 1st data point is class 1, and its logit is 0.6.

By definition, Cross Entropy is -

-np.log(np.exp([0.6]) / np.sum(np.exp([0.4, 0.6])))

Similarly, for the second case -

-np.log(np.exp([0.5]) / np.sum(np.exp([0.5, 0.5])))

gives the desired output.

This is in line with TensorFlow's output. Hope this helps!
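
For completeness, here is a minimal NumPy sketch (variable names are just for illustration) that reproduces both values at once:

import numpy as np

logits = np.array([[0.4, 0.6], [0.5, 0.5]], dtype=np.float32)
labels = np.array([1, 0])

# softmax over the class axis turns logits into probabilities
probs = np.exp(logits) / np.sum(np.exp(logits), axis=1, keepdims=True)

# cross entropy = -log(probability assigned to the true class)
loss = -np.log(probs[np.arange(len(labels)), labels])
print(loss)  # approximately [0.59813887, 0.69314718]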

Vivek Kalyanarangan

It turns out that this function interprets its input as logits, which means TensorFlow first applies a softmax to turn them into probabilities before computing the cross entropy. I have not found out why there is no functionality to compute the cross entropy directly from a probability output, though.

Here is a related discussion: https://github.com/tensorflow/tensorflow/issues/2462
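
If the model already outputs probabilities, one possible workaround (my own sketch, not an official TensorFlow API) is to take -log of the probability of the true class directly, or to feed the log-probabilities back into the logits-based op, since softmax(log(p)) == p whenever each row of p sums to 1:

import numpy as np
import tensorflow as tf

# example probabilities that already sum to 1 per row (assumed for illustration)
probs = np.array([[0.45, 0.55], [0.5, 0.5]], dtype=np.float32)
labels = np.array([1, 0])

# Option 1: cross entropy straight from the probabilities
manual = -np.log(probs[np.arange(len(labels)), labels])

# Option 2: log-probabilities are valid logits, because softmax(log(p)) == p
res = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=np.log(probs))

sess = tf.Session()
print(manual, sess.run(res))  # the two results agree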

sunxd