
I want to implement tf.nn.sparse_softmax_cross_entropy myself, but after some batches the loss becomes NaN!

Here is my code:

        # flatten the targets and add a small constant to the logits
        logits_batch_size = tf.shape(logits)[0]
        labels = tf.reshape(tgt_seq, [-1])
        eps = tf.fill(tf.shape(logits), 1e-8)
        logits = logits + eps

        # scatter 1.0 at (row, label) positions to build a [batch, tvsize] one-hot matrix,
        # then take log(softmax) and reduce to a scalar loss
        labels_1 = tf.expand_dims(labels, 1)
        index = tf.expand_dims(tf.range(0, logits_batch_size), 1)
        concated = tf.concat([index, labels_1], 1)
        onehot_labels = tf.sparse_to_dense(concated, tf.stack([logits_batch_size, tvsize]), 1.0, 0.0)
        y_log = tf.log(tf.nn.softmax(logits))
        cost = tf.reduce_mean(-tf.reduce_sum(tf.multiply(onehot_labels, y_log), 0))

logits is the same as the logits argument of tf.nn.sparse_softmax_cross_entropy, a 2-D tensor; tgt_seq is a 2-D tensor as well. My task is a sequence-to-sequence learning task.

Can anyone help me?
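For reference, a minimal sketch of a numerically stable formulation, assuming TF 1.x; the names logits, tgt_seq, and tvsize mirror the question but the shapes and values below are dummies. It uses tf.nn.log_softmax instead of tf.log(tf.nn.softmax(...)), so the log can never see an exact zero:

    import tensorflow as tf

    tvsize = 5                                      # assumed target vocabulary size (dummy value)
    logits = tf.random_normal([6, tvsize])          # [batch * seq_len, tvsize], dummy logits
    tgt_seq = tf.constant([[0, 3, 1], [2, 4, 0]])   # [batch, seq_len] integer labels, dummy values

    labels = tf.reshape(tgt_seq, [-1])              # flatten to [batch * seq_len]

    # log_softmax computes logits - logsumexp(logits) in a stable way, avoiding log(0)
    log_probs = tf.nn.log_softmax(logits)           # [batch * seq_len, tvsize]

    # pick the log-probability of the correct class in each row
    rows = tf.range(tf.shape(logits)[0])
    picked = tf.gather_nd(log_probs, tf.stack([rows, labels], axis=1))

    cost = tf.reduce_mean(-picked)                  # mean negative log-likelihood

    with tf.Session() as sess:
        print(sess.run(cost))

The tf.gather_nd lookup plays the role of the dense one-hot multiplication in the question: it selects exactly the entries that the one-hot matrix would keep, without materializing the [batch, tvsize] matrix.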


  • y_log = tf.log(tf.nn.softmax(logits)) ==> y_log = tf.log(tf.nn.softmax(logits) + eps). I guess the cross-entropy is the source of the NaN. That is why I want to implement tf.nn.sparse_softmax_cross_entropy myself (see the sketch after these comments) – tianchi Dec 04 '17 at 12:24
  • cost is what I called loss. tvsize is the size of the target vocabulary – tianchi Dec 04 '17 at 12:27
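On the eps placement discussed in the comment above: adding the same small constant to every logit leaves the softmax output unchanged (softmax is invariant to a constant shift), so it cannot prevent log(0); a zero probability then gives -inf, and 0 * -inf in the one-hot multiplication produces NaN. A minimal sketch with a dummy logits tensor, assuming TF 1.x:

    import tensorflow as tf

    logits = tf.random_normal([4, 5])   # dummy stand-in for the model output

    # softmax(logits + eps) == softmax(logits) when eps is the same for every entry,
    # so this can still produce log(0) -> -inf and eventually NaN in the loss:
    y_log_unsafe = tf.log(tf.nn.softmax(logits + 1e-8))

    # placing the eps inside the log bounds its argument away from zero:
    y_log_safe = tf.log(tf.nn.softmax(logits) + 1e-8)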

0 Answers