I want to implement tf.nn.sparse_softmax_cross_entropy myself, but after some batches the loss becomes NaN.
Here is my code:
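# Number of rows in logits (one row per target token)
logits_batch_size = tf.shape(logits)[0]
# Flatten the 2-D target tensor into a 1-D vector of class ids
labels = tf.reshape(tgt_seq, [-1])
# Add a small constant to every logit
eps = tf.fill(tf.shape(logits), 1e-8)
logits = logits + eps
# Build (row, label) index pairs and scatter 1.0 into a dense one-hot matrix
labels_1 = tf.expand_dims(labels, 1)
index = tf.expand_dims(tf.range(0, logits_batch_size), 1)
concated = tf.concat([index, labels_1], 1)
onehot_labels = tf.sparse_to_dense(concated, tf.stack([logits_batch_size, tvsize]), 1.0, 0.0)
# Log of the softmax probabilities
y_log = tf.log(tf.nn.softmax(logits))
# Sum over axis 0, negate, then take the mean of the result
cost = tf.reduce_mean(-tf.reduce_sum(tf.multiply(onehot_labels, y_log), 0))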
Here, logits is the same 2-D tensor that I would pass to tf.nn.sparse_softmax_cross_entropy, and tgt_seq is also a 2-D tensor. The task is sequence-to-sequence learning.
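For reference, the quantity the code above is meant to compute can be sketched in plain Python, without TensorFlow. This is only a minimal illustration: the function names log_softmax and sparse_softmax_cross_entropy, and the toy logits and labels, are assumptions made up for the example. It takes the negative log-probability of the correct class for each row and averages over the rows, and it computes the log-softmax with the log-sum-exp trick so that it never takes the log of a probability that has underflowed to zero.

```python
import math

def log_softmax(row):
    """Log-softmax of one row of logits, using the log-sum-exp trick."""
    m = max(row)  # subtracting the max keeps exp() from overflowing
    # The largest term contributes exp(0) = 1, so the sum is at least 1
    # and the log below is always finite.
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]

def sparse_softmax_cross_entropy(logits, labels):
    """Mean over the rows of -log_softmax(logits[i])[labels[i]]."""
    losses = [-log_softmax(row)[label] for row, label in zip(logits, labels)]
    return sum(losses) / len(losses)

# Hypothetical toy batch: 2 rows, 3 classes. The second row gives the
# correct class a probability that underflows to 0.0 in floating point,
# which is the case where log(softmax(x)) would evaluate log(0).
logits = [[2.0, 1.0, 0.1], [0.0, -1000.0, 0.0]]
labels = [0, 1]
loss = sparse_softmax_cross_entropy(logits, labels)
print(loss)
```

In TensorFlow, tf.nn.log_softmax(logits) plays the role of log_softmax here. Note also that adding the same constant to every logit leaves the softmax unchanged, so an epsilon added to the logits does not keep the probabilities away from zero.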
Can anyone help me?