I am trying to write my own Estimator model_fn() for a GCP ML Engine package. I decode a sequence of outputs using embedding_rnn_decoder, as shown below:
outputs, state = tf.contrib.legacy_seq2seq.embedding_rnn_decoder(
    decoder_inputs=decoder_inputs,
    initial_state=curr_layer,
    cell=tf.contrib.rnn.GRUCell(hidden_units),
    num_symbols=n_classes,
    embedding_size=embedding_dims,
    feed_previous=False)
I know from the documentation that outputs is "A list of the same length as decoder_inputs of 2D Tensors", but how can I use this list to calculate the loss for the entire sequence?
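For reference, here is how I picture the structure of that list (just a sanity-check sketch; I am assuming the shapes, since I left output_projection at its default of None):

# Each entry in outputs should be the raw GRU output for one decoder time
# step, i.e. a [batch_size, hidden_units] tensor -- nothing has been
# projected to n_classes yet.
for t, step_output in enumerate(outputs):
    print(t, step_output.get_shape())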
I know that if I grab outputs[0] (i.e. only the first output in the sequence), then I could calculate a loss as follows:
logits = tf.layers.dense(
    outputs[0],
    n_classes)
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels))
Is it appropriate to generate a loss value for each of the items in outputs and then pass these all to tf.reduce_mean (roughly the approach sketched below)? This feels inefficient, especially for long sequences -- are there other ways to calculate the softmax at each step of the sequence that would be more efficient?
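To make the first part of the question concrete, this is roughly what I have in mind (a sketch only -- it assumes labels is an int32 tensor of shape [batch_size, seq_len], that tf.AUTO_REUSE is available (TF 1.4+), and that 'output_projection' is just a scope name I picked):

# Per-step approach: project every time step to n_classes, compute a
# per-step cross entropy, then average over all steps and examples.
step_losses = []
for t, step_output in enumerate(outputs):
    # Share one projection matrix across time steps; without name/reuse each
    # call to tf.layers.dense would create its own weights.
    step_logits = tf.layers.dense(
        step_output, n_classes,
        name='output_projection', reuse=tf.AUTO_REUSE)   # [batch, n_classes]
    step_losses.append(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            logits=step_logits, labels=labels[:, t]))    # [batch]
loss = tf.reduce_mean(tf.stack(step_losses))             # scalar over batch and time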