
I am trying to use seq2seq.dynamic_decode from TensorFlow to build a sequence-to-sequence model. I have already finished the encoder part. I am confused about the decoder: decoder_outputs seems to have shape [batch_size x sequence_length x embedding_size], but I need the actual word indices of shape [batch_size x sequence_length] to correctly calculate my loss. I am wondering whether one of my shape inputs is incorrect or whether I just forgot something.
The decoder and encoder cells are rnn.BasicLSTMCell().

import tensorflow as tf
from tensorflow.contrib import seq2seq

# Variables
cell_size = 100
decoder_vocabulary_size = 7
batch_size = 2
decoder_max_sentence_len = 7
# Part of the encoder
_, encoder_state = tf.nn.dynamic_rnn(
          cell=encoder_cell,
          inputs=features,
          sequence_length=encoder_sequence_lengths,
          dtype=tf.float32)
# ---- END Encoder ---- #
# ---- Decoder ---- #
# decoder_sequence_lengths = _sequence_length(features)
embedding = tf.get_variable(
     "decoder_embedding", [decoder_vocabulary_size, cell_size])
helper = seq2seq.GreedyEmbeddingHelper(
     embedding=embedding,
     start_tokens=tf.tile([GO_SYMBOL], [batch_size]),
     end_token=END_SYMBOL)
decoder = seq2seq.BasicDecoder(
     cell=decoder_cell,
     helper=helper,
     initial_state=encoder_state)
decoder_outputs, _ = seq2seq.dynamic_decode(
     decoder=decoder,
     output_time_major=False,
     impute_finished=True,
     maximum_iterations=decoder_max_sentence_len)
# I need decoder_outputs to be word indices [batch_size x sequence_length] here
losses = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=logits)
loss = tf.reduce_mean(losses)

1 Answer


I found the solution:

from tensorflow.python.layers.core import Dense
decoder = seq2seq.BasicDecoder(
      cell=decoder_cell,
      helper=helper,
      initial_state=encoder_state,
      output_layer=Dense(decoder_vocabulary_size))
...
# decoder_outputs is a BasicDecoderOutput: index 0 is rnn_output (the logits),
# index 1 is sample_id (the predicted word indices)
logits = decoder_outputs[0]

You have to specify a Dense output layer to project the decoder output from cell_size to the vocabulary size.
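
For context, here is a minimal sketch of how the projected outputs could then be used for the loss; target_ids is a hypothetical ground-truth tensor, not something from the original post:

# Minimal sketch; target_ids is a hypothetical [batch_size x sequence_length]
# tensor of ground-truth word indices.
logits = decoder_outputs.rnn_output          # [batch_size, seq_len, decoder_vocabulary_size]
predicted_ids = decoder_outputs.sample_id    # [batch_size, seq_len] predicted word indices
losses = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=target_ids, logits=logits)
loss = tf.reduce_mean(losses)

Note that with GreedyEmbeddingHelper the decoded sequence length can differ from the target length, so the logits and targets may need padding or masking before the loss is computed.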
