I used Keras to train a seq2seq model (keras.models.Model). The inputs and targets to the model are [X_encoder, X_decoder] and y, i.e. a list of encoder and decoder inputs plus the labels. Note that the decoder input X_decoder is y shifted by one position relative to the actual y, i.e. teacher forcing.
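Concretely, X_decoder is just y shifted one step to the right with a start-of-sequence token in front, roughly like this (a minimal sketch; start_token_id stands in for whatever SOS id my vocabulary uses):

import numpy as np

# Teacher forcing: the decoder input at step t is the true target from step t-1,
# so shift y right by one position and prepend the start token.
X_decoder = np.zeros_like(y)
X_decoder[:, 0] = start_token_id   # first decoder step sees the start token
X_decoder[:, 1:] = y[:, :-1]       # later steps see the previous true token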
So my question is: after training, at actual prediction time when I don't have any labels, what do I feed in as X_decoder? Or do I need to train the model differently?
This is a snippet of the model definition, in case it helps:
from keras.layers import Input, Dense, TimeDistributed, CuDNNLSTM
from keras.models import Model

# embedding_layer is a shared Embedding layer defined outside this snippet;
# it is applied to both the encoder and the decoder inputs.

# Encoder
encoder_inputs = Input(batch_shape=(batch_size, max_len), dtype='int32')
encoder_embedding = embedding_layer(encoder_inputs)
encoder_LSTM = CuDNNLSTM(hidden_dim, return_state=True, stateful=True)
encoder_outputs, state_h, state_c = encoder_LSTM(encoder_embedding)

# Decoder, initialised with the encoder's final states
decoder_inputs = Input(shape=(max_len,), dtype='int32')
decoder_embedding = embedding_layer(decoder_inputs)
decoder_LSTM = CuDNNLSTM(hidden_dim, return_state=True, return_sequences=True)
decoder_outputs, _, _ = decoder_LSTM(decoder_embedding, initial_state=[state_h, state_c])

# Output: per-timestep softmax over the vocabulary
outputs = TimeDistributed(Dense(vocab_size, activation='softmax'))(decoder_outputs)
model = Model([encoder_inputs, decoder_inputs], outputs)

# Model fitting
model.fit([X_encoder, X_decoder], y,
          steps_per_epoch=int(number_of_train_samples / batch_size),
          epochs=epochs)
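To make the problem concrete: at prediction time I only have the encoder sequences, so the best I can currently do is feed a dummy second input, which is obviously not what the model saw during training (the zeros below are just a stand-in, and X_encoder_test is my held-out encoder input):

# Naive prediction attempt: there is no real X_decoder at inference time,
# so all I can pass is a dummy decoder input.
dummy_decoder_input = np.zeros((X_encoder_test.shape[0], max_len), dtype='int32')
preds = model.predict([X_encoder_test, dummy_decoder_input], batch_size=batch_size)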