I am new to NLP and Keras and am still learning.
I followed this guide: https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html and added an embedding layer. I am using the fra2eng dataset.
However, I am not sure whether my inference model and my output-generation code are correct. My decoder input is an array containing a single index (one number), which I feed into my inference model.
I am not sure whether this is correct. Let me know if more background information or code is needed.
input_seq is an array of word indices in the vocabulary:
array([36., 64., 57., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0.], dtype=float32)
0 is the index of my start token, so my first target sequence is np.array([0]). I am not sure whether this is correct.
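For comparison, here is a minimal sketch of the array shapes involved. It assumes the encoder and decoder inputs are declared as Input(shape=(None,)) in front of the embedding layers; that declaration is an assumption, since the model definition is not shown here. A Keras Embedding layer takes a 2-D integer array of shape (batch_size, timesteps), so a single start token would normally have shape (1, 1) rather than (1,). START_INDEX and the variable names are illustrative only.

```python
import numpy as np

START_INDEX = 0  # assumption: start-token index, as stated in the question

# 1-D array: shape (1,), no explicit time axis.
flat_seq = np.array([START_INDEX])

# 2-D array: shape (1, 1) = (batch_size, timesteps).
target_seq = np.zeros((1, 1), dtype="int32")
target_seq[0, 0] = START_INDEX

# A single encoder input of 16 indices likewise needs a leading batch axis.
input_seq = np.array([36, 64, 57, 1] + [0] * 12, dtype="int32")
batched_input = input_seq[np.newaxis, :]  # shape (1, 16)

print(flat_seq.shape, target_seq.shape, batched_input.shape)
```

Printing the shapes of the arrays actually passed to encoder_model.predict and decoder_model.predict is a quick way to check whether the batch axis is present.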
def decode_sequence(input_seq):
    # Encode the input as state vectors.
    states_value = encoder_model.predict(input_seq)
    # Start decoding from the start token (index 0).
    target_seq = np.array([0])
    stop_condition = False
    decoded_sentence = ''
    while not stop_condition:
        output_tokens, h, c = decoder_model.predict([target_seq] + states_value)
        # Greedy choice: take the most probable token at the last timestep.
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        sampled_char = target_idx2char[sampled_token_index]
        decoded_sentence += sampled_char
        # Stop at the end-of-sequence character or once the output exceeds 20 characters.
        if sampled_char == '\n' or len(decoded_sentence) > 20:
            stop_condition = True
        # Update the target sequence (of length 1).
        target_seq = np.array([sampled_token_index])
        # Update states
        states_value = [h, c]
    return decoded_sentence
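The function above is a greedy decoding loop. The sketch below isolates that loop from Keras so its control flow can be checked on its own. Note that fake_decoder_step is a hypothetical stand-in for decoder_model.predict and idx2char is a made-up four-symbol vocabulary; neither comes from the real model. Unlike the function above, it tests the stop condition before appending, so the end-of-sequence character is left out of the result.

```python
import numpy as np

# Hypothetical toy vocabulary; index 0 plays the role of the start token.
idx2char = {0: "\t", 1: "a", 2: "b", 3: "\n"}


def fake_decoder_step(target_seq, states):
    """Stand-in for decoder_model.predict: always predicts the next index."""
    prev = int(target_seq[0, 0])
    nxt = min(prev + 1, 3)
    probs = np.zeros((1, 1, len(idx2char)))  # (batch, timesteps, vocab)
    probs[0, 0, nxt] = 1.0
    return probs, states


def greedy_decode(step_fn, states, start_index=0, max_len=20):
    # Decoder input is a batch of one sequence of length one: shape (1, 1).
    target_seq = np.array([[start_index]])
    decoded = ""
    while True:
        output_tokens, states = step_fn(target_seq, states)
        # Greedy choice: most probable token at the last timestep.
        sampled = int(np.argmax(output_tokens[0, -1, :]))
        char = idx2char[sampled]
        # Stop on the end-of-sequence character or at the length limit.
        if char == "\n" or len(decoded) >= max_len:
            break
        decoded += char
        # Feed the sampled token back in as the next decoder input.
        target_seq = np.array([[sampled]])
    return decoded


result = greedy_decode(fake_decoder_step, states=None)
print(repr(result))
```

If the loop behaves as expected with a stand-in like this, the remaining suspects are the input shapes, the token indexing, and the training of the model itself.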
Here is my output. I was wondering whether it is caused by an error in the code above.
Input sentence: Go.
Decoded sentence: tréjous?!
-
Input sentence: Run!
Decoded sentence: ï
-
Input sentence: Run!
Decoded sentence: ï
-
Input sentence: Wow!
Decoded sentence: u te les fois.
-
Input sentence: Fire!
Decoded sentence: ïï
-
Input sentence: Help!
Decoded sentence: ez joi de l'argent.