I have a seq2seq model trained on some Cleverbot data:

just_phrases_X is a list of sentences and just_phrases_y is a list of responses to those sentences.

maxlen = 62
# low is a list of all the unique words.

def Convert_To_Encoding(just_phrases):
    encodings = []
    for sentence in just_phrases:
        # one_hot hashes each word to an integer index (not a one-hot vector)
        onehotencoded = one_hot(sentence, len(low))
        encodings.append(np.array(onehotencoded))

    # Pad every sequence with zeros at the end, up to maxlen
    encodings_padded = pad_sequences(encodings, maxlen=maxlen, padding='post', value=0.0)

    return encodings_padded

encodings_X_padded = Convert_To_Encoding(just_phrases_X)
encodings_y_padded = Convert_To_Encoding(just_phrases_y)

model = Sequential()
embedding_layer = Embedding(len(low), output_dim=8, input_length=maxlen)
model.add(embedding_layer)
model.add(GRU(128))  # input_shape=(None, 496)
model.add(RepeatVector(numberofwordsoutput))  # number of characters?
model.add(GRU(128, return_sequences=True))
model.add(Flatten())
model.add(Dense(62, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

model.fit(encodings_X_padded, encodings_y_padded, batch_size=1, epochs=1)  # , validation_data=(testX, testy)
model.save("cleverbottheseq-uel.h5")

When I use this model for prediction, each output value will be between 0 and 1 because of the softmax activation. However, I have around 3000 unique words, each with a separate integer assigned to it. How do I essentially reverse the encoding that was applied for training and convert the output back to an integer that has a word assigned to it?

Haztec

1 Answer

I don't think it is possible to build a seq2seq model with the Sequential API. Try creating the encoder and decoder separately with the Functional API. You need two inputs: the first for the encoder, the second for the decoder.
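
Something along these lines might work as a starting point. It is only a rough sketch, not a tested chatbot: vocab_size = 3000 is an assumption based on the "around 3000 unique words" in the question, and decode_prediction and index_to_word are hypothetical names for a helper and an integer-to-word dictionary that you would have to build yourself. Note that Keras one_hot uses hashing, so it does not guarantee a unique integer per word. A mapping you control (for example the word_index of a Tokenizer, inverted) is easier to reverse. The final Dense layer has one unit per vocabulary word, so an argmax over the last axis gives the word integer at each timestep.

```python
import numpy as np
from tensorflow.keras.layers import Input, Embedding, GRU, Dense
from tensorflow.keras.models import Model

vocab_size = 3000   # assumed: roughly len(low) from the question
maxlen = 62         # padded sequence length from the question
latent_dim = 128    # GRU units, as in the question

# Encoder: reads the padded input sentence and returns its final state.
encoder_inputs = Input(shape=(maxlen,), name="encoder_inputs")
encoder_embedded = Embedding(vocab_size, 8)(encoder_inputs)
_, encoder_state = GRU(latent_dim, return_state=True)(encoder_embedded)

# Decoder: reads the response shifted right by one step (teacher forcing),
# starting from the encoder state.
decoder_inputs = Input(shape=(maxlen,), name="decoder_inputs")
decoder_embedded = Embedding(vocab_size, 8)(decoder_inputs)
decoder_outputs = GRU(latent_dim, return_sequences=True)(
    decoder_embedded, initial_state=encoder_state
)
# One softmax over the whole vocabulary at every timestep.
word_probs = Dense(vocab_size, activation="softmax")(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], word_probs)
# Sparse loss: targets are integer word ids of shape (batch, maxlen).
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")


def decode_prediction(probs, index_to_word):
    """Map a (timesteps, vocab_size) array of probabilities back to words.

    index_to_word is a hypothetical dict {integer id: word}; id 0 is
    treated as padding and skipped.
    """
    word_ids = np.argmax(probs, axis=-1)
    return [index_to_word[i] for i in word_ids if i != 0]
```

With this layout, model.predict returns an array of shape (batch, maxlen, vocab_size), and decode_prediction(prediction[0], index_to_word) turns the first sample back into a list of words. At inference time you would normally feed the decoder its own previous output one step at a time rather than the true response.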

Andrey