The TensorFlow (Keras) Tokenizer tokenizes and encodes text into machine-readable vectors. First we call fit_on_texts
on a large body of text to build a dictionary, then we call texts_to_sequences
on our input text to build the corresponding integer encoding.
What does Keras Tokenizer method exactly do?
However, there does not seem to be a built-in method for the reverse operation, that is, for retrieving text from numerical vectors based on the dictionary.
In Python, something like this could be implemented:
# map predicted word index to word
out_word = ''
for word, index in tokenizer.word_index.items():
    if index == yhat:
        out_word = word
        break
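The loop above scans the whole dictionary for every lookup. A possible alternative, sketched here under some assumptions, is to invert the dictionary once and then look indices up directly. The `word_index` dict below is a small hypothetical stand-in for `tokenizer.word_index`, and `sequence_to_text` is an illustrative helper name, not part of the library:

```python
# Hypothetical stand-in for tokenizer.word_index (word -> integer index).
word_index = {"the": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5}

# Invert the mapping once so each lookup is a plain dict access
# instead of a scan over all items.
index_word = {index: word for word, index in word_index.items()}


def sequence_to_text(sequence, index_word, unknown=""):
    """Map a list of integer indices back to a space-joined string.

    Indices missing from the mapping (for example a padding value of 0)
    are replaced by `unknown`; empty replacements are dropped.
    """
    words = (index_word.get(i, unknown) for i in sequence)
    return " ".join(w for w in words if w)


# Single predicted index, as in the loop above.
yhat = 2
out_word = index_word.get(yhat, "")

# A whole encoded sequence.
decoded = sequence_to_text([1, 2, 3, 4, 1, 5], index_word)
```

With a real tokenizer, `word_index` would be replaced by `tokenizer.word_index`. This assumes every index maps to exactly one word, which holds for a dictionary built by fit_on_texts.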
Is there a nice way to retrieve text from the integer indices? In other words, is there a built-in reverse operation for texts_to_sequences?