Hello, I have some word2vec models generated with the Java Word2Vec implementation in DL4J and saved by calling
writeWord2VecModel(Word2Vec vectors, String path)
The output is a zip file that contains several txt files. I can successfully load and use the model in DL4J using
Word2Vec readWord2VecModel(String path)
I am now trying to read that model in Python, using gensim:
import gensim
model = gensim.models.KeyedVectors.load_word2vec_format('file_path', binary=False)
But I get the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe0 in position 10: invalid continuation byte
I also tried binary=True and got the same error.
If I extract the model generated by DL4J I get the following files:
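The archive contents can also be listed from Python without extracting anything, using the standard zipfile module. This is only a minimal sketch: the helper name list_model_files and the path "model.zip" are hypothetical and stand in for whatever path was passed to writeWord2VecModel.

```python
import zipfile


def list_model_files(zip_path):
    """Return (entry name, uncompressed size in bytes) for each file in the zip."""
    with zipfile.ZipFile(zip_path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Hypothetical usage; "model.zip" is an assumed path, not the real one:
# for name, size in list_model_files("model.zip"):
#     print(name, size)
```

Printing the names and sizes shows which txt files DL4J wrote into the archive and how large each one is.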
Is there a way to read that model in Python with gensim?