I have a text file with my precomputed word vectors in the following format (example):
word -0.0762464299711 0.0128308048976 ... 0.0712385589283\n
on each line for every word (with 297 extra floats in place of the ...). I am trying to load these with Gensim as KeyedVectors, because I would ultimately like to compute cosine similarity, find the most similar words, etc. Unfortunately, I have not worked with Gensim before, and from the documentation it is not quite clear to me how to do this. I have tried the following, which I found here:
word_vectors = KeyedVectors.load_word2vec_format('/embeddings/word.vectors', binary=False)
However this gives the following error:
ValueError: invalid literal for int() with base 10: 'the'
'the' is the first word in the text file, so I suspect that the loading function is expecting something to be there that is not. But I can't find any information on what should be there. I would highly appreciate a pointer to such information or any other solution to my problem. Thanks!
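For what it's worth, one workaround I am considering is sketched below. It assumes that the loader expects the original word2vec text layout, in which the first line holds two integers (the number of words and the vector dimensionality), and that my file simply lacks that line. The helper name `add_word2vec_header` and the output path are my own inventions, not part of Gensim, and I have not confirmed that this is the recommended approach.

```python
def add_word2vec_header(src_path, dst_path):
    """Copy src_path to dst_path, prepending a '<vocab_size> <vector_size>' line.

    Hypothetical helper (not part of Gensim). Assumes one word per line,
    followed by its space-separated floats, and that all vectors have
    the same length as the first one.
    """
    # Read all non-empty lines; this keeps the whole file in memory,
    # which is fine for a sketch but may not suit very large files.
    with open(src_path, encoding="utf-8") as src:
        lines = [line for line in src if line.strip()]
    if not lines:
        raise ValueError("no vectors found in " + src_path)

    # Every token after the word itself is one vector component.
    vector_size = len(lines[0].split()) - 1

    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(f"{len(lines)} {vector_size}\n")  # the assumed header line
        dst.writelines(lines)                       # original content, unchanged

    return len(lines), vector_size
```

The idea would be to call `add_word2vec_header('/embeddings/word.vectors', '/embeddings/word.vectors.w2v')` once and then pass the new file to `KeyedVectors.load_word2vec_format(..., binary=False)`. For my file I would expect the header to read the number of words followed by 300, if my assumption about the format is right.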