How can I get the most frequent context words from a pretrained fastText model?

For example, given the word 'football' and the corpus ["I like playing football with my friends"],

I want to get the list of context words: ['playing', 'with', 'my', 'like']

I tried model_wiki = gensim.models.KeyedVectors.load_word2vec_format("wiki.ru.vec") followed by model_wiki.most_similar("блок")

But the results are not what I need.

CodeIt

1 Answer

The plain model doesn't retain any such co-occurrence statistics from the original corpus. It just has the trained results: vectors per word.

So the ranked list returned by most_similar(), which isn't exactly the words that appeared together but strongly correlates with them, is the best you'll get from that file.

Only going back to the original training corpus would give you exactly what you've requested.
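If you do have access to the training corpus, counting context words is straightforward. The following is only a minimal sketch, not anything gensim or fastText provides: the function name context_word_counts, the window parameter, and the lowercase whitespace tokenization are all assumptions made for illustration, and a real corpus would need the same tokenization that was used for training.

```python
from collections import Counter


def context_word_counts(sentences, target, window=5):
    """Count the words that occur within `window` tokens of `target`.

    Hypothetical helper: tokenization is a plain lowercase whitespace
    split, which is an assumption and not what fastText itself uses.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            # Slice out the window around position i, clipped to the sentence.
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Count every token in the window except the target position itself.
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts


corpus = ["I like playing football with my friends"]
counts = context_word_counts(corpus, "football", window=2)
print(counts.most_common())
```

With a window of 2, the example sentence from the question yields the four neighbours 'like', 'playing', 'with' and 'my', each counted once. On a large corpus, counts.most_common(n) gives the n most frequent context words.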

gojomo