2

I've trained a Fasttext model using .train_supervised() and can't get my head around how to get the most important words for each label according to the model.

I have three labels so I would expect to be able to do something like

model.label["__label__1"].get_most_significant()

Any suggestions on how to go about achieving this?

mattiasostmar
  • 2,869
  • 4
  • 17
  • 26

1 Answers1

1

I've not noticed any such feature in the original FastText code, so wouldn't expect it in the Python wrapper, either.

You might be able to get something vaguely like what you want by the process:

  • for every individual word, do a predict-with-probabilities of the top k labels for that one-word text – with k possibly as large as the count of all labels
  • from each such label prediction, add that word, with the label probability, into a log for that label
  • sort that log for each label to put the word that gave the highest probability first; take the top n results as the words most indicative of that label
gojomo
  • 52,260
  • 14
  • 86
  • 115
  • 1
    Hi Gojomo, first of all thanks for answering :). I had already started in the similar lines. I will update once I complete that.. (thumbs up) – Karthik Sunil Aug 06 '20 at 15:32