I want to train an unsupervised fastText model on my text data, which is about 1 GB in size. I'm using the fastText command-line tool for training.
./fasttext skipgram -input PlainText.txt -output FastText-PlainText- -dim 50 -epoch 50
These are the arguments I used to create the word representations.
Read 207M words
Number of words: 501986
Number of labels: 0
Progress: 97.5% words/sec/thread: 87224 lr: 0.001260 avg.loss: 0.089536 ETA: 0h 4m 9s
Here, in the output of the fastText command, I see this avg.loss value, and the learning rate has decreased from its default (0.05) to 0.001260. I don't really understand what avg.loss means, or why the learning rate drops during training.
- Should I increase the number of epochs so that fastText learns my data better?
- Can I use a different loss function to reduce the loss? If yes, which loss function would be a better choice?
- And how can I evaluate whether my fastText model has learned my data well?
- Just out of interest, can I use wordNgrams to make my model learn context better in unsupervised training?
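For the evaluation question, the only sanity check I know of so far is eyeballing nearest neighbors with fastText's built-in nn command (the model file name below assumes the -output prefix from my training command above):

```shell
# Interactively query nearest neighbors of a word to eyeball embedding quality.
# "FastText-PlainText-.bin" is the model produced by the -output prefix above.
./fasttext nn FastText-PlainText-.bin
# fastText then prompts "Query word?" and prints the most similar words
# with their cosine similarities, e.g. for a domain word from my corpus.
```

Is there a more principled way to measure embedding quality than inspecting neighbors like this?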