I want to understand the effect of the alpha value in the gensim word2vec and fasttext word-embedding models. I know that alpha is the initial learning rate, and that its default value is 0.025 (from Radim's blog).
What happens if I change this to a somewhat higher value, e.g. 0.5 or 0.75? Is doing so even allowed? I did try 0.5 and experimented on a large dataset with size=200, window=15, min_count=5, iter=10, workers=4, and the results were quite meaningful for the word2vec model. With the fasttext model, however, the results were a bit scattered: less related neighbor words, and unpredictably high or low similarity scores.
Why do these two popular models give such different results on the same data? Does the value of alpha really play such a crucial role in building the model?
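For context, my (possibly simplified) understanding is that gensim decays alpha linearly from the starting value down to min_alpha (default 0.0001) over the course of training, so a larger starting alpha keeps updates proportionally larger throughout. A minimal sketch of that schedule (the function name and signature are mine, not gensim's):

```python
def effective_alpha(start_alpha, min_alpha, progress):
    """Linearly decayed learning rate, roughly as gensim applies it.

    progress: fraction of total training examples processed, in [0.0, 1.0].
    This is a simplified model of gensim's internal schedule, not its API.
    """
    return max(min_alpha, start_alpha - (start_alpha - min_alpha) * progress)

# With the default alpha=0.025 the rate stays small throughout training;
# starting at 0.5 makes every update roughly 20x larger at the same point.
print(effective_alpha(0.025, 0.0001, 0.5))  # mid-training rate with defaults
print(effective_alpha(0.5, 0.0001, 0.5))    # mid-training rate with alpha=0.5
```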
Any suggestion is appreciated.