I want to understand the effect of the alpha value in the gensim word2vec and fasttext word-embedding models. I know that alpha is the initial learning rate, and that its default value is 0.025 (from Radim's blog).
What happens if I change this to a somewhat higher value, e.g. 0.5 or 0.75? Is doing so even allowed? I did try 0.5 and experimented on a large dataset with size=200, window=15, min_count=5, iter=10, workers=4, and the results were quite meaningful for the word2vec model. With the fasttext model, however, the results were a bit scattered: less related neighbor words, and unpredictably high or low similarity scores.
Why do these two popular models give such different results on the same data? Does the value of alpha really play such a crucial role in building the model?
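For context, my (possibly simplified) understanding is that gensim decays alpha linearly from the starting value down to min_alpha (default 0.0001) over the course of training, so a larger starting alpha keeps updates proportionally larger throughout. A minimal sketch of that schedule (the function name and signature are mine, not gensim's):

```python
def effective_alpha(start_alpha, min_alpha, progress):
    """Linearly decayed learning rate, roughly as gensim applies it.

    progress: fraction of total training examples processed, in [0.0, 1.0].
    This is a simplified model of gensim's internal schedule, not its API.
    """
    return max(min_alpha, start_alpha - (start_alpha - min_alpha) * progress)

# With the default alpha=0.025 the rate stays small throughout training;
# starting at 0.5 makes every update roughly 20x larger at the same point.
print(effective_alpha(0.025, 0.0001, 0.5))  # mid-training rate with defaults
print(effective_alpha(0.5, 0.0001, 0.5))    # mid-training rate with alpha=0.5
```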
Any suggestion is appreciated.