Can anybody tell me which default values Doc2Vec() uses for alpha and min_alpha?

1 Answer

The exact defaults for all parameters are listed in the documentation – but might, for parameters shared with a 'base' class, be shown in that superclass's docs.

So when you don't see alpha and min_alpha shown on the prototype line of the Doc2Vec documentation...

https://radimrehurek.com/gensim/models/doc2vec.html#gensim.models.doc2vec.Doc2Vec

...you can click the link just under it, where it says...

Bases: gensim.models.word2vec.Word2Vec

...to reach its base class Word2Vec and find those & many more defaults specified:

https://radimrehurek.com/gensim/models/word2vec.html#gensim.models.word2vec.Word2Vec

Specifically, per the text there...

class gensim.models.word2vec.Word2Vec(sentences=None, corpus_file=None, vector_size=100, alpha=0.025, window=5, min_count=5, max_vocab_size=None, sample=0.001, seed=1, workers=3, min_alpha=0.0001, sg=0, hs=0, negative=5, ns_exponent=0.75, cbow_mean=1, hashfxn=<built-in function hash>, epochs=5, null_word=0, trim_rule=None, sorted_vocab=1, batch_words=10000, compute_loss=False, callbacks=(), comment=None, max_final_vocab=None)

...the defaults are alpha=0.025, min_alpha=0.0001.
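If you'd rather check this from Python than click through the docs, a minimal sketch along these lines walks a class's method resolution order and reports which `__init__` declares a given default. The `find_default` helper and the `BaseModel`/`DocModel` stand-in classes are hypothetical, made up for illustration so the snippet runs without Gensim; only `inspect.signature` is a real standard-library call.

```python
import inspect


def find_default(cls, name):
    """Return (owner_class_name, default) for the first __init__ in the
    MRO of `cls` that declares parameter `name` with a default value."""
    for klass in cls.__mro__:
        init = klass.__dict__.get("__init__")
        if init is None:
            continue  # this class does not define its own __init__
        try:
            params = inspect.signature(init).parameters
        except (TypeError, ValueError):
            continue  # some built-in callables expose no signature
        param = params.get(name)
        if param is not None and param.default is not inspect.Parameter.empty:
            return klass.__name__, param.default
    raise KeyError(f"no default found for {name!r}")


# Hypothetical stand-ins (NOT the Gensim classes): the subclass forwards
# extra keyword arguments to a base class that owns the alpha defaults.
class BaseModel:
    def __init__(self, alpha=0.025, min_alpha=0.0001):
        self.alpha = alpha
        self.min_alpha = min_alpha


class DocModel(BaseModel):
    def __init__(self, documents=None, epochs=10, **kwargs):
        super().__init__(**kwargs)
        self.documents = documents
        self.epochs = epochs


print(find_default(DocModel, "alpha"))      # found on the base class
print(find_default(DocModel, "min_alpha"))  # found on the base class
print(find_default(DocModel, "epochs"))     # found on the subclass itself
```

With Gensim installed, the same helper could presumably be pointed at the real class (for example `find_default(Doc2Vec, "alpha")` after `from gensim.models.doc2vec import Doc2Vec`). What it reports will depend on the Gensim version you have.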

Most users shouldn't need to tinker with these at all: most metaparameter optimization effort should be directed elsewhere.

In some published work, in some modes of this and related algorithms, I've seen a higher starting alpha of 0.05 or 0.1 used.
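For a sense of what the two values control: per the Gensim docs, the learning rate drops linearly from alpha to min_alpha as training progresses. The following is a rough sketch of such a schedule under that assumption, not Gensim's actual code. The `linear_alpha` helper and its `progress` argument are hypothetical names for illustration.

```python
def linear_alpha(progress, alpha=0.025, min_alpha=0.0001):
    """Learning rate at a given training progress in [0, 1], assuming a
    straight-line decay from `alpha` down to `min_alpha`."""
    progress = min(max(progress, 0.0), 1.0)  # clamp out-of-range inputs
    return alpha - (alpha - min_alpha) * progress


# Compare the default starting rate with a higher one of 0.05,
# sampled at 0%, 25%, 50%, 75% and 100% of training.
for start in (0.025, 0.05):
    schedule = [linear_alpha(step / 4, alpha=start) for step in range(5)]
    print(start, [round(rate, 5) for rate in schedule])
```

A higher starting alpha only raises the early part of the schedule. Both runs end at the same min_alpha.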

gojomo
  • The second link isn't working. The parameter space of doc2vec is still the same? I can't see the default values in the documentation https://radimrehurek.com/gensim/models/doc2vec.html#gensim.models.doc2vec.Doc2Vec. Thank you so much in advance! –  Aug 16 '21 at 21:52
  • The inheritance of `Doc2Vec` has changed a bit in the latest Gensim 4.0+ releases. Now, just under `Doc2Vec`, you should see "Bases: `gensim.models.word2vec.Word2Vec`", and if you click on that (`Word2Vec`), you'll see the additional inherited defaults. I'll update the main answer text. – gojomo Aug 17 '21 at 04:12
  • Thanks a lot! And as for Doc2Vec, I guess these are the default parameters? class gensim.models.doc2vec.Doc2Vec(documents=None, corpus_file=None, vector_size=100, dm_mean=None, dm=1, dbow_words=0, dm_concat=0, dm_tag_count=1, dv=None, dv_mapfile=None, comment=None, trim_rule=None, callbacks=(), window=5, epochs=10, **kwargs) –  Aug 17 '21 at 09:49
  • Whatever it says in the docs link for that class (https://radimrehurek.com/gensim/models/doc2vec.html#gensim.models.doc2vec.Doc2Vec), for the version of Gensim you're using, will be the defaults. If I were to specifically confirm what you've pasted here, your text could fall out of date. But that link is the docs, auto-generated from the source-code, so it's the right place to consult. – gojomo Aug 17 '21 at 19:11
  • You can also browse the source directly for such info. Here's a link to the exact line number where the `Doc2Vec` defaults are set, in the `__init__()` method, in current development code: https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/models/doc2vec.py#L159 (As the code changes over time, this link may no longer point to the exact right line.) – gojomo Aug 17 '21 at 19:13