My task is to assign tags (descriptive words) to documents or posts from a list of available tags. I'm working with Doc2vec as available in Gensim. I read that doc2vec can be used for document tagging, but I could not find suitable parameter values for this task. So far, I have tested it by changing the values of the 'size' and 'window' parameters. The results I'm getting are nonsensical, and changing these parameters shows no trend: at some values the results improve a little, and at others they get worse. Can anyone suggest suitable parameter values for this task? I found that 'size' (which defines the size of the feature vector) should be large if we have enough training data, but I am not sure about the rest of the parameters.
1 Answer
Which parameters are best can vary with the quality & size of your training data, and exactly what your downstream goals are. (There's no one set of best-for-everything parameters.)
Starting with the gensim defaults is a reasonable first guess, as are values you've seen someone else use successfully on a similar dataset/problem.
But really you'll need to experiment, ideally by creating an automated evaluation based on some held-back testing set, then meta-optimizing the Doc2Vec parameters by searching over many small adjustments to the parameters for the best ranges/combinations.
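
As a rough illustration of that kind of automated search, here is a minimal sketch, not a definitive recipe. It assumes gensim 4.x (where the old `size` parameter is named `vector_size` and trained tag vectors live in `model.dv`). The `(tokens, tag)` data layout, the helper names `grid_search` and `make_doc2vec_scorer`, and the accuracy-style score are all hypothetical choices made for the example; your own evaluation should reflect what you actually need from the tags.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every combination in param_grid; return (best_params, best_score, results)."""
    names = sorted(param_grid)
    results = []
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        results.append((params, score_fn(params)))
    best_params, best_score = max(results, key=lambda r: r[1])
    return best_params, best_score, results


def make_doc2vec_scorer(train_docs, test_docs):
    """Build a scoring function for grid_search.

    train_docs / test_docs are lists of (tokens, tag) pairs (an assumed layout).
    The score is the fraction of held-back documents whose nearest trained
    tag vector is their true tag.
    """
    # Imported lazily so the search harness above works without gensim installed.
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    corpus = [TaggedDocument(words=tokens, tags=[tag]) for tokens, tag in train_docs]

    def score(params):
        model = Doc2Vec(documents=corpus, **params)
        hits = 0
        for tokens, true_tag in test_docs:
            vec = model.infer_vector(tokens)
            predicted, _ = model.dv.most_similar([vec], topn=1)[0]
            hits += predicted == true_tag
        return hits / len(test_docs)

    return score


# Example usage (values are placeholders to search over, not recommendations):
# scorer = make_doc2vec_scorer(train_docs, test_docs)
# grid = {"vector_size": [50, 100, 200], "window": [5, 10],
#         "epochs": [20, 40], "min_count": [2]}
# best_params, best_score, results = grid_search(grid, scorer)
```

Because Doc2Vec training is randomized, scores for the same parameters can differ slightly between runs, so small differences between combinations may just be noise.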

gojomo