I am comparing techniques to find the best way to vectorize a large number of text documents and reduce their dimensionality. I have already tested Bag of Words and TF-IDF, and reduced the dimensions with PCA, SVD, and NMF. With these approaches I can reduce my data and choose the number of dimensions based on the variance explained.
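For reference, this is roughly what I mean by choosing the number of dimensions from the variance explained. It is only a minimal NumPy sketch, not my real pipeline: the matrix `X` is a synthetic stand-in for a document-term matrix, and `components_for_variance` and the 0.95 threshold are names and values made up for the illustration.

```python
import numpy as np


def components_for_variance(X, threshold=0.95):
    """Return the smallest number of principal components whose cumulative
    explained-variance ratio reaches `threshold`, plus the cumulative curve."""
    Xc = X - X.mean(axis=0)                   # centre each feature, as PCA does
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    ratio = s ** 2 / np.sum(s ** 2)           # explained-variance ratio per component
    cumulative = np.cumsum(ratio)
    # index of the first component at which the cumulative ratio reaches the threshold
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(s)), cumulative


# Synthetic stand-in for a document-term matrix (assumption for illustration):
# 200 "documents", 50 "terms", generated from 5 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
loadings = rng.normal(size=(5, 50))
X = latent @ loadings + 0.01 * rng.normal(size=(200, 50))

k, cumulative = components_for_variance(X, threshold=0.95)
print("components kept:", k)
print("cumulative variance at k:", cumulative[k - 1])
```

With PCA or SVD this curve is a by-product of the decomposition, which is exactly what I do not get from doc2vec.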
However, I want to do the same with doc2vec. Since doc2vec itself already acts as a dimensionality reducer, what is the best approach for choosing the number of dimensions for my model? Is there a statistical measure that helps me find the best value for vector_size?
Thanks in advance!