I am working with the Doc2Vec and Word2Vec deep learning algorithms (Doc2Vec API description from Gensim).
Currently I am interested in the model.n_similarity(wordSet1, wordSet2)
method, which computes the cosine similarity between two sets of words.
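
To make clear what I mean by that, here is a minimal sketch of the computation as I understand it: each word set is averaged into a single vector, and the cosine similarity of the two averages is returned. The `toy_vectors` dictionary and the helper names (`mean_vector`, `cosine`, `set_similarity`) are made up for illustration; they are not Gensim's code, and a trained model would supply the real vectors.

```python
import math

# Hypothetical 2-dimensional embeddings, invented for illustration only.
toy_vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.8, 0.2],
    "car": [0.0, 1.0],
    "bus": [0.1, 0.9],
}


def mean_vector(words, vectors):
    """Average the vectors of all words in the set, component by component."""
    dims = len(vectors[words[0]])
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dims)]


def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def set_similarity(word_set_1, word_set_2, vectors):
    """Cosine similarity between the mean vectors of two word sets."""
    return cosine(mean_vector(word_set_1, vectors),
                  mean_vector(word_set_2, vectors))


if __name__ == "__main__":
    print(set_similarity(["cat", "dog"], ["car", "bus"], toy_vectors))
    print(set_similarity(["cat"], ["dog"], toy_vectors))
```

With these toy vectors the animal set and the vehicle set score much lower than "cat" against "dog", which is the kind of behaviour I would like to be able to verify on a real model.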
I am interested in any way of validating the model's performance, not just for the n_similarity()
function, but overall: how accurate or realistic are the results the model provides? Since it performs deep learning, I do not know whether there is any way of knowing how well it performs.
Are there any techniques I should look up and use, or is there a dataset with known results that I could compare against?
Any suggestions are much appreciated. Thank you.