From the Levy and Goldberg paper: if you are finding analogies (or combining/comparing more than two word vectors), the first method (3CosAdd, eq. 3 of the paper) is more susceptible to being dominated by a single comparison than the second method (3CosMul, eq. 4 of the paper).
For plain semantic similarity between two word vectors, this distinction doesn't apply.
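For reference, writing the analogy as "a is to a* as b is to b*" (b* is the unknown), the two objectives from the paper are:

3CosAdd (eq. 3): argmax over b* of cos(b*, b) - cos(b*, a) + cos(b*, a*)
3CosMul (eq. 4): argmax over b* of cos(b*, b) * cos(b*, a*) / (cos(b*, a) + eps), with a small eps (0.001 in the paper) to avoid division by zero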
For example, using the Google News vectors:
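(To reproduce this, a minimal setup sketch, assuming gensim is installed and the pretrained file has been downloaded; the path below is illustrative:)

from gensim.models import KeyedVectors

# load the pretrained Google News vectors (binary word2vec format)
model = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)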
>>> model.similarity('Mosul', 'England')
0.10051745730111421
>>> model.similarity('Iraq', 'England')
0.14772211471143404
>>> model.similarity('Mosul', 'Baghdad')
0.83855779792754492
>>> model.similarity('Iraq', 'Baghdad')
0.67975755642668911
Iraq is closer to England than Mosul is (both being countries); however, those similarity values are small, around 0.1. On the other hand, Mosul is more similar to Baghdad than Iraq is (geographical/cultural aspects), with similarity values on a much larger scale, around 0.7-0.8.
Now, for the analogy England - London + Baghdad = X:
3CosAdd, being a linear sum, allows one large similarity term to dominate the expression. It ignores that each term reflects a different aspect of similarity, and that the different aspects have different scales. 3CosMul, on the other hand, amplifies the differences between small quantities and reduces the differences between larger ones.
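To make that concrete, here is a rough sketch that combines the same raw cosines by hand (assuming the model loaded above; the shifted cosine (1 + cos) / 2 and the small eps follow the 3CosMul formulation, so the absolute numbers won't match gensim's most_similar scores exactly, only the ranking is comparable):

def cos_add(model, candidate, positive, negative):
    # 3CosAdd: plain sum/difference of cosines; one big term can swamp the rest
    return (sum(model.similarity(candidate, w) for w in positive)
            - sum(model.similarity(candidate, w) for w in negative))

def cos_mul(model, candidate, positive, negative, eps=0.001):
    # 3CosMul: product/quotient of cosines shifted into [0, 1];
    # multiplying balances the terms instead of letting one dominate
    num = 1.0
    for w in positive:
        num *= (1 + model.similarity(candidate, w)) / 2
    den = 1.0
    for w in negative:
        den *= (1 + model.similarity(candidate, w)) / 2
    return num / (den + eps)

for cand in ['Mosul', 'Iraq']:
    print(cand,
          cos_add(model, cand, ['Baghdad', 'England'], ['London']),
          cos_mul(model, cand, ['Baghdad', 'England'], ['London']))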
>>> model.most_similar(positive=['Baghdad', 'England'], negative=['London'])
(u'Mosul', 0.5630180835723877)
(u'Iraq', 0.5184929370880127)
>>> model.most_similar_cosmul(positive=['Baghdad', 'England'], negative=['London'])
(u'Mosul', 0.8537653088569641)
(u'Iraq', 0.8507866263389587)
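Note that the two methods' scores are not on a common scale (most_similar reports a cosine against the combined query vector, while most_similar_cosmul reports a product/quotient of shifted cosines), so compare the gaps within each method: 3CosAdd separates Mosul and Iraq by about 0.045, while 3CosMul, having damped the dominant Baghdad term, leaves them nearly tied (about 0.003).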