
I am currently working on word2vec model using gensim in Python, and want to write a function that can help me find the antonyms and synonyms of a given word. For example: antonym("sad")="happy" synonym("upset")="enraged"

Is there a way to do that in word2vec?

Salamander

2 Answers


In word2vec you can find analogies in the following way:

```python
from gensim.models import KeyedVectors

# In recent gensim releases, pretrained vectors are loaded via KeyedVectors
# (Word2Vec.load_word2vec_format was removed in gensim 1.0).
model = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)

model.most_similar(positive=['good', 'sad'], negative=['bad'])
[(u'wonderful', 0.6414928436279297),
 (u'happy', 0.6154338121414185),
 (u'great', 0.5803680419921875),
 (u'nice', 0.5683973431587219),
 (u'saddening', 0.5588893294334412),
 (u'bittersweet', 0.5544661283493042),
 (u'glad', 0.5512036681175232),
 (u'fantastic', 0.5471092462539673),
 (u'proud', 0.530515193939209),
 (u'saddened', 0.5293528437614441)]
```

Now, using several standard antonym pairs such as (good, bad) and (rich, poor), generate multiple such lists of candidate antonyms. You can then average the vectors of these candidates and pick the vocabulary word closest to that average.

kampta
  • Why is averaging of vectors required here? – Salamander Aug 12 '15 at 14:58
  • 3
    For example, your seed set of antonyms is `ss = [('rich','poor'), ('good', 'bad')]`. Now, to find an antonym of `sad`, you may do something like `antonym_candidates = [model.most_similar(positive=[ss[0][0], 'sad'], negative=[ss[0][1]]), model.most_similar(positive=[ss[1][0], 'sad'], negative=[ss[1][1]])]`. Then, to choose the best antonym, you can either (1) take the closest or most frequent word, or (2) take the (weighted) average of the vectors of all candidates and find the vocabulary word closest to this vector – kampta Aug 12 '15 at 15:30
  • So, in that case, I have to create a set of known antonyms. Am I right? – Salamander Aug 13 '15 at 18:04
  • Yes, because just one known pair of antonyms is very likely to give a biased answer – kampta Aug 13 '15 at 18:18
  • 2
    -1; this cannot possibly work, because the "is an antonym" relationship is *symmetrical*; any function that maps a word to its antonym must logically be its own inverse. Clearly, there cannot be a single vector that you can add to any word to get its antonym because the operation of adding a vector is not its own inverse. – Mark Amery Apr 07 '18 at 14:48
  • Thanks for pointing that out @MarkAmery. I agree it's a hacky solution, and the example I've given might have worked coincidentally, since `good` and `happy` carry similar sentiment, as do `bad` and `sad` – kampta Apr 08 '18 at 14:19
  • Hi, I like the answer as a hacky solution. However, in case the word in question is positive (say, "happy", instead of "sad"), one would have to reverse the positions of "good" and "bad". But that would require pre-knowledge of the word being positive/negative. I tried another hacky solution to it, namely checking with both (good, bad) and (bad, good) combinations and whichever combination gives the higher number of matches with model.most_similar(word) is considered to be the sentiment attached to the word. Any suggestions for this? @kampta – prateek1592 Nov 05 '18 at 18:27
  • @MarkAmery Since a word can have multiple meanings, it can also have multiple antonyms. Hence, despite the relation being symmetrical, a function mapping any word to one of its antonyms is not necessarily its own inverse... – Rolvernew Nov 21 '18 at 11:13
  • @Rolvernew Hmm. That's an interesting and valid nitpick. However, I think the meat of my objection - that addition of a fixed vector cannot possibly transform an arbitrary word to one of its antonyms - still survives. For one thing, some words don't have multiple antonyms. For another, even in the case of words with multiple antonyms, the best possible case for the "add a vector" transformation is that, starting from a given word, there is a straight line along which antonyms and synonyms lie, alternating, at a regular interval, and even then, eventually that chain of words must *end*. – Mark Amery Nov 21 '18 at 11:20

I think it is possible to obtain an antonym using king − man + woman = queen style analogies. Here queen (the antonym of king and a synonym of woman) is the result returned by a trained word2vec model. Say there is a word X with a synonym Y, and an antonym Z of Y. Then we can say X − Y + Z = an antonym of X and a synonym of Z.

Mulat
  • Use `model.most_similar(positive=['king', 'woman'], negative=['men'])`. This may help you expand the antonym list initially extracted by MT (machine translation) from PWN (the English WordNet). Remember that woman is an antonym of man, extracted by MT from PWN, and king is a synonym of man, also extracted from PWN. The line above returns queen, which is an antonym of king and a synonym of woman. Note: all words extracted from PWN should exist in your trained word2vec model. – Mulat Sep 05 '19 at 08:08