Given a set of words V, I would like to group the words in V that are synonyms of each other. Is there a built-in function in NLTK or WordNet that takes V as input and automatically clusters its words by synonymy?
I already know how to extract the synonyms of each word, but that is not what I am looking for. With that approach, things get complicated when the synonym sets intersect, or are subsets/supersets of one another, so I would have to write a function that resolves the conflicts.
As an example, let's consider
V = ["good","constipate","bad","nice","defective","right","respectable","powerful"]
What I want to get as output is:
[('constipate'), ('nice'), ('bad', 'defective'), ('good', 'powerful', 'respectable', 'right')]
Depending on the desired size or number of clusters, some sets might be split into several sets or merged together. I only care about the words in V and those of their synonyms that are also in V.
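To make the conflict-resolution step concrete, here is a minimal sketch of the kind of function I would otherwise have to write myself. It merges overlapping synonym sets by treating "is a synonym of" as an edge and taking connected components with a small union-find. The names `group_synonyms`, `wordnet_synonyms`, and `TOY_SYNONYMS` are made up for illustration, and the toy synonym table is hand-written so the example reproduces the output above; it is not what WordNet actually returns. The `wordnet_synonyms` helper assumes NLTK and its WordNet corpus are installed and uses only `wordnet.synsets()` and `Synset.lemma_names()`.

```python
from collections import defaultdict


def wordnet_synonyms(word):
    """Lemma names of every WordNet synset containing `word`.

    Assumes NLTK and the 'wordnet' corpus are installed; imported lazily
    so the rest of this sketch runs without them.
    """
    from nltk.corpus import wordnet as wn
    return {name.lower() for synset in wn.synsets(word)
            for name in synset.lemma_names()}


def group_synonyms(words, synonyms_of):
    """Group `words` so that words linked by a synonym relation share a group.

    `synonyms_of` is any callable mapping a word to a set of synonyms.
    Only synonyms that are themselves in `words` are considered.
    """
    parent = {w: w for w in words}

    def find(w):
        # Follow parent links to the group representative (with path halving).
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    vocab = set(words)
    for w in words:
        for s in synonyms_of(w) & vocab:
            root_w, root_s = find(w), find(s)
            if root_w != root_s:
                parent[root_s] = root_w  # merge the two groups

    groups = defaultdict(list)
    for w in words:
        groups[find(w)].append(w)
    # Smallest groups first, alphabetical within and between equal-sized groups.
    return sorted((tuple(sorted(g)) for g in groups.values()),
                  key=lambda g: (len(g), g))


V = ["good", "constipate", "bad", "nice", "defective",
     "right", "respectable", "powerful"]

# Hand-written stand-in for a real synonym lookup (an assumption, not WordNet data).
TOY_SYNONYMS = {
    "good": {"right", "respectable", "powerful"},
    "bad": {"defective"},
}

print(group_synonyms(V, lambda w: TOY_SYNONYMS.get(w, set())))
```

Passing `wordnet_synonyms` as the second argument would use WordNet instead of the toy table. Note that this merging is transitive, so one highly polysemous word can chain otherwise unrelated words into a single large cluster, which is exactly where I would want control over the size or number of clusters.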