I'm confused about the similarity value of the Li measure. I read in a paper that the values are in (0,1). Does the value vary continuously between 0 and 1, or does it take only two values, either 0 or 1?
Since that's a similarity value, I'm almost sure that (0, 1) is a value range. A binary similarity measure would be a really poor identifier! However, providing a link to that article would be useful (we don't know the publication's title either...) – Andrea Feb 04 '14 at 08:44
1 Answer
Do you mean the Lin Similarity? It ranges between 0 and 1 and is computed as follows (citing the NLTK documentation):
Lin Similarity: Return a score denoting how similar two word senses are, based on the Information Content (IC) of the Least Common Subsumer (most specific ancestor node) and that of the two input Synsets. The relationship is given by the equation 2 * IC(lcs) / (IC(s1) + IC(s2)).
>>> dog.lin_similarity(cat, semcor_ic)
0.88632886280862277
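To see why the score is a continuous value rather than a binary one, here is a minimal sketch of the quoted formula in plain Python. It does not use NLTK; the probabilities and the names `information_content`, `ic_dog`, `ic_cat`, `ic_carnivore` and `ic_root` are made up for illustration, and IC is assumed to be the negative log of a concept's corpus probability. Because the LCS is an ancestor of both senses, its IC cannot exceed either of theirs, which keeps the ratio between 0 and 1.

```python
import math


def information_content(probability):
    # Assumed definition: IC = -log(p), written as log(1/p).
    return math.log(1.0 / probability)


def lin_similarity(ic_s1, ic_s2, ic_lcs):
    # Formula from the quoted documentation: 2 * IC(lcs) / (IC(s1) + IC(s2)).
    denominator = ic_s1 + ic_s2
    if denominator == 0:
        # Both senses carry no information; treated as 0 here (an assumption).
        return 0.0
    return 2.0 * ic_lcs / denominator


# Hypothetical corpus probabilities, not taken from any real IC file.
ic_dog = information_content(0.001)
ic_cat = information_content(0.002)
ic_carnivore = information_content(0.01)  # a shared, more general ancestor
ic_root = information_content(1.0)        # the root covers everything, so IC = 0

related = lin_similarity(ic_dog, ic_cat, ic_carnivore)  # strictly between 0 and 1
unrelated = lin_similarity(ic_dog, ic_cat, ic_root)     # only the root in common
identical = lin_similarity(ic_dog, ic_dog, ic_dog)      # a sense compared to itself

print(related, unrelated, identical)
```

In this sketch the endpoints are reached only in the extreme cases (nothing shared but the root, or a sense compared with itself); everything else falls in between.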
You can find a comparison between different word similarity measurements in this paper. It explains the Lin Similarity as follows:
The lin and jcn measures augment the information content of the LCS with the sum of the information content of concepts A and B themselves. The lin measure scales the information content of the LCS by this sum, while jcn takes the difference of this sum and the information content of the LCS.
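The contrast between the two measures can be sketched in a few lines. This assumes the commonly used formulation in which the jcn distance is IC(A) + IC(B) - 2 * IC(LCS) and the jcn similarity is its reciprocal; the IC values below are hypothetical, and returning infinity for a zero distance is a choice made for this sketch only.

```python
def lin_similarity(ic_a, ic_b, ic_lcs):
    # Ratio form: the LCS information content scaled by the sum.
    return 2.0 * ic_lcs / (ic_a + ic_b)


def jcn_distance(ic_a, ic_b, ic_lcs):
    # Difference form (assumed formulation): sum minus twice the LCS IC.
    return ic_a + ic_b - 2.0 * ic_lcs


def jcn_similarity(ic_a, ic_b, ic_lcs):
    # Reciprocal of the distance; infinity for identical concepts is a
    # choice made for this sketch, not necessarily what a library does.
    distance = jcn_distance(ic_a, ic_b, ic_lcs)
    return float("inf") if distance == 0 else 1.0 / distance


# Hypothetical IC values; the LCS never exceeds either concept's IC.
ic_a, ic_b = 7.0, 6.5

lin_far = lin_similarity(ic_a, ic_b, 4.5)
lin_near = lin_similarity(ic_a, ic_b, 6.4)
jcn_far = jcn_similarity(ic_a, ic_b, 4.5)
jcn_near = jcn_similarity(ic_a, ic_b, 6.4)

print(lin_far, lin_near)
print(jcn_far, jcn_near)
```

Under these assumptions the lin score stays within 0 and 1 because it is a ratio, while the jcn similarity has no upper bound and grows as the two concepts get closer to their LCS.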
