7

I need to use WordNet in a Java-based app. I want to:

  • search synsets

  • find similarity/relatedness between synsets

My app uses RDF graphs, and I know there are SPARQL endpoints that expose WordNet, but I think it's better to keep a local copy of the dataset, since it's not too big.

I've found the following jars:

What would you recommend for my app?

Is it possible to use a Perl library from a Java app via some bindings?

Thanks! Mulone

Mulone

3 Answers

12

I use JAWS for ordinary WordNet lookups because it's easy to use. For similarity metrics, though, I use the library located here. For it to work, you'll also need to download this folder, which contains pre-processed WordNet and corpus data. Assuming you placed that folder inside a folder called "lib" in your project folder, the code can be used like this:

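import java.util.Map.Entry;
import java.util.TreeMap;

// word1, word2 and partOfSpeech are Strings you supply; the JWS classes must also be imported.
JWS ws = new JWS("./lib", "3.0"); // folder holding the downloaded data, and the WordNet version
Resnik res = ws.getResnik();
// One score for each combination of senses of the two words
TreeMap<String, Double> scores1 = res.res(word1, word2, partOfSpeech);
for (Entry<String, Double> e : scores1.entrySet())
    System.out.println(e.getKey() + "\t" + e.getValue());
System.out.println("\nhighest score\t=\t" + res.max(word1, word2, partOfSpeech) + "\n\n\n");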

This will print something like the following, showing the similarity score between each possible combination of synsets represented by the words to be compared:

hobby#n#1,gardening#n#1 2.6043996588901104
hobby#n#2,gardening#n#1 -0.0
hobby#n#3,gardening#n#1 -0.0
highest score   =   2.6043996588901104
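To make the link between the per-sense map and the "highest score" line concrete, here is a minimal sketch that uses only the standard library. The class name BestSensePair and the method best are hypothetical helpers, not part of JWS; the scores are copied from the sample output above, and the key format (two senses separated by a comma) is taken from that output as well.

```java
import java.util.Map;
import java.util.TreeMap;

public class BestSensePair {
    /** Returns the entry with the largest score, or null for an empty map. */
    public static Map.Entry<String, Double> best(Map<String, Double> scores) {
        Map.Entry<String, Double> top = null;
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            if (top == null || e.getValue() > top.getValue()) {
                top = e;
            }
        }
        return top;
    }

    public static void main(String[] args) {
        // Scores copied from the sample output; keys follow its "sense,sense" format.
        Map<String, Double> scores = new TreeMap<>();
        scores.put("hobby#n#1,gardening#n#1", 2.6043996588901104);
        scores.put("hobby#n#2,gardening#n#1", -0.0);
        scores.put("hobby#n#3,gardening#n#1", -0.0);

        Map.Entry<String, Double> top = best(scores);
        // Split the key back into the two senses it names.
        String[] senses = top.getKey().split(",");
        System.out.println(senses[0] + " / " + senses[1] + " -> " + top.getValue());
    }
}
```

Keeping the whole entry rather than only the maximum value tells you which pair of senses produced the best score, which is useful if you want to work with specific synsets afterwards.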

There are also methods that let you specify the sense number of either or both words: res(String word1, int senseNum1, String word2, partOfSpeech), etc. Unfortunately, the documentation in the source is not JavaDoc, so you'll need to read through the source yourself. The source can be downloaded here.

The available algorithms are:

JWSRandom random = new JWSRandom(ws.getDictionary(), true, 16.0); // random number for baseline
Resnik res = ws.getResnik();
LeacockAndChodorow lch = ws.getLeacockAndChodorow();
AdaptedLesk adLesk = ws.getAdaptedLesk();
AdaptedLeskTanimoto alt = ws.getAdaptedLeskTanimoto();
AdaptedLeskTanimotoNoHyponyms altnh = ws.getAdaptedLeskTanimotoNoHyponyms();
HirstAndStOnge hso = ws.getHirstAndStOnge();
JiangAndConrath jcn = ws.getJiangAndConrath();
Lin lin = ws.getLin();
WuAndPalmer wup = ws.getWuAndPalmer();
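To give a feel for what an information-content measure such as Resnik computes, here is a toy sketch that does not use JWS at all. Resnik similarity is the information content, -log p(c), of the most specific concept c that subsumes both inputs. The taxonomy and probabilities below are invented for illustration, and each concept has a single parent, which is simpler than WordNet. One possible reading of the -0.0 rows in the sample output is that the only shared ancestor is the root, whose probability is 1; that is an assumption about JWS, not something confirmed here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ResnikToy {
    // Hypothetical taxonomy: child -> parent. "entity" is the root.
    static final Map<String, String> PARENT = new HashMap<>();
    // Hypothetical probability of meeting each concept (or a descendant) in a corpus.
    static final Map<String, Double> PROB = new HashMap<>();

    static {
        PARENT.put("activity", "entity");
        PARENT.put("object", "entity");
        PARENT.put("hobby", "activity");
        PARENT.put("gardening", "activity");
        PARENT.put("spade", "object");
        PROB.put("entity", 1.0);
        PROB.put("activity", 0.25);
        PROB.put("object", 0.5);
        PROB.put("hobby", 0.05);
        PROB.put("gardening", 0.01);
        PROB.put("spade", 0.02);
    }

    /** Information content: -log p(concept). */
    public static double ic(String concept) {
        return -Math.log(PROB.get(concept));
    }

    /** Most specific ancestor shared by a and b (a concept counts as its own ancestor). */
    public static String lowestCommonSubsumer(String a, String b) {
        Set<String> ancestors = new HashSet<>();
        for (String c = a; c != null; c = PARENT.get(c)) ancestors.add(c);
        for (String c = b; c != null; c = PARENT.get(c)) {
            if (ancestors.contains(c)) return c;
        }
        return null;
    }

    /** Resnik similarity on the toy taxonomy. */
    public static double resnik(String a, String b) {
        return ic(lowestCommonSubsumer(a, b));
    }

    public static void main(String[] args) {
        System.out.println("hobby, gardening: " + resnik("hobby", "gardening"));
        // The shared ancestor is the root, so this is -Math.log(1.0), printed by Java as -0.0.
        System.out.println("hobby, spade: " + resnik("hobby", "spade"));
    }
}
```

In real WordNet a synset can have several hypernyms, so a full implementation would consider every common subsumer and take the one with the highest information content.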

It also requires the jar file for MIT's JWI.

Nate Glenn
  • One item of note. I would get beta 11.01 instead of 11.02 if you get the package from http://www.cogs.susx.ac.uk/users/drh21/. – mj_ Aug 06 '11 at 01:17
  • @mj_ : why 11.01 and not 11.02 ? – damned Feb 12 '13 at 12:55
  • Does the code above give the similarity between the different synsets? – Noor Mar 07 '13 at 20:07
  • @Noor I edited the answer to give you the needed information. – Nate Glenn Mar 07 '13 at 21:24
  • This is an old thread, but I am getting a build error in Eclipse saying the JWI library has conflicting Scala versions. Please advise. – kavita Oct 10 '16 at 05:19
  • Sorry, no idea. I've never used Scala before, and this thread is 5 years old so I don't think I would remember anyway :/ – Nate Glenn Oct 11 '16 at 01:05
  • @NateGlenn Thank you for your answer. How can I install and use this API? Could you give the installation steps and list all the required files? I'm using NetBeans on Windows 10. – F 505 Feb 13 '18 at 10:38
  • @F505 Sorry, this was 7 years ago and I don't remember much about it. Please open another question to ask about it. – Nate Glenn Feb 14 '18 at 14:23
1

JAWS has a method for finding similar synsets. Here are the details:

public AdjectiveSynset[] getSimilar() throws WordNetException

The documentation for it is here: http://lyle.smu.edu/~tspell/jaws/doc/edu/smu/tspell/wordnet/AdjectiveSynset.html

0

I am not sure if either JAWS or JWNL provide methods to calculate similarity between synsets, but I have tried both for searching synsets and I've found JAWS easier to use. Specifically, the simple:

    // Specifying the Database Directory
    System.setProperty("wordnet.database.dir", "C:/WordNet/2.1/dict/");

was easier for me to understand than JWNL's file_properties.xml requirement.
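Since the whole configuration is a single system property, it is easy to guard it against a wrong path. The following is a minimal sketch using only the standard library; the class WordNetDirConfig and the method useDictionary are hypothetical names, and only the property key wordnet.database.dir comes from the snippet above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WordNetDirConfig {
    /** Sets wordnet.database.dir, but only if the directory actually exists. */
    public static void useDictionary(String dir) {
        Path path = Paths.get(dir);
        if (!Files.isDirectory(path)) {
            throw new IllegalArgumentException(
                    "WordNet dict directory not found: " + path.toAbsolutePath());
        }
        System.setProperty("wordnet.database.dir", path.toString());
    }

    public static void main(String[] args) {
        // Default path taken from the snippet above; pass another one as the first argument.
        String dir = args.length > 0 ? args[0] : "C:/WordNet/2.1/dict/";
        try {
            useDictionary(dir);
            System.out.println("Using " + System.getProperty("wordnet.database.dir"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing early with the absolute path in the message makes a misplaced dict folder easier to spot than an error raised later during a lookup.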

MrDrews