In a recent project, I am faced with the task of converting plural nouns into their singular forms. I know some POS tagging algorithms and tools that can recognize plural nouns and tag them as 'NNS', but I don't know of any algorithm that can convert them into singular forms. I have tried stemming, but stemming is too aggressive for this. It gives something like this:

parties -> parti

But what I want is:

fish -> fish
classes -> class
parties -> party
goods -> goods
cups -> cup

This seems to be a difficult problem without a huge dictionary containing every English word. Is there a mature algorithm that can do this? I would also be happy to learn of any library that does this, especially one in Java. Thanks.

Yuhao
  • Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it. – Jim Garrison May 30 '14 at 04:10
  • I have rephrased the question. – Yuhao May 30 '14 at 04:16

1 Answer

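What you want is a lemmatizer rather than a stemmer. A stemmer only chops off suffixes, which is why you get "parti"; a lemmatizer maps each word to its dictionary form ("party"). There are multiple implementations in Java. I find Stanford CoreNLP the easiest to use from the command line. Morpha is also fairly popular.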
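To make the difference from stemming concrete, here is a minimal sketch of the general idea behind a noun lemmatizer: check exception lists first, then fall back to a few suffix rules. This is not the code of CoreNLP or Morpha. The class name `NounLemmatizerSketch`, the method `singularize`, and the tiny word lists are all hypothetical and only for illustration. It assumes the input has already been tagged as a plural noun ('NNS') by your POS tagger.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustrative only: a toy exception-list-plus-rules singularizer.
// The word lists below are hypothetical and far too small for real use.
class NounLemmatizerSketch {

    // Nouns whose plural and singular forms are identical (or that have no singular).
    private static final Set<String> INVARIANT =
            Set.of("fish", "sheep", "goods", "series", "species");

    // Irregular plurals that no suffix rule can handle.
    private static final Map<String, String> IRREGULAR =
            Map.of("children", "child", "men", "man", "mice", "mouse", "feet", "foot");

    // Assumes the caller only passes words tagged as plural nouns (NNS).
    static String singularize(String noun) {
        String w = noun.toLowerCase(Locale.ROOT);

        // 1. Exceptions take priority over every rule.
        if (INVARIANT.contains(w)) {
            return w;
        }
        String irregular = IRREGULAR.get(w);
        if (irregular != null) {
            return irregular;
        }

        // 2. "parties" -> "party" (length check keeps "ties" -> "tie" out of this rule).
        if (w.length() > 4 && w.endsWith("ies")) {
            return w.substring(0, w.length() - 3) + "y";
        }

        // 3. "classes" -> "class", "boxes" -> "box", "churches" -> "church".
        if (w.endsWith("sses") || w.endsWith("shes") || w.endsWith("ches") || w.endsWith("xes")) {
            return w.substring(0, w.length() - 2);
        }

        // 4. Plain "-s" plural: "cups" -> "cup". Leave "-ss", "-us", "-is" words alone.
        if (w.endsWith("s") && !w.endsWith("ss") && !w.endsWith("us") && !w.endsWith("is")) {
            return w.substring(0, w.length() - 1);
        }

        // 5. Nothing matched: return the word unchanged.
        return w;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"fish", "classes", "parties", "goods", "cups"}) {
            System.out.println(word + " -> " + singularize(word));
        }
    }
}
```

The rules alone still misfire on plenty of words (for example, rule 2 would turn "movies" into "movy"), which is exactly why real lemmatizers back their rules with large exception lists or a full lexicon. For anything beyond a toy, use one of the libraries above instead of growing this by hand.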

PS: Your question is a duplicate. I'm answering because finding an answer to it through Google is surprisingly hard.

mbatchkarov