4

Do you know of a lemmatizer database large enough to return the correct results for the following sample words?

geese: goose
plantes: //not found

WordNet's morphological analyzer is not sufficient, since it gives the following incorrect results:

geese: //not found
plantes: plant
tchrist
Ali Shakiba

2 Answers

2

MorphAdorner seems to be better at this, but it still returns an incorrect result for "plantes":

plantes: plante
geese: goose

Maybe you'd like to use MorphAdorner to do the lemmatization, and then check its results against WordNet. You can use the WordNet API to perform lookups without first performing lemmatization by calling findtheinfo_ds. This allows you to use a lemmatizer like MorphAdorner first. (If you wanted to use WordNet's own lemmatizer instead, you'd need to call morph separately and then call findtheinfo_ds on the lemmas that it returned.)

On the other hand, I only spent about 5 seconds looking at MorphAdorner for this purpose, and there may be a way to eliminate the incorrect "plantes" answer without having to use any other outside resource.
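The "lemmatize first, then check against WordNet" idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `checked_lemma` is a hypothetical helper, the dictionary stands in for MorphAdorner's output on the two sample words, and the small set stands in for a real WordNet lookup (such as a wrapper around findtheinfo_ds). None of these names come from either tool.

```python
from typing import Callable, Optional


def checked_lemma(
    word: str,
    lemmatize: Callable[[str], Optional[str]],
    in_lexicon: Callable[[str], bool],
) -> Optional[str]:
    """Return the candidate lemma only if the lexicon confirms it exists."""
    candidate = lemmatize(word)
    if candidate is not None and in_lexicon(candidate):
        return candidate
    # Either no lemma was proposed or the lexicon rejected it: "not found".
    return None


# Assumption: stand-in for MorphAdorner output on the sample words.
morphadorner_output = {"geese": "goose", "plantes": "plante"}

# Assumption: toy stand-in for a WordNet lookup; a real check would query WordNet.
toy_lexicon = {"goose", "plant"}

for sample in ("geese", "plantes"):
    lemma = checked_lemma(sample, morphadorner_output.get, toy_lexicon.__contains__)
    print(f"{sample}: {lemma if lemma is not None else '//not found'}")
```

With these stand-ins, "geese" is accepted as "goose", while "plantes" is rejected because "plante" is not in the lexicon.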

Ken Bloom
  • Thanks, after unchecking "Standardize spelling" it returns `plante`, which can be checked against WordNet to find out it's not correct (I'm using the WordNet files directly). But do you have any idea what the "Early Modern English/Nineteenth Century Fiction" option is? Is its corpus recent enough to be used for modern English? – Ali Shakiba Jun 10 '11 at 16:10
  • @Johns: That's what I was hinting at when I said I only spent about 5 seconds looking. Try looking through their documentation to see if it says anything. – Ken Bloom Jun 10 '11 at 16:13
1

Once upon a time, someone suggested Morpha to me, but I haven't used it, so I don't know if it does any better at this than WordNet does.

Ken Bloom
  • It doesn't have an online demo, but from reading the descriptions it seems that it is not based on a database. I will try it if I don't run short of time. – Ali Shakiba Jun 10 '11 at 16:19