Getting the stemmed word using the hunspell module

Asked Oct 28 '18 at 09:21

Active Oct 28 '18 at 20:31

Viewed 415 times

I am using two modules for NLP: nltk and hunspell. The reason for using hunspell is that I have suffix and affix rules that need to be followed.

from nltk.stem.porter import PorterStemmer
stemmer = PorterStemmer()
stemmer.stem('ladies')

ladi

from nltk.stem import WordNetLemmatizer
lemmatizer = WordNetLemmatizer()
lemmatizer.lemmatize('ladies')

lady

The nltk module works as expected, as shown above. But the hunspell module seems to support only lemmatization, and I cannot find a way to return the stemmed form.

import hunspell
hobj = hunspell.HunSpell('en_US.dic', 'en_US.aff')
hobj.stem('ladies')

This returns "lady", not "ladi" as I expected. Is there any way to return the stemmed form of a word using the hunspell module?

edited Oct 28 '18 at 20:31

David Batista

asked Oct 28 '18 at 09:21

shantanuo

2

Because `stemmer != lemmatizer` and `(stemmer | lemmatizer) != spellchecker`. It's an XY sort of problem to conflate a stemmer (Porter), a lemmatizer (WordNet morphy) and a spellchecker (Hunspell) ;P – alvas Oct 28 '18 at 14:59
@alvas, in my opinion, with well-written "dictionaries", hunspell is far more than a spellchecker (it is a lexical analyzer, including lemmatization). – JJoao Jun 04 '19 at 10:35
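
To make the distinction raised in the comments concrete, here is a minimal toy sketch of the two ideas: a stemmer strips suffixes by rule and may return a non-word, while a lemmatizer looks up a dictionary form. The names `SUFFIX_RULES`, `LEMMA_TABLE`, `toy_stem` and `toy_lemmatize` are hypothetical and exist only for this illustration. The suffix rules are loosely modeled on the first step of the Porter algorithm, and nothing here reproduces what nltk or hunspell actually do internally.

```python
# Toy illustration only: this is NOT the Porter stemmer, WordNet, or Hunspell.

# Hypothetical suffix rules (suffix, replacement), tried in order.
# No dictionary is consulted, so the output need not be a real word.
SUFFIX_RULES = [("sses", "ss"), ("ies", "i"), ("ss", "ss"), ("s", "")]

# Hypothetical mini-dictionary mapping inflected forms to their lemma.
LEMMA_TABLE = {"ladies": "lady", "mice": "mouse"}


def toy_stem(word):
    """Apply the first matching suffix rule and return the truncated form."""
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word


def toy_lemmatize(word):
    """Return the dictionary form if known, otherwise the word unchanged."""
    return LEMMA_TABLE.get(word, word)


print(toy_stem("ladies"))       # ladi
print(toy_lemmatize("ladies"))  # lady
```

Under these assumptions, the rule-based function yields "ladi" and the lookup-based one yields "lady", which mirrors the difference between the PorterStemmer output and the hunspell output in the question.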

0 Answers