
I have a question about the Porter stemming algorithm. I searched online, but I couldn't find a clear explanation of the difference between understemming and overstemming.

Also, does the Porter algorithm tend to understem or overstem?

Does anyone have an idea?

Thanks in advance

aldimeola1122

1 Answer


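Overstemming happens when the stemmer cuts off too much of the word. Unrelated words then end up with the same stem and match spuriously, for example when both "university" and "universe" are reduced to "univers".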

Understemming is the opposite: too little is cut off, so related words that should share a stem are kept apart. For example, a stemmer that doesn't cut off anything inherently understems.
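
To make the two error types concrete, here is a minimal sketch using a toy suffix stripper. It is not the Porter algorithm: the suffix lists and the helper names `strip_suffix` and `same_stem` are made up purely for illustration.

```python
def strip_suffix(word, suffixes):
    """Remove the first matching suffix, keeping a stem of at least 3 letters."""
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def same_stem(a, b, suffixes):
    """True if both words are reduced to the same stem."""
    return strip_suffix(a, suffixes) == strip_suffix(b, suffixes)


# Hypothetical suffix lists, chosen only to provoke each kind of error.
AGGRESSIVE = ("ity", "al", "e", "s")  # cuts a lot
CONSERVATIVE = ("s",)                 # cuts almost nothing

# Overstemming: unrelated words collapse to the same stem.
print(strip_suffix("university", AGGRESSIVE))             # univers
print(strip_suffix("universe", AGGRESSIVE))               # univers
print(same_stem("university", "universe", AGGRESSIVE))    # True

# Understemming: related words keep different stems.
print(same_stem("connected", "connection", CONSERVATIVE))  # False
```

The more suffixes a stemmer removes, the more it drifts toward overstemming; the fewer it removes, the more it understems. A real stemmer sits somewhere between these two toy extremes.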

I suspect the Porter stemmer makes both types of error from time to time for English. Note that implementations for other languages might behave very differently (I'm thinking of Snowball, which has user-supplied algorithms for a number of languages). They may even differ in their linguistic definition of a stem.

ales_t