I'm using ElasticSearch (via Ruby and Tire) for a search feature on an ecommerce clothing website. I need a stemming filter, but I also need to specify a list of protected words that do not get stemmed. I'm currently using the snowball filter for stemming, but I can't figure out whether it's possible to specify protected words. I've also looked at some other stemming filters:

  • Porter Stem seems to be too aggressive with its stemming, leading to weird confusions
  • KStem seems to be English-only, and this is for a multilingual project
  • Stemmer claims to be like snowball but more full-featured, but I can't find any good documentation about it

My question is this: Is there a way to achieve these goals with snowball (and if so, how?) or do I need to switch to one of the other stemming filters?

awhitworth

1 Answer

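Use the Keyword Marker token filter (keyword_marker), placed before the snowball filter in the analyzer's filter chain. Tokens it marks as keywords are left unchanged by the stemmers that follow it: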

https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-keyword-marker-tokenfilter.html
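
As a rough sketch of what that could look like, the Ruby below builds an analysis settings hash in which a keyword_marker filter runs ahead of a snowball filter. The word list and the names protected_words, snowball_stemmer, stem_with_protection, and analysis_settings are made up for illustration, and the hash still has to be passed as index settings through whatever client call you use (Tire or otherwise), which is not shown here.

```ruby
require "json"

# Hypothetical list of words that must not be stemmed.
# Lowercase, because the lowercase filter runs before keyword_marker below.
PROTECTED_WORDS = %w[jeans leggings].freeze

# Builds the "analysis" part of the index settings.
# All filter and analyzer names here are illustrative, not required names.
def analysis_settings(protected_words, language = "English")
  {
    analysis: {
      filter: {
        # Marks matching tokens as keywords so later stemmers skip them.
        protected_words: { type: "keyword_marker", keywords: protected_words },
        # Snowball stemmer; use one such filter per language you index.
        snowball_stemmer: { type: "snowball", language: language }
      },
      analyzer: {
        stem_with_protection: {
          type: "custom",
          tokenizer: "standard",
          # Order matters: keyword_marker must come before the stemmer.
          filter: %w[lowercase protected_words snowball_stemmer]
        }
      }
    }
  }
end

puts JSON.pretty_generate(analysis_settings(PROTECTED_WORDS))
```

For a multilingual project, one option is to define one analyzer per language, each with its own protected word list and snowball language, and map the language-specific fields to the matching analyzer.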