I'm working on a basic German analyzer in Elasticsearch, which is defined as follows:
{
  "settings": {
    "analysis": {
      "filter": {
        "german_stemmer": {
          "type": "snowball",
          "language": "German"
        },
        "german_stop": {
          "type": "stop",
          "stopwords": "_german_"
        }
      },
      "analyzer": {
        "german_search": {
          "filter": ["lowercase", "german_stop", "german_stemmer"],
          "tokenizer": "standard"
        }
      }
    }
  }
}
While testing it, I realized that it does not handle Kürbis and Kürbisse well. Stemming these two words produces different output, even though, from my understanding (just what I read online), Kürbis means "pumpkin" and Kürbisse means "pumpkins". It looks like the stemmer does not reduce this plural to the same stem as the singular.
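For anyone who wants to reproduce this, here is a minimal sketch of a request body for the `_analyze` API. It assumes the settings above were applied to an index, here called `my_index` (a placeholder name, not my real index), and that the body is sent to `my_index/_analyze`:

```json
{
  "analyzer": "german_search",
  "text": "Kürbis Kürbisse"
}
```

The response lists one token per input word, which makes it easy to see that the two stems differ.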
Any ideas on how I can solve this?