
I have an index as follows:

{
  "entities": {
    "mappings": {
      "properties": {
        "content": {
          "type": "text",
          "analyzer": "stop_delimiter_stemmer_analyzer"
        }
      }
    }
  }
}

And this is stop_delimiter_stemmer_analyzer (my custom analyzer):

"analysis": {
  "analyzer": {
    "stop_delimiter_stemmer_analyzer": {
      "tokenizer": "whitespace",
      "filter": [
        "word_delimiter_graph",
        "german_stemmer",
        "english_stemmer",
        "french_stemmer",
        "italian_stemmer",
        "multi_language_stopwords"
      ]
    }
  },
  "filter": {
    "german_stemmer": {
      "type": "stemmer",
      "name": "light_german"
    },
    "english_stemmer": {
      "type": "stemmer",
      "name": "english"
    },
    "french_stemmer": {
      "type": "stemmer",
      "name": "light_french"
    },
    "italian_stemmer": {
      "type": "stemmer",
      "name": "light_italian"
    },
    "multi_language_stopwords": {
      "type": "stop",
      "stopwords": [
        "_english_",
        "_french_",
        "_italian_",
        "_dutch_"
      ]
    }
  }
}

If I use a match query to search for Preuve à futur, Elasticsearch returns it as the first result.
But if I search for preuve à futur, it ranks much lower.

I need to add case-insensitive exact matching to my search so that exact matches (whether or not the case matches) appear in the first results.
How can I do that?
Thanks.

Note: I use Elasticsearch 7.16

msln

1 Answer


Add the lowercase token filter as the first item in your analyzer definition's filter list. All tokens will then be indexed in lowercase. Because the match query analyzes the search string with the same analyzer at search time, the query is also tokenized to lowercase, so you get case-insensitive matching.

"filter": [
  "lowercase",
  "word_delimiter_graph",
  "german_stemmer",
  "english_stemmer",
  "french_stemmer",
  ...
]
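
For reference, here is a sketch of what the full analysis settings might look like with this change. It assumes the other filters from the question stay as they are and that the body is sent when creating the index; the surrounding "settings" wrapper is an assumption, since the question only shows the "analysis" fragment. Token filters run in the order listed, so with lowercase first, word_delimiter_graph, the stemmers, and the stop filter all receive lowercased tokens:

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "stop_delimiter_stemmer_analyzer": {
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "word_delimiter_graph",
            "german_stemmer",
            "english_stemmer",
            "french_stemmer",
            "italian_stemmer",
            "multi_language_stopwords"
          ]
        }
      },
      "filter": {
        "german_stemmer": { "type": "stemmer", "name": "light_german" },
        "english_stemmer": { "type": "stemmer", "name": "english" },
        "french_stemmer": { "type": "stemmer", "name": "light_french" },
        "italian_stemmer": { "type": "stemmer", "name": "light_italian" },
        "multi_language_stopwords": {
          "type": "stop",
          "stopwords": ["_english_", "_french_", "_italian_", "_dutch_"]
        }
      }
    }
  }
}
```

After reindexing, you can run the _analyze API against the index with this analyzer on both Preuve à futur and preuve à futur to check that they produce the same tokens.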
Amit
  • I've tried it before. It doesn't work correctly. – msln Dec 05 '22 at 08:44
  • Where did you put lowercase in the list? Can you put it first and try again? You need to reindex your data after the change. Also, what issues did you face with it? – Amit Dec 05 '22 at 08:46
  • I only tested it as the last item of the `filter` list. Does the order matter? Let me try it as the first item in the filter list. (And yes, I know I have to reindex the data.) – msln Dec 05 '22 at 08:48