
Hi, I am trying to search for a word that contains the characters '(' and ')' in Elasticsearch, but I am not getting the expected results.

This is the query I am using:

{
  "query": {
    "query_string": {
      "default_field": "name",
      "query": "\\(Pas\\)ta"
    }
  }
}

In the results I am getting records with "PASTORS", "PAST", "PASCAL", and "PASSION" first. I want the name 'Pizza & (Pas)ta' to come back as the first record in the search results, since it is the best match.

Here is the analyzer for the name field in the schema:

"analysis": {
  "filter": {
    "autocomplete_filter": {
      "type": "edge_ngram",
      "min_gram": "1",
      "max_gram": "20"
    }
  },
  "analyzer": {
    "autocomplete": {
      "type": "custom",
      "tokenizer": "standard",
      "filter": [
        "lowercase",
        "autocomplete_filter"
      ]
    }
  }
}

"name": {
  "analyzer": "autocomplete",
  "search_analyzer": "standard",
  "type": "string"
}

Please help me fix this. Thanks!

Ramesh

1 Answer

You have used the standard tokenizer, which removes ( and ) from the generated tokens. Instead of a single token (pas)ta, the tokens produced are pas and ta, so there is no match for (pas)ta.
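A rough Python sketch of the two tokenizers' word-splitting behaviour (an approximation for illustration, not Elasticsearch's actual Lucene tokenizers) shows the difference:

```python
import re

text = "Pizza & (Pas)ta"

# Approximation of the standard tokenizer: word boundaries fall on
# non-alphanumeric characters, so '(' and ')' are stripped from tokens.
standard_like = [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]
print(standard_like)    # ['pizza', 'pas', 'ta']

# The whitespace tokenizer splits on whitespace only, so special
# characters survive inside the tokens.
whitespace_like = [t.lower() for t in text.split()]
print(whitespace_like)  # ['pizza', '&', '(pas)ta']
```

With the standard-style split there is no token containing the parentheses at all, which is why none of the (pas)ta matches rank first.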

Instead of the standard tokenizer you can use the whitespace tokenizer, which retains all the special characters in the name. Change the analyzer definition to the following:

  "analyzer": {
    "autocomplete": {
      "type": "custom",
      "tokenizer": "whitespace",
      "filter": [
        "lowercase",
        "autocomplete_filter"
      ]
    }
  }
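With the whitespace tokenizer, the edge_ngram autocomplete_filter then sees the whole token, parentheses included. A minimal Python sketch of what an edge_ngram filter with min_gram 1 / max_gram 20 emits (an illustration of the concept, not the actual Lucene implementation):

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    # Emit every prefix of the token whose length is between
    # min_gram and max_gram characters.
    return [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]

print(edge_ngrams("(pas)ta"))
# ['(', '(p', '(pa', '(pas', '(pas)', '(pas)t', '(pas)ta']
```

Since the index now contains the prefix tokens of "(pas)ta" itself, the search term matches an indexed token exactly.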
Nishant
  • Thanks, @Nishant Saini! Are there any other side effects if I change the tokenizer from standard to whitespace? And is there any way to pass an analyzer in the query without changing the schema? – Ramesh Aug 02 '19 at 08:09
  • They behave differently: `whitespace` tokenizes based on spaces, whereas `standard` provides grammar-based tokenization. You can pass an analyzer in the query, but that won't help, as it is only applied to the query's input string – Nishant Aug 02 '19 at 09:39
  • I changed the tokenizer to whitespace. Is it because of "search_analyzer": "standard"? If so, how do I fix that? – Ramesh Aug 13 '19 at 11:01