
I have the index and data described here, and I have set the index analyzer to the stop analyzer. That works fine: when I run a simple search such as POST https://serverURL/_search?pretty=true

{
  "query": {
    "query_string": {
      "default_field": "title",
      "query": "Rebel the without"
    }
  }
}

, the server does return

            "title": "Rebel Without a Cause"

as the result.

But when I try a fuzzy query

{
  "query": {
    "fuzzy": {
      "title": {
        "value": "Rebel the without"
      }
    }
  }
}

, the result is empty. What exactly is going on here? Does the fuzzy query somehow disable the analyzer?

2 Answers


The fuzzy query returns documents that contain terms similar to the search term, as measured by edit distance.

Since you have not defined an explicit mapping for the "title" field, it uses the standard analyzer, which generates the following tokens:

{
  "tokens" : [
    {
      "token" : "rebel",
      "start_offset" : 0,
      "end_offset" : 5,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "without",
      "start_offset" : 6,
      "end_offset" : 13,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "a",
      "start_offset" : 14,
      "end_offset" : 15,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
      "token" : "cause",
      "start_offset" : 16,
      "end_offset" : 21,
      "type" : "<ALPHANUM>",
      "position" : 3
    }
  ]
}

The fuzzy query will return results for search terms that are similar to one of the generated tokens, such as wihout, case, or rebe:

GET /myidx/_search
{
  "query": {
    "fuzzy": {
      "title": {
        "value": "case"
      }
    }
  }
}
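
To see why single misspelled terms match while a whole phrase does not, here is a rough Python sketch. It is only an approximation: it assumes plain Levenshtein distance and the AUTO fuzziness thresholds described in the Elasticsearch documentation (no edits for terms of 1 to 2 characters, one edit for 3 to 5, two edits for longer terms). The names edit_distance, max_edits and fuzzy_matches are made up for this illustration and are not part of any Elasticsearch API.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def max_edits(term):
    """Assumed AUTO fuzziness thresholds, based on the term length."""
    if len(term) <= 2:
        return 0
    return 1 if len(term) <= 5 else 2


# Tokens from the analysis output shown above.
tokens = ["rebel", "without", "a", "cause"]


def fuzzy_matches(value):
    """The value is compared as ONE term; it is not split into words."""
    return [t for t in tokens if edit_distance(value, t) <= max_edits(value)]


for value in ["wihout", "case", "rebe", "Rebel the without"]:
    print(value, "->", fuzzy_matches(value))
```

The whole string "Rebel the without" is far more than two edits away from every token, which is consistent with the empty result in the question.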

Update 1:

Based on the comments below, you can use the match_bool_prefix query, which analyzes the search text before matching:

{
  "query": {
    "match_bool_prefix": {
      "title": {
        "query": "Rebel the without"
      }
    }
  }
}
ESCoder
  • My problem is exactly that standard analyzer is used instead of stop analyzer. According to https://www.elastic.co/guide/en/elasticsearch/reference/current/specify-analyzer.html, if no analyzer is defined on field level, default index analyzer is used. And I have set index analyzer to stop. So, what could be the reason for fuzzy search to use standard analyzer instead of stop analyzer ? – Dragan Jovanović Jan 09 '22 at 17:42
  • @Dragan Jovanović **Fuzzy query returns documents that contain terms that seem to be similar to the search term.** Can you please tell what is your use case, to remove stop words in fuzzy query ? – ESCoder Jan 09 '22 at 17:53
  • Exactly. My case is that I have to use fuzzy logic (to retrieve words that are similar to search term), and also, to ignore stop words during the process. This is from my spec : "when the user types “stones and rocks” the search should return results that match “stone rocks things”. Notice: The user mistyped stones (extra s) , added stop word “and”. " – Dragan Jovanović Jan 09 '22 at 17:59
  • @DraganJovanović please go through the updated part of the answer, and let me know if this resolves your issue ? – ESCoder Jan 09 '22 at 18:39

To understand this behavior, it's important to understand how data is processed and stored in Elasticsearch. When you set up the stop analyzer, any text you feed to the system is transformed into a list of tokens, also known as terms. At that point the Elasticsearch field "doesn't remember" your original text (technically, it's stored in the _source field, but it's not indexed) and only knows those terms (in your case rebel, without, cause, each coupled with its position in the original text), which are stored in an inverted index for quick lookup.
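
As a rough illustration of that indexing step, the Python sketch below approximates the stop analyzer and builds a tiny inverted index. It assumes the analyzer lowercases the text, splits on non-letters (ASCII only here, for brevity) and drops the default English stop words; stop_analyze, docs and index are hypothetical names, not Elasticsearch APIs.

```python
import re
from collections import defaultdict

# Default English stop word list used by Lucene's stop filter.
ENGLISH_STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
}


def stop_analyze(text):
    """Return (term, position) pairs, skipping stop words but keeping positions."""
    words = re.findall(r"[a-z]+", text.lower())
    return [(w, pos) for pos, w in enumerate(words) if w not in ENGLISH_STOP_WORDS]


# Hypothetical one-document corpus: document id -> title.
docs = {1: "Rebel Without a Cause"}

# Inverted index: term -> list of (document id, position).
index = defaultdict(list)
for doc_id, title in docs.items():
    for term, pos in stop_analyze(title):
        index[term].append((doc_id, pos))

print(dict(index))
```

Only rebel, without and cause end up as keys, so any lookup has to be made against those terms rather than against the original title.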

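Now you run the fuzzy query. It's a term-level query, which means the search value is not analyzed: the whole string "Rebel the without" is compared as a single term against the indexed terms, and none of them is within the allowed edit distance. Instead, you have to use a full-text query, such as match, which analyzes the search text first and then applies fuzziness to each resulting term: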

POST /fuzz/_search
{
  "query": {
    "match": {
      "title": {
        "query": "Reble without",
        "fuzziness": "AUTO"
      }
    }
  }
}
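
A minimal Python sketch of the difference, under stated assumptions: the search text goes through the same stop analysis as the indexed title, each resulting token is compared on its own, a swap of two adjacent characters counts as a single edit, and the allowed number of edits follows the AUTO thresholds from the Elasticsearch documentation. The names analyze, osa_distance, max_edits and match_fuzzy are invented for this example.

```python
import re

STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
}

# Terms assumed to be indexed for "Rebel Without a Cause" by the stop analyzer.
INDEXED_TERMS = {"rebel", "without", "cause"}


def analyze(text):
    """Approximate stop analysis: lowercase, split on non-letters, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def osa_distance(a, b):
    """Edit distance in which an adjacent transposition costs one edit."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def max_edits(term):
    """Assumed AUTO fuzziness thresholds, based on the term length."""
    if len(term) <= 2:
        return 0
    return 1 if len(term) <= 5 else 2


def match_fuzzy(query):
    """Map each analyzed query token to the indexed terms it fuzzily matches."""
    return {
        token: sorted(t for t in INDEXED_TERMS if osa_distance(token, t) <= max_edits(token))
        for token in analyze(query)
    }


print(match_fuzzy("Reble without"))
print(match_fuzzy("Rebel the without"))
```

In this sketch the stop word "the" is dropped before any comparison happens, and "reble" is matched to "rebel" as one transposition, which is the combination the question asks for.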
ilvar