
We have an index of items, and I'm attempting to do a fuzzy wildcard search on the item's name. The query:

{
  "from": 0,
  "size": 10,
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "fields": [
            "name.suggest"
          ],
          "query": "avacado*",
          "fuzziness": 0.7
        }
      }
    }
  }
}

The field in the index and the analyzers at play:

"suggest_analyzer": {
    "type": "custom",
    "tokenizer": "standard",
    "filter": ["standard", "lowercase", "shingle", "punctuation"]
  }


"punctuation" : {
    "type" : "word_delimiter",
    "preserve_original": "true"
  }



"name": {
    "fields": {
      "name": {
        "type": "string",
        "analyzer": "stem"
      },
      "suggest":{ 
        "type": "string", 
        "analyzer": "suggest_analyzer"
      },
      "untouched": {
        "include_in_all": false,
        "index": "not_analyzed",
        "index_options": "docs",
        "omit_norms": true,
        "type": "string"
      },
      "untouched_lowercase": {
        "type": "string", 
        "index_analyzer": "lowercase",
        "search_analyzer": "lowercase"
      }
    },
    "type": "multi_field"
  },

The problem is this:

An item with the name "Avocado Test" will match the following:

  • avocado*
  • avo*
  • avacado

but fails to match:

  • avacado*
  • ava*
  • ava~2

I can't seem to make fuzzy matching work with wildcards. Either fuzzy works or wildcards work, but not the two in combination.

The Elasticsearch version is 1.3.1.

Note that my query is simplified. We have other filtering going on, but I boiled it down to just the query to take any ambiguity out of the results. I've attempted to use the suggest features, but they won't allow the level of filtering we need.

Is there any other way to handle doing suggest/typeahead style searching with fuzziness to catch misspellings?

dstarh
  • Looks like what you are looking for is a "fuzzy typeahead". You may be able to achieve this via the [completion suggester](http://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html#search-suggesters-completion) – keety May 09 '15 at 18:13
  • Does [this answer](http://stackoverflow.com/questions/29712954/should-i-include-spaces-in-fuzzy-query-fields/29723235#29723235) help you any? – Sloan Ahrens May 10 '15 at 15:26
  • @keety A completion suggester would help if we didn't need to do any filtering. As it is, each user doing the typeahead gets a specific subset of the documents in the index available to them via a meta_tagging system and other filters. We also have rules that state that items must not have tag x, so we'd have to do negation, which criteria won't yet allow – dstarh May 10 '15 at 19:47
  • @SloanAhrens That looks promising. I'll play around with that on Monday – dstarh May 10 '15 at 19:49
  • For anyone else looking, http://stackoverflow.com/questions/29712954/should-i-include-spaces-in-fuzzy-query-fields/29723235#29723235 worked perfectly – dstarh Mar 16 '17 at 18:36
  • Does this answer your question? [Should I include spaces in fuzzy query fields?](https://stackoverflow.com/questions/29712954/should-i-include-spaces-in-fuzzy-query-fields) – Alexandre Juma Feb 09 '21 at 16:17
  • @AlexandreJuma That's the same Stack Overflow answer I linked directly above your comment – dstarh Feb 10 '21 at 22:04

1 Answer

You could try the EdgeNGramTokenFilter: use it in an analyzer applied to the desired field at index time, then run a fuzzy search against that field.
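As a rough sketch of what that might look like on a 1.x cluster (untested against the asker's index; the names `autocomplete_edge`, `autocomplete_index`, `autocomplete_search`, the `item` type and the `autocomplete` sub-field are all made up for illustration, and the gram sizes are arbitrary), the index could define an edge n-gram filter that is used only at index time:

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_edge": {
          "type": "edgeNGram",
          "min_gram": 2,
          "max_gram": 15
        }
      },
      "analyzer": {
        "autocomplete_index": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "autocomplete_edge"]
        },
        "autocomplete_search": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "item": {
      "properties": {
        "name": {
          "type": "multi_field",
          "fields": {
            "name": { "type": "string" },
            "autocomplete": {
              "type": "string",
              "index_analyzer": "autocomplete_index",
              "search_analyzer": "autocomplete_search"
            }
          }
        }
      }
    }
  }
}
```

With these settings, "Avocado" would be indexed as `av`, `avo`, `avoc`, `avoca`, `avocad` and `avocado`, so the prefixes are already in the index and no wildcard is needed at query time. A fuzzy `match` query on the sub-field can then tolerate a misspelling in whatever prefix has been typed so far:

```json
{
  "query": {
    "match": {
      "name.autocomplete": {
        "query": "avacado",
        "fuzziness": 1
      }
    }
  }
}
```

Because this is an ordinary query, it can sit inside the same `bool` query and filters used for the rest of the search. The search analyzer deliberately omits the edge n-gram filter so the typed text is compared as a single term rather than being split into its own prefixes.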

Andre85