I have a fuzzy search analyzer in Elasticsearch with the following index and documents:

PUT test_index
{
  "settings": {
    "index": {
      "max_ngram_diff": 40      
    },
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "autocomplete"
          ]
        },
        "autocomplete_search": {
          "tokenizer": "whitespace",
          "filter": [
            "lowercase"
          ]
        }
      },
      "filter": {
        "autocomplete": {
          "type": "ngram",        
          "min_gram": 2,
          "max_gram": 40
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",            
        "analyzer": "autocomplete",
        "search_analyzer": "autocomplete_search"
      }
    }
  }
}

PUT test_index/_doc/1
{ "title": "HRT 2018-BN18 N-SB" }

PUT test_index/_doc/2
{ "title": "GMC 2019-BN18 A-SB" }

How can I ignore the hyphen ('-') during my fuzzy search so that the queries GMC 2019-BN18 A-SB, gmc 2019, gmc 2019-BN18 A-SB and GMC 2019-BN18 ASB all return the same document?

I tried creating a separate analyzer, but I am not sure how to apply multiple analyzers to the same field:

"settings": {
  "analysis": {
    "analyzer": {
      "my_analyzer": {
        "tokenizer": "standard",
        "char_filter": [
          "my_char_filter"
        ]
      }
    },
    "char_filter": {
      "my_char_filter": {
        "type": "mapping",
        "mappings": [
          "- => "
        ]
      }
    }
  }
}
Val
amrit

1 Answer

You're on the right path. You just need to add that character filter to both analyzers so that the hyphens are removed at both indexing and search time:

PUT test_index
{
  "settings": {
    "index": {
      "max_ngram_diff": 40
    },
    "analysis": {
      "char_filter": {
        "my_char_filter": {
          "type": "mapping",
          "mappings": [
            "- => "
          ]
        }
      },
      "analyzer": {
        "autocomplete": {
          "char_filter": [
            "my_char_filter"
          ],
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "autocomplete"
          ]
        },
        "autocomplete_search": {
          "char_filter": [
            "my_char_filter"
          ],
          "tokenizer": "whitespace",
          "filter": [
            "lowercase"
          ]
        }
      },
      "filter": {
        "autocomplete": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 40
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "autocomplete_search"
      }
    }
  }
}
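
One way to check the result is sketched below. It assumes the index has been recreated with the settings above and that the two documents from the question have been indexed again, since the analyzers of an existing index are not changed by re-running the PUT. The first request uses the `_analyze` API to show the tokens the search analyzer produces; with the hyphens stripped, the text should come out as `gmc`, `2019bn18` and `asb`. The second request is a plain `match` query with one of the query strings from the question; the choice of `match` is an assumption, as the question does not show the query being used.

```
GET test_index/_analyze
{
  "analyzer": "autocomplete_search",
  "text": "GMC 2019-BN18 A-SB"
}

GET test_index/_search
{
  "query": {
    "match": {
      "title": "GMC 2019-BN18 ASB"
    }
  }
}
```

Because the same character filter runs at index time, the n-grams stored for document 2 are built from `gmc`, `2019bn18` and `asb`, so the hyphenated and unhyphenated spellings should both match it.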
Val