I have an index with a description field that is analysed like this:

"description":{
      "analyzer" : "english",
      "type" : "string"
}

I have defined a synonym dictionary in a file, synonyms.txt, which contains:

ipod, i-pod, i pod => i-pod

I would like to add this synonym dictionary to my analyzer, but I don't know how to do it. Should I define a custom analyzer? And if I do, will newly indexed documents diverge from what is already in my index because of this customisation?

mel

1 Answer

Yes, you should define a custom analyzer. You can start from the definition of the built-in english analyzer and add your synonym filter to it:

{
  "settings": {
    "analysis": {
      "filter": {
        "english_stop": {
          "type":       "stop",
          "stopwords":  "_english_" 
        },
        "english_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "english_stemmer": {
          "type":       "stemmer",
          "language":   "english"
        },
        "english_possessive_stemmer": {
          "type":       "stemmer",
          "language":   "possessive_english"
        },
        "my_synonyms" : {
          "type" : "synonym",
          "synonyms_path" : "path/to/synonyms.txt"
        }
      },
      "analyzer": {
        "custom_english": {
          "tokenizer":  "standard",
          "filter": [
            "english_possessive_stemmer",
            "lowercase",
            "my_synonyms",
            "english_stop",
            "english_keywords",
            "english_stemmer"
          ]
        }
      }
    }
  }
}
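
The settings above only define the analyzer; the field still has to reference it. A minimal sketch of the field mapping, assuming the description field from the question and the custom_english name used in the settings:

```json
"description": {
  "type": "string",
  "analyzer": "custom_english"
}
```

Elasticsearch generally does not let you change the index-time analyzer of an existing field in place, so this mapping would typically go into a new index that the data is then re-indexed into.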

As for whether it will diverge: yes. If you apply your synonyms at index time, newly indexed data will have the synonym filter applied and existing data will not. If you want changes to index-time analysis to apply consistently, you need to re-index the data.

If the change to analysis is made only in your search_analyzer, then there is no need to re-index.
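
For the search-time-only case, a sketch of the mapping might look like the following, again assuming the description field from the question and the custom_english analyzer defined in the settings. The index-time analyzer stays english, so existing documents are untouched, and only query text goes through the synonym filter:

```json
"description": {
  "type": "string",
  "analyzer": "english",
  "search_analyzer": "custom_english"
}
```

Whether search-time synonyms are enough depends on the rule. A contraction rule such as the one in the question rewrites the query terms, so it only matches documents whose indexed tokens already have the rewritten form.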

femtoRgon