
I am new to Elasticsearch, so please read the question before downvoting or marking it as a duplicate.

I am testing synonyms in Elasticsearch (v2.4.6), which I have installed on Ubuntu 16.04. The synonyms come from a file named synonym.txt, which I have placed in the config directory. I created the index synonym_test as follows:

curl -XPOST localhost:9200/synonym_test/ -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "whitespace",
          "filter": ["lowercase", "my_synonym_filter"]
        }
      },
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "ignore_case": true,
          "synonyms_path": "synonym.txt"
        }
      }
    }
  }
}'

The index contains two fields, id and some_text. I configured the field some_text with the custom analyzer as follows:

curl -XPUT localhost:9200/synonym_test/rulers/_mapping -d '{
  "properties": {
    "id": {
      "type": "double"
    },
    "some_text": {
      "type": "string",
      "search_analyzer": "my_synonyms"
    }
  }
}'

Then I inserted some data:

curl -XPUT localhost:9200/synonym_test/external/5 -d '{
  "id" : "5",
  "some_text":"apple is a fruit"
}'
curl -XPUT localhost:9200/synonym_test/external/7 -d '{
  "id" : "7",
  "some_text":"english is spoken in england"
}'
curl -XPUT localhost:9200/synonym_test/external/8 -d '{
  "id" : "8",
  "some_text":"Scotland Yard is a popular game."
}'
curl -XPUT localhost:9200/synonym_test/external/9 -d '{
  "id" : "9",
  "some_text":"bananas contain potassium"
}'

The synonym.txt file contains the following:

"britain,england,scotland"
"fruit,bananas"
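
One thing worth ruling out is whether the quotation marks shown above are literally present in the file. As far as I know, the Solr-style synonym format does not treat double quotes as syntax, so they would likely end up as part of the first and last term on each line. A minimal sketch of a check follows; find_quoted_synonyms is a hypothetical helper name, not an Elasticsearch tool, and the file path is whatever your config directory is:

```shell
# Sketch: report synonym-file lines that contain a literal double quote.
# find_quoted_synonyms is a hypothetical helper, not part of Elasticsearch.
find_quoted_synonyms() {
  # grep -n prints each matching line prefixed with its line number;
  # the exit status is 0 only when at least one such line exists.
  grep -n '"' "$1"
}
```

Calling find_quoted_synonyms with the path to synonym.txt prints any offending lines; no output means the file contains no double quotes.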

After all this, when I query for the term fruit (which should also return the text containing bananas, since the two are synonyms in the file), I get only the text containing fruit.

{
  "took": 117,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.8465736,
    "hits": [
      {
        "_index": "synonym_test",
        "_type": "external",
        "_id": "5",
        "_score": 0.8465736,
        "_source": {
          "id": "5",
          "some_text": "apple is a fruit"
        }
      }
    ]
  }
}

I have also tried the following links, but none of them helped: Synonym analyzer not working, Elasticsearch synonym analyzer not working, How to apply synonyms at query time instead of index time in Elasticsearch, how to configure the synonyms_path in elasticsearch, and many others.

So, can anyone please tell me whether I am doing anything wrong? Is there a problem with the settings or the synonym file? I want the synonyms to work at query time, so that when I search for a term I get all documents related to that term.

Ankit Seth

1 Answer


Please refer to the Custom Analyzer documentation for how to configure custom analyzers. Following that guide, the index settings would be as follows:

curl -XPOST localhost:9200/synonym_test/ -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_synonyms": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "my_synonym_filter"]
        }
      },
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "ignore_case": true,
          "synonyms_path": "synonym.txt"
        }
      }
    }
  }
}'

This currently works on my Elasticsearch instance.
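
To check whether the analyzer is actually applied, it may help to look at the tokens it produces and then run a match query, which analyzes the query text with the field's search analyzer. The following is only a sketch: it assumes Elasticsearch 2.x is reachable at localhost:9200 and that the synonym_test index exists; ES_HOST, analyze_term and search_term are hypothetical names introduced for illustration.

```shell
# Sketch only: assumes Elasticsearch 2.x on localhost:9200 (override via
# ES_HOST) and an existing synonym_test index with the my_synonyms analyzer.
ES_HOST="${ES_HOST:-localhost:9200}"

# Show the tokens that the my_synonyms analyzer emits for the given text.
analyze_term() {
  curl -s -XGET "$ES_HOST/synonym_test/_analyze?pretty" -d "{
    \"analyzer\": \"my_synonyms\",
    \"text\": \"$1\"
  }"
}

# Run a match query against some_text; unlike a term query, a match query
# passes the query text through the field's search analyzer.
search_term() {
  curl -s -XGET "$ES_HOST/synonym_test/_search?pretty" -d "{
    \"query\": { \"match\": { \"some_text\": \"$1\" } }
  }"
}
```

For example, analyze_term fruit followed by search_term fruit. If the synonym filter is being applied, the token list from the first call would be expected to include the synonyms alongside the original term.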

JFC