I am trying to build an autocomplete feature with AngularJS and Elasticsearch on a given field, for example COUNTRYNAME. The field can contain simple names like "France" or "Spain", or compound names like "Sierra Leone".

In the mapping, this field is not_analyzed to prevent Elasticsearch from tokenizing compound names:

"COUNTRYNAME" : {"type" : "string", "store" : "yes","index": "not_analyzed" }

I need to query Elasticsearch to:

  • filter the documents with something like "countryname:value", where value can contain a wildcard
  • and run an aggregation on the countryname values returned by the filter (I aggregate only to get distinct values; the count is useless to me here, so maybe there is a better solution)

I can't use a wildcard with the "not_analyzed" field. In the queries below, a wildcard inside the value does not match anything, and matching is case sensitive.

The wildcard alone works here:

curl -XGET 'local_host:9200/botanic/specimens/_search?size=0' -d '{
  "fields": [
    "COUNTRYNAME"
  ],
  "query": {
    "query_string": {
      "query": "COUNTRYNAME:*"
    }
  },
  "aggs": {
    "general": {
      "terms": {
        "field": "COUNTRYNAME",
        "size": 0
      }
    }
  }
}'

But this doesn't work (Franc*):

curl -XGET 'local_host:9200/botanic/specimens/_search?size=0' -d '{
  "fields": [
    "COUNTRYNAME"
  ],
  "query": {
    "query_string": {
      "query": "COUNTRYNAME:Franc*"
    }
  },
  "aggs": {
    "general": {
      "terms": {
        "field": "COUNTRYNAME",
        "size": 0
      }
    }
  }
}'

I also tried a bool must query, but it doesn't work with this not_analyzed field and a wildcard either:

curl -XGET 'local_host:9200/botanic/specimens/_search?size=0' -d '{
  "fields": [
    "COUNTRYNAME"
  ],
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "COUNTRYNAME": "Franc*"
          }
        }
      ]
    }
  },
  "aggs": {
    "general": {
      "terms": {
        "field": "COUNTRYNAME",
        "size": 0
      }
    }
  }
}'

What am I missing or doing wrong? Should I leave the field analyzed in the mapping and use another analyzer that doesn't split compound names into tokens?

Saeed Zhiany
AlainIb

1 Answer

I found a working solution: the "keyword" tokenizer. Create a custom analyzer and use it in the mapping for the fields that should be kept whole instead of being split on spaces:

    curl -XPUT 'localhost:9200/botanic/' -d '{
 "settings":{
     "index":{
        "analysis":{
           "analyzer":{
              "keylower":{
                 "tokenizer":"keyword",
                 "filter":"lowercase"
              }
           }
        }
     }
  },
  "mappings":{
        "specimens" : {
            "_all" : {"enabled" : true},
            "_index" : {"enabled" : true},
            "_id" : {"index": "not_analyzed", "store" : false},
            "properties" : {
                "_id" : {"type" : "string", "store" : "no","index": "not_analyzed"  } ,
            ...
                "LOCATIONID" : {"type" : "string",  "store" : "yes","index": "not_analyzed" } ,
                "AVERAGEALTITUDEROUNDED" : {"type" : "string",  "store" : "yes","index": "analyzed" } ,
                "CONTINENT" : {"type" : "string","analyzer":"keylower" } ,
                "COUNTRYNAME" : {"type" : "string","analyzer":"keylower" } ,                
                "COUNTRYCODE" : {"type" : "string", "store" : "yes","index": "analyzed" } ,
                "COUNTY" : {"type" : "string","analyzer":"keylower" } ,
                "LOCALITY" : {"type" : "string","analyzer":"keylower" }                 
            }
        }
    }
}'
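
To make the effect of this analyzer concrete, here is a rough Python sketch (not Elasticsearch code) that mimics it: the keyword tokenizer emits the whole value as a single token and the lowercase filter lowercases it. The helpers `keylower_analyze`, `not_analyzed` and `wildcard_match` are hypothetical names made up for illustration. The sketch also assumes that `query_string` lowercases wildcard terms before matching, which as far as I know is the default in older Elasticsearch releases (the `lowercase_expanded_terms` option); check this against your version.

```python
from fnmatch import fnmatchcase
from typing import List


def keylower_analyze(value: str) -> List[str]:
    """Mimic the 'keylower' analyzer: one token for the whole value, lowercased."""
    return [value.lower()]


def not_analyzed(value: str) -> List[str]:
    """Mimic a not_analyzed field: the value is indexed verbatim as one token."""
    return [value]


def wildcard_match(pattern: str, tokens: List[str], lowercase_pattern: bool = True) -> bool:
    """Case-sensitive wildcard match against indexed tokens.

    lowercase_pattern=True models the assumed query_string default of
    lowercasing wildcard terms before they are matched.
    """
    if lowercase_pattern:
        pattern = pattern.lower()
    return any(fnmatchcase(token, pattern) for token in tokens)


# Compound names stay in one piece with the keyword tokenizer.
print(keylower_analyze("Sierra Leone"))

# "Franc*" is lowercased to "franc*": it matches the lowercased token ...
print(wildcard_match("Franc*", keylower_analyze("France")))
# ... but not the verbatim "France" token of a not_analyzed field.
print(wildcard_match("Franc*", not_analyzed("France")))
```

Under these assumptions, the original failure comes from the mismatch between a lowercased wildcard term and the verbatim "France" token, which the lowercase filter removes.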

So I can use a wildcard in a query on the COUNTRYNAME field, which is no longer split:

curl -XGET 'localhost:9200/botanic/specimens/_search?size=10' -d '{
"fields"  : ["COUNTRYNAME"],     
"query": {"query_string" : {
                    "query": "COUNTRYNAME:bol*"
}},
"aggs" : {
    "general" : {
        "terms" : {
            "field" : "COUNTRYNAME", "size":0
        }
    }
}}'
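
For an autocomplete box the query string has to be built from user input. The following is a minimal sketch of that step, assuming the request body above. `build_autocomplete_query` is a hypothetical helper, and the escaping rule (a backslash before reserved `query_string` characters and before whitespace, so that "sierra l" stays a single wildcard term) is my assumption about the Lucene query syntax and should be verified on your cluster.

```python
import json
import re

# Characters treated here as reserved in query_string syntax, plus whitespace.
# Assumption: escaping them with a backslash keeps the input as one term.
_RESERVED = re.compile(r'([+\-=&|><!(){}\[\]^"~*?:\\/\s])')


def build_autocomplete_query(prefix: str, field: str = "COUNTRYNAME") -> dict:
    """Build a request body like the one above for a user-typed prefix."""
    # Lowercase on the client because the indexed tokens are lowercased
    # by the keylower analyzer.
    escaped = _RESERVED.sub(r"\\\1", prefix.lower())
    return {
        "fields": [field],
        "query": {"query_string": {"query": "%s:%s*" % (field, escaped)}},
        "aggs": {"general": {"terms": {"field": field, "size": 0}}},
    }


body = build_autocomplete_query("Sierra L")
# This JSON string is what would be passed to curl with -d.
print(json.dumps(body, indent=2))
```

The body can then be sent to the same `_search` endpoint used in the curl command.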

The result:

{
    "took" : 14,
    "timed_out" : false,
    "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
    },
    "hits" : {
        "total" : 45,
        "max_score" : 1.0,
        "hits" : [{
                "_index" : "botanic",
                "_type" : "specimens",
                "_id" : "91E7B53B61DF4E76BF70C780315A5DFD",
                "_score" : 1.0,
                "fields" : {
                    "COUNTRYNAME" : ["Bolivia, Plurinational State of"]
                }
            }, {
                "_index" : "botanic",
                "_type" : "specimens",
                "_id" : "7D811B5D08FF4F17BA174A3D294B5986",
                "_score" : 1.0,
                "fields" : {
                    "COUNTRYNAME" : ["Bolivia, Plurinational State of"]
                }
            } ...
        ]
    },
    "aggregations" : {
        "general" : {
            "buckets" : [{
                    "key" : "bolivia, plurinational state of",
                    "doc_count" : 45
                }
            ]
        }
    }
}
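
Note that the bucket key comes back lowercased ("bolivia, plurinational state of") because the aggregation runs on the analyzed token, while the hits still carry the original casing. A possible way to turn such a response into a suggestion list, sketched in Python: `suggestions` is a hypothetical helper, and it can only restore the casing for names that appear in the returned hits, so the rest stay lowercase. It also assumes Python's `str.lower()` agrees with the lowercase filter for your data.

```python
def suggestions(response, field="COUNTRYNAME", agg="general"):
    """Return the distinct names from the aggregation, with the original
    casing taken from the hits whenever a matching hit is available."""
    original = {}
    for hit in response.get("hits", {}).get("hits", []):
        for value in hit.get("fields", {}).get(field, []):
            # Map the lowercased form (the bucket key) to the value as stored.
            original.setdefault(value.lower(), value)
    buckets = response.get("aggregations", {}).get(agg, {}).get("buckets", [])
    return [original.get(bucket["key"], bucket["key"]) for bucket in buckets]


# Trimmed copy of the response shown above.
sample = {
    "hits": {
        "total": 45,
        "hits": [
            {"fields": {"COUNTRYNAME": ["Bolivia, Plurinational State of"]}},
            {"fields": {"COUNTRYNAME": ["Bolivia, Plurinational State of"]}},
        ],
    },
    "aggregations": {
        "general": {
            "buckets": [
                {"key": "bolivia, plurinational state of", "doc_count": 45}
            ]
        }
    },
}

print(suggestions(sample))  # ['Bolivia, Plurinational State of']
```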
AlainIb
  • Thanks @Anainlb for solution. This worked for my similar case. – hemu Aug 12 '15 at 15:35
  • Glad it helped. Feel free to share any improvements ;) – AlainIb Aug 13 '15 at 07:29
  • @AlainIb It was my fault as you can see the `ng-model` should be inside `` tag but it was inside ` – kittu Dec 26 '15 at 21:12
  • @Satyadev i think you post in the wrong question no ? – AlainIb Feb 24 '16 at 07:05
  • Awesome! Thanks. I was in the ES IRC and it is very hard to get help. This post saved me a lot of time waiting around hoping someone would help me! Thanks!!!! – Nate Uni Jun 10 '16 at 01:22
  • Glad it helped. If you have improvements, feel free to share ;) – AlainIb Jun 10 '16 at 07:26
  • you're a lifesaver – gilm Jan 16 '17 at 12:32
  • facing below problem .. https://stackoverflow.com/questions/54121646/elasticsearch-exceptions-requesterror-requesterror400-mapper-parsing-excepti – Monu Jan 23 '20 at 14:56