
I'm pretty new to Elasticsearch. I have tried most tutorials and looked at forums, but I can't find a good solution. As a workaround, I'm feeding the index using R and the elastic package, and the Elasticsearch API is bridged using Laravel/PHP.

I'm trying to create a geocoding index with all addresses in France in order to:

1) autocomplete address

2) geocode address

After many tests I settled on nGram, because the other approaches either had too many problems handling requests that combine text and digits, or did not give me the expected behavior or results.

My problem is that completion fails for long requests or is not tolerant enough.

Let's say that in the autocompletion I want to target "11, rue de douai 75009 Paris".

I get it with the following requests:

11, rue de d
rue de douai

But the following requests fail to return any results:

11 douai

11, rue de do

rue de douai 75

rue de douai 11

For "11 rue du faubourg poissonière":

"11 rue du" works, but "11 rue du f" returns no result

"rue du faubourg" works, but "rue du faubourg p" returns no result

"faubourg poissioner" works, but "faubourg poissionere" returns no result

My index config is as follows:



    {
      "settings": {
        "analysis": {
          "analyzer": {
            "completion_analyzer": {
              "type": "custom",
              "filter": [
                "lowercase",
                "asciifolding",
                "trim",
                "completion_filter"
              ],
              "tokenizer": "keyword"
            }
          },
          "filter": {
            "completion_filter": {
              "type": "nGram",
              "min_gram": 2,
              "max_gram": 20,
              "token_chars": [ "letter", "digit", "punctuation" ]
            }
          }
        }
      },
      "mappings": {
        "geocoding": {
          "properties": {
            "numero": {
              "type": "long"
            },
            "nom_voie": {
              "type": "text"
            },
            "ville": {
              "type": "text"
            },
            "code_postal": {
              "type": "text"
            },
            "code_insee": {
              "type": "text"
            },
            "lon": {
              "type": "float"
            },
            "lat": {
              "type": "float"
            },
            "full_address": {
              "type": "text"
            },
            "address_suggest": {
              "type": "completion",
              "max_input_length": 150,
              "analyzer": "completion_analyzer",
              "search_analyzer": "standard",
              "preserve_position_increments": false
            }
          }
        }
      }
    }
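
To make the analyzer chain in this config more concrete, here is a rough Python sketch of what it would emit at index time: the whole input kept as one token (keyword tokenizer), lowercased, accents folded, trimmed, then expanded into every substring of 2 to 20 characters. This is only an approximation and not Elasticsearch code. The function names `fold_ascii` and `completion_analyzer_sketch` are made up for illustration, `asciifolding` is approximated with Unicode decomposition, and `token_chars` is ignored.

```python
import unicodedata


def fold_ascii(text):
    """Approximate the asciifolding filter by stripping combining accents."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def completion_analyzer_sketch(text, min_gram=2, max_gram=20):
    """Hypothetical stand-in for the completion_analyzer chain above.

    The keyword tokenizer keeps the whole input as one token, which is then
    lowercased, folded, trimmed and expanded into character n-grams.
    """
    token = fold_ascii(text.lower()).strip()
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size > len(token):
                break
            grams.append(token[start:start + size])
    return grams


if __name__ == "__main__":
    grams = completion_analyzer_sketch("11 rue du faubourg poissonière 75008 PARIS")
    print(len(grams), max(len(g) for g in grams))
```

One thing the sketch makes visible is that no emitted token is longer than `max_gram` characters, so the full 42-character address never appears as a single token. Whether that is related to the failures on longer requests is not verified here.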

I inserted data as follows:


{
    "numero" : 11,
    "nom_voie" : "rue du faubourg poissonière",
    "code_postal" : "75008",
    "code_insee" : "75108",
    "ville" : "PARIS",
    "lon" : 2.37352,
    "lat" : 48.85759,
    "full_address" : "11, rue du faubourg poissonière 75008 PARIS",
    "address_suggest" : "11 rue du faubourg poissonière 75008 PARIS",
    "weight" : 2
}

The request is made as follows:


{
    "_source" : "full_address",
    "suggest" : {
        "text" : query,
        "completion" : {
            "field" : "address_suggest",
            "size" : 5,
            "skip_duplicates" : TRUE,
            "fuzzy" : {
                "fuzziness" : 5
            }
        }
    }
}
user1998000

2 Answers


It's not totally clear from the documentation, but I believe the completion suggester will only help you complete phrases or sentences from the beginning of a field. So, using the completion suggester, you would have to begin the query with 11 rue... to match that particular document.

I tried out a few of the built-in suggesters, but the completion suggester forced users to begin with the correct word/term while the term and phrase suggesters were helpful for correcting misspellings on one or more words, but never returned the entire field that they matched.

I ended up just using a normal "match" query (not using suggesters at all) against the field I wanted suggestions for, and found that to be the best solution. Now, users will get matches from anywhere in the field, and I can display the entire field as the suggestion.
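
To illustrate the difference I mean, here is a toy Python model. It is not how Elasticsearch works internally: `tokens`, `prefix_only` and `all_terms` are hypothetical helpers, the tokenizer is a crude stand-in for the standard analyzer, and fuzziness is ignored entirely. It only contrasts "must match from the start of the field" with "every query term must appear somewhere in the field".

```python
import re


def tokens(text):
    """Crude stand-in for an analyzer: lowercase and split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def prefix_only(field, typed):
    """Toy model of completion-style matching: the typed text must start the field."""
    return " ".join(tokens(field)).startswith(" ".join(tokens(typed)))


def all_terms(field, typed):
    """Toy model of a match query with operator "and": every whole term must be present."""
    return set(tokens(typed)) <= set(tokens(field))


if __name__ == "__main__":
    address = "11, rue de douai 75009 Paris"
    for typed in ["11 rue de d", "11 douai", "rue de douai 11", "rue de do"]:
        print(typed, prefix_only(address, typed), all_terms(address, typed))
```

In this model "11 douai" and "rue de douai 11" match only with the all-terms approach, while a half-typed word such as "rue de do" matches only with the prefix approach, since a plain match on whole terms does not complete partial words.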

Using your field names, the query would look like this:

{
  "from": 0,
  "size": 5,
  "_source": [
    "full_address"
  ],
  "query": {
    "match": {
      "full_address": {
        "query": query,
        "fuzziness": 5,
        "operator": "and"
      }
    }
  }
}
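
Since `query` in the body above is a placeholder for the user's input, here is a minimal Python sketch of building that request body from a typed string. The helper name `build_address_query` is hypothetical, and the body would still need to be sent to the index's `_search` endpoint with whatever client you use. Note one deliberate change, which is an assumption on my part: fuzziness defaults to "AUTO" here because, as far as I know, recent Elasticsearch versions only accept edit distances of 0, 1, 2 or "AUTO" in a match query.

```python
import json


def build_address_query(typed, size=5, fuzziness="AUTO"):
    """Hypothetical helper: build the match-query body for the text the user typed."""
    return {
        "from": 0,
        "size": size,
        "_source": ["full_address"],
        "query": {
            "match": {
                "full_address": {
                    "query": typed,
                    # Assumption: "AUTO" instead of the 5 used in the body above.
                    "fuzziness": fuzziness,
                    # Require every term of the input to match.
                    "operator": "and",
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_address_query("11 douai"), indent=2))
```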

I'm pretty new to Elasticsearch myself, so I'll defer to someone more experienced, in case I was just using the suggesters wrong. But I followed the documentation word-for-word and couldn't get any of them to return the whole matched field, with matches allowed anywhere in the field.

dmbaughman

Thank you for your answer. While working around it, I also came back to a query instead of completion, even though it's not quite the behavior I want.

I would like something as smooth as what Deliveroo does, for example. Not yet achieved!

user1998000