
I am planning to build an Elasticsearch-based autocomplete module for an e-commerce website. I am using edge_ngram for suggestions and am trying out the following configuration.

**My index creation:**

PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "tokenizer": "autocomplete",
          "filter": [
            "lowercase"
          ]
        },
        "autocomplete_search": {
          "tokenizer": "lowercase"
        }
      },
      "tokenizer": {
        "autocomplete": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 10,
          "token_chars": [
            "letter","digit"
          ]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "autocomplete",
          "search_analyzer": "autocomplete_search"
        }
      }
    }
  }
}

**Inserting Data**

PUT my_index/doc/1
{
  "title": "iphone s" 
}

PUT my_index/doc/9
{
  "title": "iphone ka" 
}

PUT my_index/doc/11
{
  "title": "iphone ka t" 
}

PUT my_index/doc/15
{
  "title": "iphone 6" 
}

PUT my_index/doc/14
{
  "title": "iphone 6 16GB" 
}

PUT my_index/doc/3
{
  "title": "iphone k" 
}

POST my_index/_refresh

POST my_index/_analyze
{
  "tokenizer": "autocomplete",
  "text": "iphone 6"
}

POST my_index/_analyze
{
  "analyzer": "pattern",
  "text": "iphone 6"
}

**Autocomplete suggestions**
When I try to find the closest match to "iphone 6", the query does not return the correct result.

GET my_index/_search
{
  "query": {
    "match": {
      "title": {
        "query": "iphone 6", 
        "operator": "and"
      }
    }
  }
}


**The above query yields:**
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 7,
    "max_score": 0.28582606,
    "hits": [
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "1",
        "_score": 0.28582606,
        "_source": {
          "title": "iphone s"
        }
      },
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "9",
        "_score": 0.25811607,
        "_source": {
          "title": "iphone ka"
        }
      },
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "14",
        "_score": 0.24257512,
        "_source": {
          "title": "iphone 6 16GB"
        }
      },
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "3",
        "_score": 0.19100356,
        "_source": {
          "title": "iphone k"
        }
      },
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "15",
        "_score": 0.1862728,
        "_source": {
          "title": "iphone 6"
        }
      },
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "11",
        "_score": 0.16358379,
        "_source": {
          "title": "iphone ka t"
        }
      },
      {
        "_index": "my_index",
        "_type": "doc",
        "_id": "2",
        "_score": 0.15861572,
        "_source": {
          "title": "iphone 5 s"
        }
      }
    ]
  }
}

But the result should be:

     {
        "_index": "my_index",
        "_type": "doc",
        "_id": "15",
        "_score": 1,
        "_source": {
          "title": "iphone 6"
        }
      }

Please let me know if I am missing something here. I am new to this, so I am not aware of any other method that may yield better results.

RedHead_121

1 Answer


You are using `autocomplete_search` as your `search_analyzer`. Look at how your text is analyzed by the search analyzer you specified:

POST my_index/_analyze
{
 "analyzer": "autocomplete_search",
 "text": "iphone 6"
}

You will get

 {
 "tokens": [
  {
     "token": "iphone",           ===> Only one token
     "start_offset": 0,
     "end_offset": 6,
     "type": "word",
     "position": 0
     }
   ]
 }

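The `lowercase` tokenizer splits the text on every non-letter character, so the digit `6` is dropped and only `iphone` remains. Since all the documents have this token (`iphone`) in the inverted index, all the documents are returned.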
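As a rough illustration of why the `6` disappears, here is a minimal Python sketch that imitates the `lowercase` tokenizer. It is not Elasticsearch code; the function name `lowercase_tokenizer` and the regular expression are assumptions made for this example only.

```python
import re


def lowercase_tokenizer(text):
    """Rough stand-in (assumption) for Elasticsearch's `lowercase` tokenizer:
    break the text at every non-letter character and lowercase each piece."""
    # [^\W\d_]+ matches runs of letters only (no digits, no underscore)
    return [m.group(0).lower() for m in re.finditer(r"[^\W\d_]+", text)]


print(lowercase_tokenizer("iphone 6"))       # ['iphone']
print(lowercase_tokenizer("iphone 6 16GB"))  # ['iphone', 'gb']
```

The digit never becomes a token, so the search effectively asks only for `iphone`.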

If you want to get the desired results, you can use the same analyzer at search time that was used while indexing:

GET my_index/_search
{
  "query": {
    "match": {
      "title": {
        "query": "iphone 6",
        "operator": "and",
        "analyzer": "autocomplete"
      }
    }
  }
}
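To see which titles this query can match, the following Python sketch imitates the `autocomplete` analyzer (edge n-grams of 1 to 10 characters over letters and digits, then lowercasing) together with the `and` operator. It is a simplified simulation, not Elasticsearch itself; `edge_ngrams` and `matches_all` are hypothetical helper names, and scoring is not modelled.

```python
import re


def edge_ngrams(text, min_gram=1, max_gram=10):
    """Approximate (assumption) the `autocomplete` analyzer from the question:
    split on anything that is not a letter or digit, lowercase,
    then emit the prefixes of each word from min_gram to max_gram characters."""
    tokens = []
    for word in re.findall(r"[^\W_]+", text):  # runs of letters and digits
        word = word.lower()
        for n in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append(word[:n])
    return tokens


def matches_all(query, title):
    """Mimic "operator": "and": every query token must be indexed for the title."""
    indexed = set(edge_ngrams(title))
    return all(token in indexed for token in edge_ngrams(query))


# Titles taken from the question
titles = ["iphone s", "iphone ka", "iphone ka t",
          "iphone 6", "iphone 6 16GB", "iphone k"]

print(edge_ngrams("iphone 6"))
# ['i', 'ip', 'iph', 'ipho', 'iphon', 'iphone', '6']

print([t for t in titles if matches_all("iphone 6", t)])
# ['iphone 6', 'iphone 6 16GB']
```

Under these assumptions both "iphone 6" and "iphone 6 16GB" match, because both contain every token of the query. Which one ranks first depends on scoring, which this sketch leaves out.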
Richa
  • Can you please let us know how to obtain the desired result? – RedHead_121 May 11 '17 at 10:40
  • Thanks for the help. Furthermore, I need to use the phrase suggester on top of this. Can I do that? I want to correct the input if the user types "ipone five" and suggest "iphone 5". – RedHead_121 May 14 '17 at 06:00
  • You will have to change your `analyzer` to one that indexes "five" as 5. – Richa May 14 '17 at 06:05
  • Let's say for now I don't consider this case (ipone five). If the user searches for ipone 5, will this yield the required result? I used this to search: "suggest" : { "DidYouMean": { "text": "iphne 5", "phrase" : { "analyzer" : "autocomplete", "field" : "name", "highlight": { "pre_tag": "", "post_tag": "" } } }} – RedHead_121 May 14 '17 at 06:33
  • I am getting weird output; I think that is because of the `autocomplete` analyzer. { "text": "i ip iph ipho iphone 5", "highlighted": "i ip iph ipho iphone 5", "score": 4.105417e-7 }, { "text": "i ip iph iphon iphone 5", "highlighted": "i ip iph iphon iphone 5", "score": 4.105417e-7 }, – RedHead_121 May 14 '17 at 06:34
  • Hi, can you help me out with this as well? That would be a great help. http://stackoverflow.com/questions/43973192/mappings-on-filed-elastic-search – RedHead_121 May 15 '17 at 06:58