
I apparently misunderstood how nGram works in Elasticsearch. I want to be able to efficiently search for a substring, so that typing 'loud' still finds words like 'clouds'. I have my nGram tokenizer set up with min_gram=2 and max_gram=10.

Apparently, nGram also splits up the search term ('loud') into 'lo', 'ou', 'ud', 'lou', 'oud', and 'loud'. In some cases this is nice, because it will find 'louder' if I search for 'cloud'. Generally, though, I think it just confuses my users.

Is there a way to prevent Elasticsearch from splitting up the search term? I tried using quotes in the query string, but that doesn't seem to work.

Travis Parks

1 Answer


You should specify two separate analyzers for indexing and for searching in your mapping, using the index_analyzer and search_analyzer settings. The index analyzer is the same as the search analyzer, but with the nGram filter added.
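
A minimal sketch of such a mapping might look like the following. It uses the legacy index_analyzer/search_analyzer field settings (current at the time of this answer; later Elasticsearch versions renamed index_analyzer to analyzer). The index, field, and analyzer names here are illustrative, not from the question:

```json
PUT /my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_ngram_filter": {
          "type": "nGram",
          "min_gram": 2,
          "max_gram": 10
        }
      },
      "analyzer": {
        "my_ngram_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "my_ngram_filter"]
        },
        "my_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "title": {
          "type": "string",
          "index_analyzer": "my_ngram_analyzer",
          "search_analyzer": "my_search_analyzer"
        }
      }
    }
  }
}
```

With this setup, 'clouds' is indexed as its n-grams (including 'loud'), but a query for 'loud' is analyzed only by the search analyzer and stays a single token, so it matches without being split into sub-grams.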

Alex
  • I realized last night that all I had to do was use a standard analyzer for the search analyzer. I was using the same analyzer for both indexing and searching, which was causing my problem. You nailed it right on the head. – Travis Parks May 30 '14 at 13:09
  • hope you find here something interesting for yourself http://euphonious-intuition.com/2012/08/more-complicated-mapping-in-elasticsearch/ – Alex May 30 '14 at 13:29