
I apparently misunderstood how nGram works in Elasticsearch. I want to be able to efficiently search for a substring, so that typing 'loud' still finds words like 'clouds'. I have my nGram tokenizer set up with min_gram=2 and max_gram=10.

Apparently, nGram also splits up the search term ('loud') into 'lo', 'ou', 'ud', 'lou', 'oud', and 'loud'. In some cases this is nice, because it will find 'louder' if I search for 'cloud'. Generally, though, I think it just confuses my users.

Is there a way to prevent Elasticsearch from splitting up the search term? I tried using quotes in the query string, but that doesn't seem to work.

Travis Parks

1 Answer


You should specify two separate analyzers for indexing and for searching in your mapping, using the index_analyzer and search_analyzer settings. The index analyzer is the same as the search analyzer, but with the nGram filter added.
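
A minimal sketch of such a mapping might look like the following. It uses the legacy index_analyzer/search_analyzer field settings (current at the time of this answer; later Elasticsearch versions renamed index_analyzer to analyzer). The index, field, and analyzer names here are illustrative, not from the question:

```json
PUT /my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_ngram_filter": {
          "type": "nGram",
          "min_gram": 2,
          "max_gram": 10
        }
      },
      "analyzer": {
        "my_ngram_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "my_ngram_filter"]
        },
        "my_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "title": {
          "type": "string",
          "index_analyzer": "my_ngram_analyzer",
          "search_analyzer": "my_search_analyzer"
        }
      }
    }
  }
}
```

With this setup, 'clouds' is indexed as its n-grams (including 'loud'), but a query for 'loud' is analyzed only by the search analyzer and stays a single token, so it matches without being split into sub-grams.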

Alex
  • I realized last night that all I had to do was use a standard analyzer for the search analyzer. I was using the same analyzer for both indexing and searching, which was causing my problem. You nailed it right on the head. – Travis Parks May 30 '14 at 13:09
  • hope you find here something interesting for yourself http://euphonious-intuition.com/2012/08/more-complicated-mapping-in-elasticsearch/ – Alex May 30 '14 at 13:29