I've seen many accounts describing this as relatively straightforward, but I haven't managed to get it working properly. Let's say I have this:

class Car < ActiveRecord::Base
  settings analysis: {
    filter: {
      ngram_filter: { type: "nGram", min_gram: 3, max_gram: 12 }
    },
    analyzer: {
      partial_analyzer: {
        type: "snowball",
        tokenizer: "standard",
        filter: ["standard", "lowercase", "ngram_filter"]
      }
    }
  } do
    mapping do
      indexes :name,                    index_analyzer: "partial_analyzer"
    end
  end
end

And let's say I have a car named "Ford" and I update my index. Now, if I search for "Ford":

Car.tire.search { query { string "Ford" } }

My car is in my results. Now, if I look for "For":

Car.tire.search { query { string "For" } }

My car isn't found anymore. I thought the nGram filter would automatically take care of this for me, but apparently it doesn't. As a temporary solution I'm using the wildcard (*) for such searches, but this is definitely not the best approach, since the min_gram and max_gram definitions are key elements in my search. Can anyone tell me how they solved this?

I'm using Rails 3.2.12 with Ruby 1.9.3. The ElasticSearch version is 0.20.5.

ChuckE

1 Answer

You want to use the custom analyzer instead of the snowball one: Elasticsearch custom analyzer

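The other analyzer types, snowball included, come with a predefined tokenizer and set of filters, so the tokenizer and filter listed under partial_analyzer are ignored and ngram_filter is never applied. Only an analyzer of type "custom" honours them.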
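Applied to the model from the question, the change might look like the sketch below. It only shows the analysis hash, kept as a plain Ruby constant so it can be inspected without Elasticsearch running. CAR_ANALYSIS is a made-up name; everything else is copied from the question, with the analyzer type swapped. I haven't run this against 0.20.5, so treat it as a starting point.

```ruby
# Hypothetical constant name; the content mirrors the question's settings,
# with only the analyzer type changed from "snowball" to "custom".
CAR_ANALYSIS = {
  analysis: {
    filter: {
      ngram_filter: { type: "nGram", min_gram: 3, max_gram: 12 }
    },
    analyzer: {
      partial_analyzer: {
        type: "custom", # a custom analyzer uses the tokenizer/filter below
        tokenizer: "standard",
        filter: ["standard", "lowercase", "ngram_filter"]
      }
    }
  }
}

# In the model, passed to Tire's settings as in the question:
#   settings CAR_ANALYSIS do
#     mapping { indexes :name, index_analyzer: "partial_analyzer" }
#   end
```

The index has to be deleted and rebuilt after changing the settings, since analyzers are fixed when the index is created.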

You probably also want to use the Edge-Ngram filter: Edge-Ngram filter

The difference between Edge-NGram and NGram is that Edge-NGram only sticks to the "edges" of a term, so its grams start at the front (or at the back) of the term. With 3-letter grams, Ford -> [For] instead of -> [For, ord].
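To make the difference concrete, here is a small plain-Ruby approximation of what the two filters emit for one lowercased term. It is not Elasticsearch code; ngrams and edge_ngrams are illustrative helpers, and the 3 to 12 range is taken from the question.

```ruby
# Plain-Ruby approximation of the two token filters; not Elasticsearch code.

# All substrings of `term` with a length between min and max.
def ngrams(term, min, max)
  (min..max).flat_map do |size|
    next [] if size > term.length
    (0..term.length - size).map { |start| term[start, size] }
  end
end

# Only the substrings anchored at the front of `term` (side: "front").
def edge_ngrams(term, min, max)
  upper = [max, term.length].min
  (min..upper).map { |size| term[0, size] }
end

term = "Ford".downcase
puts ngrams(term, 3, 12).inspect       # ["for", "ord", "ford"]
puts edge_ngrams(term, 3, 12).inspect  # ["for", "ford"]
```

For autocompletion the edge variant is usually what you want, because "ord" should not bring up "Ford".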

Some more advanced links on the topic of autocompletion:

Autocompletion with fuzziness (pure elasticsearch, no tire, but very good read)

Another useful question with links provided

Edit

I have a setup very similar to yours, but with a second analyzer for the title and a multi_field that covers both. And because of multi-language support there is an array of names here instead of just a name.

I also specify the search_analyzer, and I use string keys instead of symbols. This is what I actually have:

settings "analysis" => {
    "filter" => {
        "name_ngrams"  => {
            "side"     => "front",
            "max_gram" => 20,
            "min_gram" => 2,
            "type"     => "edgeNGram"
        }
    },
    "analyzer" => {
        "full_name"     => {
            "filter"    => %w(standard lowercase asciifolding),
            "type"      => "custom",
            "tokenizer" => "letter"
        },
        "partial_name"        => {
            "filter"    => %w(standard lowercase asciifolding name_ngrams),
            "type"      => "custom",
            "tokenizer" => "standard"
        }
    }
} do
  mapping do
    indexes :names do
      mapping do
        indexes :name, :type => 'multi_field',
                :fields => {
                    "partial"           => {
                        "search_analyzer" => "full_name",
                        "index_analyzer"  => "partial_name",
                        "type"            => "string"
                    },
                    "title"      => {
                        "type"     => "string",
                        "analyzer" => "full_name"
                    }
                }
      end
    end
  end
end
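Why the separate search_analyzer matters is easier to see with a toy model. The sketch below is plain Ruby, not Tire or Elasticsearch. index_terms, search_terms and match? are made-up helpers that roughly mimic partial_name (lowercase plus front edge n-grams, 2 to 20) and full_name (lowercase only); asciifolding and the exact tokenizer behaviour are left out.

```ruby
# Toy model, not Elasticsearch: terms stored at index time by "partial_name".
def index_terms(text, min = 2, max = 20)
  text.downcase.split(/[^[:alpha:]]+/).reject(&:empty?).flat_map do |word|
    (min..[max, word.length].min).map { |size| word[0, size] }
  end
end

# Terms produced at search time by "full_name": lowercased words, no n-grams.
def search_terms(query)
  query.downcase.split(/[^[:alpha:]]+/).reject(&:empty?)
end

# A document matches when any search term equals one of its indexed terms.
def match?(name, query)
  !(index_terms(name) & search_terms(query)).empty?
end

puts match?("Ford", "For")   # true
puts match?("Ford", "Ford")  # true
puts match?("Ford", "ord")   # false
```

If the query were n-grammed as well, a search for "Fox" would also produce "fo", which would match "Ford". That is why only the index side gets the n-gram filter.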
Milan Köpke
  • I've used your definitions (custom analyzer, edge-n-gram filter) and got the same results: "For" returns nothing, "Ford" returns everything. According to the documentation it should be working as you say, I just can't figure out why it isn't. Are you using Ruby/Tire? – ChuckE Mar 12 '13 at 16:55
  • Yes, I am using tire and ruby. Did you reindex your data with rake tire:import CLASS='Car' FORCE=true ? – Milan Köpke Mar 13 '13 at 09:23
  • I did. Both using the rake task and delete/create/import directly in the console. – ChuckE Mar 13 '13 at 14:55
  • thx a lot man, I don't have time today to validate it (have some stuff to do) but I'll come back to the topic tomorrow, I hope. – ChuckE Mar 13 '13 at 16:07