I'm trying to use Elasticsearch for partial matches on multiple fields using an NGram tokenizer, but I get 0 results after I build the index. This is not coming naturally to me, and I can't get NGram working for even one field. This is a passion project for me, and I really want the new search to handle partial word matches. I tried using fuzziness, but it scored incorrect matches too high.
Index creation:
    var nGramFilters = new List<string> { "lowercase", "asciifolding", "nGram_filter" };

    Client.Indices.Create(CurrentIndexName, c => c
        .Settings(st => st
            .Analysis(an => an // https://stackoverflow.com/questions/38065966/token-chars-mapping-to-ngram-filter-elasticsearch-nest
                .Analyzers(anz => anz
                    .Custom("ngram_analyzer", cc => cc
                        .Tokenizer("ngram_tokenizer")
                        .Filters(nGramFilters))
                )
                .Tokenizers(tz => tz
                    .NGram("ngram_tokenizer", td => td
                        .MinGram(2)
                        .MaxGram(20)
                        .TokenChars(
                            TokenChar.Letter,
                            TokenChar.Digit,
                            TokenChar.Punctuation,
                            TokenChar.Symbol
                        )
                    )
                )
            )
        )
        .Map<Package>(map => map
            .AutoMap()
            .Properties(p => p
                .Text(t => t
                    .Name(n => n.Title)
                    .Fields(f => f
                        .Keyword(k => k
                            .Name("keyword")
                            .IgnoreAbove(256)
                        )
                        .Text(tt => tt
                            .Name("ngram")
                            .Analyzer("ngram_analyzer")
                        )
                    )
                )
                .Text(t => t
                    .Name(n => n.Summary)
                    .Fields(f => f
                        .Keyword(k => k
                            .Name("keyword")
                            .IgnoreAbove(256)
                        )
                        .Text(tt => tt
                            .Name("ngram")
                            .Analyzer("ngram_analyzer")
                        )
                    )
                )
                .Text(t => t
                    .Name(n => n.PestControlledBy)
                    .Fields(f => f
                        .Keyword(k => k
                            .Name("keyword")
                            .IgnoreAbove(256)
                        )
                        .Text(tt => tt
                            .Name("ngram")
                            .Analyzer("ngram_analyzer")
                        )
                    )
                )
                .Text(t => t
                    .Name(n => n.PesticideControlsThesePests)
                    .Fields(f => f
                        .Keyword(k => k
                            .Name("keyword")
                            .IgnoreAbove(256)
                        )
                        .Text(tt => tt
                            .Name("ngram")
                            .Analyzer("ngram_analyzer")
                        )
                    )
                )
                .Text(t => t
                    .Name(n => n.PesticideInstructions)
                    .Fields(f => f
                        .Keyword(k => k
                            .Name("keyword")
                            .IgnoreAbove(256)
                        )
                        .Text(tt => tt
                            .Name("ngram")
                            .Analyzer("ngram_analyzer")
                        )
                    )
                )
                .Text(t => t
                    .Name(n => n.PesticideActiveIngredients)
                    .Fields(f => f
                        .Keyword(k => k
                            .Name("keyword")
                            .IgnoreAbove(256)
                        )
                        .Text(tt => tt
                            .Name("ngram")
                            .Analyzer("ngram_analyzer")
                        )
                    )
                )
                .Text(t => t
                    .Name(n => n.PesticidesContainingThisActiveIngredient)
                    .Fields(f => f
                        .Keyword(k => k
                            .Name("keyword")
                            .IgnoreAbove(256)
                        )
                        .Text(tt => tt
                            .Name("ngram")
                            .Analyzer("ngram_analyzer")
                        )
                    )
                )
                .Text(t => t
                    .Name(n => n.PesticideSafeOn)
                    .Fields(f => f
                        .Keyword(k => k
                            .Name("keyword")
                            .IgnoreAbove(256)
                        )
                        .Text(tt => tt
                            .Name("ngram")
                            .Analyzer("ngram_analyzer")
                        )
                    )
                )
                .Text(t => t
                    .Name(n => n.PesticideNotSafeOn)
                    .Fields(f => f
                        .Keyword(k => k
                            .Name("keyword")
                            .IgnoreAbove(256)
                        )
                        .Text(tt => tt
                            .Name("ngram")
                            .Analyzer("ngram_analyzer")
                        )
                    )
                )
            )
        )
    );
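One thing worth ruling out is that the index creation itself failed, since the code above never looks at the response. Below is a minimal sketch of how that could be checked, assuming NEST 7.x and a reachable cluster. `IndexDiagnostics`, `ReportResult` and `PrintTokens` are hypothetical helper names, not part of my project, and I have not confirmed what they report for this index.

```csharp
using System;
using Nest;

// Hypothetical diagnostic helpers (names are illustrative only).
public static class IndexDiagnostics
{
    // Returns true when the call succeeded; otherwise prints NEST's
    // debug information, which includes the server error if there was one.
    public static bool ReportResult(IResponse response)
    {
        if (response.IsValid)
        {
            return true;
        }

        Console.WriteLine(response.DebugInformation);
        return false;
    }

    // Runs the index's "ngram_analyzer" over a sample text and prints
    // each token it produces, to see what actually gets indexed.
    public static void PrintTokens(IElasticClient client, string indexName, string text)
    {
        var analyzeResponse = client.Indices.Analyze(a => a
            .Index(indexName)
            .Analyzer("ngram_analyzer")
            .Text(text));

        if (!ReportResult(analyzeResponse))
        {
            return;
        }

        foreach (var token in analyzeResponse.Tokens)
        {
            Console.WriteLine(token.Token);
        }
    }
}
```

Usage would be along the lines of `IndexDiagnostics.ReportResult(Client.Indices.Create(...))` followed by `IndexDiagnostics.PrintTokens(Client, CurrentIndexName, "petals")`.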
Query:
    var result = _client.Search<Package>(s => s
        .From((form.Page - 1) * form.PageSize)
        .Size(form.PageSize)
        .Query(query => query
            .MultiMatch(m => m
                .Fields(f => f
                    .Field(p => p.Title.Suffix("ngram"), 1.5)
                    .Field(p => p.Summary.Suffix("ngram"), 1.1)
                    .Field(p => p.PestControlledBy.Suffix("ngram"), 1.0)
                    .Field(p => p.PesticideControlsThesePests.Suffix("ngram"), 1.0)
                    .Field(p => p.PesticideInstructions.Suffix("ngram"), 1.0)
                    .Field(p => p.PesticideActiveIngredients.Suffix("ngram"), 1.0)
                    .Field(p => p.PesticidesContainingThisActiveIngredient.Suffix("ngram"), 1.0)
                    .Field(p => p.PesticideSafeOn.Suffix("ngram"), 1.0)
                    .Field(p => p.PesticideNotSafeOn.Suffix("ngram"), 1.0)
                )
                .Operator(Operator.Or) // https://stackoverflow.com/questions/46139028/elasticsearch-how-to-do-a-partial-match-from-your-query
                .Query(form.Query)
            )
        )
        .Highlight(h => h
            .PreTags("<strong>")
            .PostTags("</strong>")
            .Encoder(HighlighterEncoder.Html) // https://github.com/elastic/elasticsearch-net/issues/3091
            .Fields(
                fs => fs
                    .Field(f => f.Summary.Suffix("ngram")),
                fs => fs
                    .Field(p => p.PestControlledBy.Suffix("ngram")),
                fs => fs
                    .Field(p => p.PesticideControlsThesePests.Suffix("ngram")),
                fs => fs
                    .Field(p => p.PesticideInstructions.Suffix("ngram")),
                fs => fs
                    .Field(p => p.PesticideActiveIngredients.Suffix("ngram")),
                fs => fs
                    .Field(p => p.PesticidesContainingThisActiveIngredient.Suffix("ngram")),
                fs => fs
                    .Field(p => p.PesticideSafeOn.Suffix("ngram")),
                fs => fs
                    .Field(p => p.PesticideNotSafeOn.Suffix("ngram"))
                    .NumberOfFragments(10)
                    .FragmentSize(250)
            )
        )
    );
Am I even in the right ballpark? I tried using the default analyzer, but a search for "cat dandelion" does not match "cat's ear dandelion", and similar cases fail too. With the default analyzer the whole word has to match, but I want partial matches so that "petal" also finds "petals". Any step in the right direction is appreciated. I'm completely new to Elasticsearch and NEST and have only been working with them for a week or so.
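To make the behaviour I'm after concrete, here is a minimal sketch in plain C# (no Elasticsearch involved) of what I understand character n-grams to be. `NGramSketch` and `NGrams` are hypothetical names for illustration. The sketch handles a single word and only lowercases it, so it ignores how the real tokenizer splits on `token_chars` and what the `asciifolding` filter does.

```csharp
using System;
using System.Collections.Generic;

public static class NGramSketch
{
    // Returns every substring of `word` whose length is between
    // minGram and maxGram (inclusive), lowercased.
    public static HashSet<string> NGrams(string word, int minGram, int maxGram)
    {
        var grams = new HashSet<string>();
        var lower = word.ToLowerInvariant();

        for (var start = 0; start < lower.Length; start++)
        {
            for (var len = minGram; len <= maxGram && start + len <= lower.Length; len++)
            {
                grams.Add(lower.Substring(start, len));
            }
        }

        return grams;
    }

    public static void Main()
    {
        var indexed = NGrams("petals", 2, 20); // what I expect the index to hold
        var query = NGrams("petal", 2, 20);    // what I expect the search to produce

        Console.WriteLine(indexed.Contains("petal"));  // True
        Console.WriteLine(query.IsSubsetOf(indexed));  // True
    }
}
```

Since every gram of "petal" also appears among the grams of "petals", I would expect a query for "petal" against the `ngram` sub-field to find a document containing "petals".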