
I have a Company type I've created. Inside of that Company type I have a field called "Summary". How can I add multiple index analyzers to this field?

I briefly looked into using the Yakaz plugin, but it doesn't appear I can use that with NEST.

The reasoning behind this is that sometimes users will search for company names with a period in their query, other times they won't include the period. I'd like to do a partial match using ngrams on both the company name with and without punctuation. I'm currently using a stopwords filter to remove punctuation.

Properties of the Summary field (having multiple index analyzers throws an error):

[ElasticProperty(IndexAnalyzer = "partial_match", IndexAnalyzer = "partial_match_no_punctuation", SearchAnalyzer = "full_match")]
public string Summary { get; set; }

Mapping:

private static void CreateMapping(ElasticClient client)
{
    var partialMatchNoPunctuation = new CustomAnalyzer
    {
        Filter = new List<string> { "standard", "lowercase", "asciifolding", "punctuation_filter", "name_ngrams" },  //Apply all filters before ngram
        Tokenizer = "standard"
    };
    var partialMatch = new CustomAnalyzer
    {
        Filter = new List<string> { "standard", "lowercase", "asciifolding", "name_ngrams" },  //Apply all filters before ngram
        Tokenizer = "standard"
    };

    var fullMatch = new CustomAnalyzer
    {
        Filter = new List<string> { "standard", "lowercase", "asciifolding" },
        Tokenizer = "standard"
    };

    client.CreateIndex(Settings.Default.IndexName, c => c
        .Analysis(descriptor => descriptor
            .TokenFilters(bases => bases
                .Add("name_ngrams", new NgramTokenFilter
                {
                    MaxGram = 11,
                    MinGram = 3
                })
                .Add("punctuation_filter", new StopTokenFilter
                {
                    Stopwords = new List<string> {"."}
                })
                )
            .Analyzers(bases => bases
                .Add("partial_match", partialMatch)
                .Add("partial_match_no_punctuation", partialMatchNoPunctuation)
                .Add("full_match", fullMatch))
        )
    );
}

Alternatively if there's a way to do this in a single analyzer I'm open to suggestions.

EDIT:

My class name is "ElasticSearchProject". I'd like it to be stored as a type called "Project". I believe my attempt at this is what is causing the errors. When I get the mapping for type Project, it only has the partial match analyzer applied to it.

This is the only ES property still applied to my class:

[ElasticType(Name = "Project")]

Multi-field mapping:

.AddMapping<ElasticSearchProject>(m => m
    .MapFromAttributes()
    .Properties(project=>project
        .MultiField(mf=>mf
            .Name("Project")
            .Fields(f=>f
                .Number(s=>s.Name(o=>o.Id).Index(NonStringIndexOption.no))
                .String(s => s.Name(o => o.Summary).IndexAnalyzer("partial_match"))
                .String(s => s.Name(o => o.Summary).IndexAnalyzer("partial_match_no_punctuation"))
            ))))
Brandon

1 Answer


First, to answer your question: you cannot add multiple analyzers to a single field. However, you can use the multi field type to map multiple versions of the same field and apply a different analyzer to each of them. Check out this answer for how to accomplish this with NEST.

Regarding searching with and without punctuation: if you use the same analyzer for both indexing and searching, it won't matter, because the analysis that was applied to the field during indexing will also be applied to the user's query.

Example:

Foo.Bar will be indexed as foobar.

If a user searches either Foo.Bar or FooBar, the search analyzer will transform it to foobar, and a match will be found because the field was also indexed as foobar.
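To make that concrete, here is a minimal sketch in plain C# that imitates the idea. It does not call Elasticsearch or NEST; `AnalysisSketch`, `Normalize`, and `NGrams` are hypothetical names, and it assumes the analyzer really does lowercase, strip punctuation, and treat the whole name as one token before ngramming (3 to 11, as in the question).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an analyzer chain; not Elasticsearch's actual analysis.
public static class AnalysisSketch
{
    // Lowercase the input and drop punctuation, so "Foo.Bar" and "FooBar" normalize alike.
    public static string Normalize(string input)
    {
        return new string(input.ToLowerInvariant()
                               .Where(c => !char.IsPunctuation(c))
                               .ToArray());
    }

    // Emit every substring whose length is between minGram and maxGram (inclusive).
    public static List<string> NGrams(string token, int minGram, int maxGram)
    {
        var grams = new List<string>();
        for (int length = minGram; length <= maxGram; length++)
        {
            for (int start = 0; start + length <= token.Length; start++)
            {
                grams.Add(token.Substring(start, length));
            }
        }
        return grams;
    }

    public static void Main()
    {
        // Index side: normalize, then break the single token into ngrams.
        var indexed = new HashSet<string>(NGrams(Normalize("Foo.Bar"), 3, 11));

        // Search side: the same normalization, so the period no longer matters.
        Console.WriteLine(indexed.Contains(Normalize("Foo.Bar"))); // True
        Console.WriteLine(indexed.Contains(Normalize("FooBar")));  // True
        Console.WriteLine(indexed.Contains(Normalize("oo.B")));    // True (partial match on "oob")
    }
}
```

Because both sides go through the same `Normalize` step, the query text lands on terms that already exist in the indexed set whether or not the user typed the period.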

I think part of your issue is that you are trying to use full_match as your search analyzer, and partial_match_no_punctuation and partial_match as your index analyzer. Try and reconcile them into one (remove punctuation, ngrams), and use that for both your search and index analyzers. If you find that you still need multiple analyzers, then look into the multi field type that I mentioned above.

Hope that helps.

EDIT: Per your update, the issue with your multi field mapping is that you are assigning the same name to both fields. You are also naming the multi field "Project", which is the name of your type; you probably want to name it "summary" instead. Finally, the Id field should not be part of the Summary multi field. Try this instead:

.AddMapping<ElasticSearchProject>(m => m
    .MapFromAttributes()
    .Properties(project => project
        .MultiField(mf => mf
            .Name(o => o.Summary)
            .Fields(f => f
                .String(s => s.Name(o => o.Summary).Analyzer("partial_match"))
                .String(s => s.Name(o => o.Summary.Suffix("no_punctuation")).Analyzer("partial_match_no_punctuation"))
            )))));

This will create two fields in your mapping:

summary with the partial_match analyzer.

summary.no_punctuation with the partial_match_no_punctuation analyzer.

Greg Marzouka
  • I must use different analyzers for search and index because I'll be querying several types at once. Including the period will help relevance on other fields. As for MultiField mapping, I'm updating my post now. I'm following the post you linked, but mine still only recognizes one analyzer. – Brandon Jun 09 '14 at 17:12
  • One thing that might fix this, you used "Suffix:raw". Is that what helps ElasticSearch know that it's the same field, but with a different name? – Brandon Jun 09 '14 at 17:21
  • Yea, you are trying to apply the same name to both fields. See my updated answer regarding your mapping. The Suffix method just appends whatever text you pass to it to the end of your field. – Greg Marzouka Jun 09 '14 at 17:47