
I am trying to rewrite this specific query to C# NEST, but I'm stuck on defining the filters.

{  
   "settings":{  
      "analysis":{  
         "filter":{  
            "lemmagen_filter_sk":{  
               "type":"lemmagen",
               "lexicon":"sk"
            },
            "synonym_filter":{  
               "type":"synonym",
               "synonyms_path":"synonyms/sk_SK.txt",
               "ignore_case":true
            },
            "stopwords_SK":{  
               "type":"stop",
               "stopwords_path":"stop-words/stop-words-slovak.txt",
               "ignore_case":true
            }
         },
        "analyzer":{  
            "slovencina_synonym":{  
               "type":"custom",
               "tokenizer":"standard",
               "filter":[  
                  "stopwords_SK",
                  "lemmagen_filter_sk",
                  "lowercase",
                  "stopwords_SK",
                  "synonym_filter",
                  "asciifolding"
               ]
            },
            "slovencina":{  
               "type":"custom",
               "tokenizer":"standard",
               "filter":[  
                  "stopwords_SK",
                  "lemmagen_filter_sk",
                  "lowercase",
                  "stopwords_SK",
                  "asciifolding"
               ]
            }
         }
      }
   }
}

I expect to end up with the right client.CreateIndex(...) command with the correct index settings. All I have now is this:

client.CreateIndex(indexName, c => c
    .InitializeUsing(indexConfig)
    .Mappings(m => m
        .Map<T>(mp => mp.AutoMap())));

I cannot find any information on how to do this. I will be grateful for any kind of help.

EDIT:

client.CreateIndex(indexName, c => c
    .InitializeUsing(indexConfig)
    .Settings(s => s
        .Analysis(a => a
            .TokenFilters(t => t
                .UserDefined("lemmagen_filter_sk",
                    new LemmagenTokenFilter { Lexicon = "sk" })
                .Synonym("synonym_filter", ts => ts
                    .SynonymsPath("synonyms/sk_SK.txt")
                    .IgnoreCase(true))
                .Stop("stopwords_sk", tst => tst
                    .StopWordsPath("stop-words/stop-words-slovak")
                    .IgnoreCase(true)))
            .Analyzers(aa => aa
                .Custom("slovencina_synonym", acs => acs
                    .Tokenizer("standard")
                    .Filters("stopwords_SK", "lemmagen_filter_sk", "lowercase", "stopwords_SK", "synonym_filter", "asciifolding"))
                .Custom("slovencina", acs => acs
                    .Tokenizer("standard")
                    .Filters("stopwords_SK", "lemmagen_filter_sk", "lowercase", "stopwords_SK", "asciifolding")))))
    .Mappings(m => m
        .Map<DealItem>(mp => mp.AutoMap()
            .Properties(p => p
                .Text(t => t
                    .Name(n => n.title_dealitem)
                    .Name(n => n.coupon_text1)
                    .Name(n => n.coupon_text2)
                    .Analyzer("slovencina_synonym"))))));
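One thing to watch in the `.Properties` part above (my reading of NEST's fluent mapping semantics, not something stated in the original post): chaining several `.Name(...)` calls on a single `.Text` descriptor does not map three fields; each call overwrites the previous one, so only the last name survives. A minimal sketch that gives each field its own `.Text(...)` entry, reusing `client`, `indexName`, and the `DealItem` POCO from the snippets above:

```csharp
// Sketch (assumes NEST 5.x and the DealItem POCO from the question):
// one .Text(...) entry per field, each with its own name and analyzer,
// instead of chaining several .Name(...) calls on one descriptor.
client.CreateIndex(indexName, c => c
    .Mappings(m => m
        .Map<DealItem>(mp => mp.AutoMap()
            .Properties(p => p
                .Text(t => t.Name(n => n.title_dealitem).Analyzer("slovencina_synonym"))
                .Text(t => t.Name(n => n.coupon_text1).Analyzer("slovencina_synonym"))
                .Text(t => t.Name(n => n.coupon_text2).Analyzer("slovencina_synonym"))))));
```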

This is what I have now, but I'm getting an ERROR when trying to use one of the analyzers:

POST dealitems/_analyze
{
  "analyzer": "slovencina",
  "text":     "Janko kúpil nové topánky"
}

ERROR:

{
  "error": {
    "root_cause": [
      {
        "type": "remote_transport_exception",
        "reason": "[myNode][127.0.0.1:9300][indices:admin/analyze[s]]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "failed to find analyzer [slovencina]"
  },
  "status": 400
}

and GET _settings doesn't show any analyzers.

RESULT: The problem was missing files and wrong paths.

lagugula
  • I think [this](https://www.elastic.co/guide/en/elasticsearch/client/net-api/5.x/writing-analyzers.html#_configuring_a_built_in_analyzer) should help you with your problem. If not, ping me. – Rob Dec 05 '17 at 08:13
  • Hi, I was trying to rewrite it based on that, but the problem is that there is no Filter, just TokenFilter and CharFilter. I think the token filter is the thing to use, but there is no Lemmagen type and I don't know how to set it up with the lexicon argument either. – lagugula Dec 05 '17 at 09:32
  • For the `filter` part, use `TokenFilters`. [Here](https://github.com/elastic/elasticsearch-net/blob/5.5/src/Tests/Analysis/TokenFilters/TokenFilterUsageTests.cs) you will find example usage for ES 5.x. – Rob Dec 05 '17 at 09:48
  • Thanks, I will give it a try. I think it helps, but I'm still not sure how to create my own type `"lemmagen_filter_sk": { "type": "lemmagen", "lexicon": "sk" }`, because there is no Lemmagen type. I'm going to try something like `class LemmagenFilter : ITokenFilter` and then create an implementation with type = "lemmagen", and I will add string lexicon = "sk". – lagugula Dec 05 '17 at 10:53

1 Answer


Indeed, there is no lemmagen token filter available out of the box in NEST. Fortunately, you can easily create your own:

// Custom token filter definition for the lemmagen plugin.
// ITokenFilter comes from Nest; JsonProperty from Newtonsoft.Json.
public class LemmagenTokenFilter : ITokenFilter
{
    public string Version { get; set; }
    public string Type => "lemmagen";

    [JsonProperty("lexicon")]
    public string Lexicon { get; set; }
}


var response = elasticClient.CreateIndex(_defaultIndex, d => d
    .Settings(s => s
        .Analysis(a => a
            .TokenFilters(t => t
                .UserDefined("lemmagen_filter_sk",
                    new LemmagenTokenFilter { Lexicon = "sk" }))))
    ..
    );
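Since `CreateIndex` can fail without throwing, it is worth inspecting the response object as well (a minimal sketch; `response` is the variable from the snippet above, and the property names assume NEST 5.x):

```csharp
// Sketch: check whether index creation actually succeeded and, if not,
// print the server-side reason instead of failing silently.
if (!response.IsValid)
{
    Console.WriteLine(response.ServerError?.ToString()
                      ?? response.DebugInformation);
}
```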

Hope that helps.

Rob
  • That is what I meant, thank you very much, I will try. – lagugula Dec 05 '17 at 10:55
  • I have edited the question; now I'm doing something wrong :( – lagugula Dec 05 '17 at 12:30
  • @lagugula can you try to delete the index and create it one more time? – Rob Dec 05 '17 at 13:14
  • The problem was missing files at the configured paths; I tried to create the index with the analyzers manually via the Kibana console... the API doesn't seem to surface these kinds of errors. – lagugula Dec 05 '17 at 13:23
  • 1
    you can investigate issues through `response.IsValid` and `response.ServerError`. Glad it worked – Rob Dec 05 '17 at 13:28
  • i looked to these, but response was always valid. Thank you very much for help :) – lagugula Dec 05 '17 at 15:49
  • It looks like my mapping does not work, analyzer alone seems fine Mappings(m => m .Map(mp => mp .AutoMap() .Properties(p => p .Text(t => t .Name(n => n.title_dealitem) .Name(n => n.coupon_text1) .Name(n => n.coupon_text2) .Analyzer("slovencina_synonym") .SearchAnalyzer("slovencina_synonym") – lagugula Dec 05 '17 at 15:59
  • Sorry for writing so late, I found dumb mistake which caused these problems. Tokenizer was semi-working because im missing cappitals in - .Stop("stopwords_sk", -and mapping as "stopwords_SK". Thanks for all help. – lagugula Dec 06 '17 at 13:26