ElasticSearch with hunspell analyzer

Question

I'd like to create an index in ElasticSearch that stores a specific type of data with several string fields. The language is Hungarian.

I ran an HTTP PUT command with the following body:

{
    "settings" : {  
        "analysis" : {
            "analyzer" : {
                "hu" : {
                    "tokenizer" : "standard",
                    "filter" : [ "lowercase", "hu_HU" ]         
                }
            },
            "filter" : {
                "hu_HU" : {
                    "type" : "hunspell",
                    "locale" : "hu_HU",
                    "language" : "hu_HU"
                }
            }       
        }
    },
    "mappings": {
        "printedArticle": {
            "_source": {"enabled": false},
            "properties": {
                "_id": {"type": "string", "store": true},
                "mysqlid": {"type": "long", "store": false},
                "publishDate": {"type": "date", "format": "dateOptionalTime", "store": false},
                "title": {"type": "string", "analyzer": "hu", "analyze": true, "store": false},
                "lead": {"type": "string", "analyzer": "hu", "analyze": true, "store": false},
                "content": {"type": "string", "analyzer": "hu", "analyze": true, "store": false},
                "participants": {"type": "string", "analyzer": "hu", "analyze": true, "store": false},
                "authors": {"type": "string", "analyzer": "hu", "analyze": true, "store": false},
                "subtitle": {"type": "string", "analyzer": "hu", "analyze": true, "store": false}
            }
        }
    }   
}

Then I inserted one record with some test text. If I run a search through the Elastic API with a GET request like this:

http://localhost:9200/mf_pa/_search?q=MYTESTTEXT

it finds my record only if my test text exactly equals one of the words in my record.

I tried to analyze some similar text through the analysis API:

http://localhost:9200/mf_pa/_analyze?analyzer=hu&text=My text to tokenize

and it tokenized my test text properly. Based on this, I'd expect that if I put one of the resulting tokens into my search query, it would find the record, but it does not.

As an English example: my text is 'unforgettable' and my query is 'forget'. What should I do to find the record?

score 0 · Accepted Answer · answered Sep 23 '15 at 15:48

If the analyzer tests out using the Analyze API, it should also work in the mapping. Here are some things to check:

Make sure the mapping was applied successfully: GET /mf_pa/_mapping

For example, "analyze": true should be "index": "analyzed".

Make sure that the test document was actually indexed as type printedArticle: GET /mf_pa/_search should return your test doc showing "_type": "printedArticle".

You can also use the Analyze API to check how text is analyzed against a specific field (to ensure the analyzer is correctly applied to that field), e.g. GET /mf_pa/_analyze/?field=title&text=A kőszivű ember fiai
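
The mapping fix mentioned above can be sketched as follows. This is only an illustration, assuming Python is used to generate the "properties" part of the request body; the helper names text_field and build_properties are hypothetical, the field names and the "hu" analyzer come from the question, and the "settings" block from the question is assumed to stay unchanged.

```python
import json

# Text fields from the question's mapping that should use the "hu" analyzer.
TEXT_FIELDS = ["title", "lead", "content", "participants", "authors", "subtitle"]


def text_field():
    """One analyzed string field (hypothetical helper for this sketch)."""
    # "index": "analyzed" replaces the "analyze": true key from the question.
    return {"type": "string", "analyzer": "hu", "index": "analyzed", "store": False}


def build_properties():
    """Rebuild the "properties" object of the printedArticle type."""
    props = {
        "_id": {"type": "string", "store": True},
        "mysqlid": {"type": "long", "store": False},
        "publishDate": {"type": "date", "format": "dateOptionalTime", "store": False},
    }
    for name in TEXT_FIELDS:
        props[name] = text_field()
    return props


mapping = {
    "printedArticle": {
        "_source": {"enabled": False},
        "properties": build_properties(),
    }
}

# Print the JSON that would go under "mappings" in the PUT body.
print(json.dumps(mapping, indent=4))
```

After recreating the index with this body, GET /mf_pa/_mapping should show the analyzer on each of the text fields.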

Peter Dixon-Moses

Thanks for your reply! First I noticed that my test data was not inserted correctly (I inserted into printedArticles instead of printedArticle). I also checked the mapping and changed `"analyze":true` to `"index": "analyzed"`. I also checked the analyzer and it turned out that it analyzes my text correctly. I ran a search query where I specified the field: `GET /mf_pa/_search?field=title&text=MYTESTTEXT` and it was successful. So I guess my question is how to search in all fields? – maestro Sep 24 '15 at 08:42
It also turned out that if I specify the `text` parameter for the search, e.g. `/mf_pa/_search?text=MYTESTTEXT`, it gives back the correct result, so my search query was wrong. Thanks for your reply, you pointed out that my mapping was wrong and that I used the search API in the wrong way... – maestro Sep 24 '15 at 08:46
For some strange reason, on the ElasticSearch web page all search queries are shown as `GET` instead of `POST` ... if I run a search as a `POST` it works. – maestro Sep 24 '15 at 13:09
Found POST as solution based on this question: [http://stackoverflow.com/questions/26502397/querying-elasticsearch-returns-all-documents] – maestro Sep 24 '15 at 13:16
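
On the "how to search in all fields" question raised in the comments, here is a hedged sketch rather than a confirmed fix. As far as I know, URI search takes the query in the `q` parameter (optionally as `field:term`), while `text` and `field` are not URI search parameters, so a request using them may simply match every document. A bare `q=` is matched against the `_all` field, which by default is not analyzed with the "hu" analyzer. One option is a `multi_match` query sent in the request body, which runs the term against each listed field with that field's own analyzer. The helper names below are hypothetical; the host, index name and field names come from the question.

```python
import json
from urllib.parse import urlencode

BASE = "http://localhost:9200/mf_pa"  # host and index name from the question
TEXT_FIELDS = ["title", "lead", "content", "participants", "authors", "subtitle"]


def uri_search_url(term, field=None):
    """Build a URI search; `q` carries the query, `field:term` limits it to one field."""
    q = f"{field}:{term}" if field else term
    return f"{BASE}/_search?{urlencode({'q': q})}"


def multi_field_body(term):
    """Request body that searches every text field using each field's own analyzer."""
    return {"query": {"multi_match": {"query": term, "fields": TEXT_FIELDS}}}


print(uri_search_url("MYTESTTEXT", field="title"))
print(json.dumps(multi_field_body("MYTESTTEXT"), indent=4))
```

The printed body would be sent to /mf_pa/_search as the request body, for example with POST as the last comment suggests.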
