match_phrase doesn't work on analyzed index

Question

I did read this post but didn't quite understand how to apply it.

My index uses a custom analyzer with stemming, and I want to know whether I can search for phrases without also matching their inflectional forms.

es.indices.create(index='bhar', body={
    "settings": {
        "analysis": {
            "analyzer": {
                "my_stop_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["english_possessive_stemmer", "lowercase", "english_stop", "english_stemmer"]
                }
            },
            "filter": {
                "english_stop": {
                    "type": "stop",
                    "stopwords": "_english_"
                },
                "english_stemmer": {
                    "type": "stemmer",
                    "name": "english"
                },
                "english_possessive_stemmer": {
                    "type": "stemmer",
                    "language": "possessive_english"
                }
            }
        }
    },
    "mappings": {
        "my_type": {
            "properties": {
                "test": {
                    "type": "text",
                    "analyzer": "my_stop_analyzer"
                }
            }
        }
    }
})


One record contains the value "picks towering". When I search for "pick tower", `match_phrase` still returns that record. This is the query I wrote:

scroll = elasticsearch.helpers.scan(es, query={
    "query": {
        "match_phrase": {
            "test": {
                "query": "pick tower"
            }
        }
    }
}, index='bhar', scroll='100s')

Is there any way I can get only the exact match of the phrase? Thank you

score 1 · Answer 1 · answered May 10 '17 at 07:55


Your analyzer changes the terms that get indexed (stemming, lowercasing, stop-word removal), so the analyzed field has no way to keep the original text. My suggestion is to change your mapping slightly by adding a `keyword` sub-field:

    "test": {
      "type": "text",
      "analyzer": "my_stop_analyzer",
      "fields": {
        "keyword": {
          "type": "keyword"
        }
      }
    }

Then add the "exact matching" clause to your queries:

"match_phrase": {
  "test.keyword": "picks towering"
}

or

  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "test.keyword": "picks towering"
          }
        },
        {
          "match_phrase": {
            "test": "picks towering"
          }
        }
      ]
    }
  }

The second form is meant to sit alongside whatever other queries you already have, without interfering with them.
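
A minimal Python sketch of that second query, wrapped so the body can be passed to the `elasticsearch.helpers.scan` call from the question. The helper name `exact_or_analyzed_phrase` is hypothetical; it only builds the request body and does not contact a cluster.

```python
def exact_or_analyzed_phrase(phrase, field="test"):
    """Build a bool/should body: an exact match on the keyword sub-field,
    or an analyzed phrase match on the text field itself."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Sub-field added by the mapping change above (not analyzed).
                    {"match_phrase": {field + ".keyword": phrase}},
                    # Original analyzed field, still matches stemmed forms.
                    {"match_phrase": {field: phrase}},
                ]
            }
        }
    }


body = exact_or_analyzed_phrase("picks towering")

# Assumed usage with the client from the question (needs a running cluster):
# scroll = elasticsearch.helpers.scan(es, query=body, index='bhar', scroll='100s')
```

Further clauses, such as a separate query for inflectional forms, can be appended to the `should` list in the same way.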

answered May 10 '17 at 07:55

Andrei Stefan


I also need to query the same field with inflectional forms in another script. If I change my mapping would that work? – sleepophile May 10 '17 at 10:07
The mapping I provided just adds something to that field: a sub-field and the query I added to the initial query uses the sub-field. Your "inflectional forms in another script" query can be added to my suggested `bool` query as another statement. – Andrei Stefan May 10 '17 at 10:14
Can I index the field twice? Once with the custom analyzer and again with standard analyzer? – sleepophile May 10 '17 at 10:24
`"match_phrase": { "test.keyword": "Picks Tower"}` This gave me no output – sleepophile May 10 '17 at 10:26
For what value in the document does that query not match? – Andrei Stefan May 10 '17 at 10:30
`test.keyword` should help me make exact text match right? Even though 'Picks Tower' exists in the document, I got no result. – sleepophile May 10 '17 at 10:33
If you use `"term": { "test.keyword": "Picks Tower"}` does it match? – Andrei Stefan May 10 '17 at 10:38
No. It still doesn't match – sleepophile May 10 '17 at 11:20
What is the exact value you put in your `test` field? – Andrei Stefan May 10 '17 at 11:23
'Picks Tower the extension' Just a random value – sleepophile May 10 '17 at 11:27

When you index a field as `keyword`, the field value doesn't get tokenized. So only the full string 'Picks Tower the extension' will match the document. @bhargavi – Archit Saxena May 11 '17 at 05:52
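
On the earlier question about indexing the field twice: a minimal sketch, assuming a second sub-field analyzed with the built-in `standard` analyzer next to the `keyword` one. The sub-field name `unstemmed` and both helper functions are hypothetical and not part of the answer. Because `standard` tokenizes and lowercases but does not stem, a `match_phrase` on that sub-field should find 'Picks Tower' inside 'Picks Tower the extension', while 'pick tower' should no longer match 'picks towering'.

```python
def test_field_mapping(analyzer="my_stop_analyzer"):
    """Mapping fragment for the `test` property with two sub-fields:
    `keyword` (whole value, untokenized) and `unstemmed` (tokenized, no stemming)."""
    return {
        "test": {
            "type": "text",
            "analyzer": analyzer,  # stemmed field, still usable for inflectional search
            "fields": {
                "keyword": {"type": "keyword"},
                # Hypothetical sub-field name; `standard` is a built-in analyzer.
                "unstemmed": {"type": "text", "analyzer": "standard"},
            },
        }
    }


def unstemmed_phrase_query(phrase, field="test.unstemmed"):
    """Phrase query against the unstemmed sub-field."""
    return {"query": {"match_phrase": {field: {"query": phrase}}}}


properties = test_field_mapping()
body = unstemmed_phrase_query("Picks Tower")
```

The `properties` dict would replace the `"properties"` value in the index-creation call from the question; existing documents need to be reindexed before the new sub-field contains any terms.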
