Running aggregations on case-insensitive keywords in Elasticsearch 5.1

Question

I am trying to run aggregations on some keyword fields in my index. I want to lowercase all the keywords at both index time and search time, but Elasticsearch 5.1 does not support normalizers. I also don't want to index them as text and enable fielddata. What other options are there to accomplish this?

You know, fielddata is not that bad in all cases. If your nodes allow for the extra memory usage from fielddata, why not? – Andrei Stefan, Apr 14 '17 at 16:01
@AndreiStefan I think Elastic doesn't recommend using it, which is why I am sceptical. – Ishank Gulati, Apr 14 '17 at 16:06
Correct, we don't recommend using it because most of the time it doesn't make sense to aggregate on analyzed fields. But the recommendation doesn't come from bad design or a bug, for example; it's just an improvement to memory usage. If your memory usage is fine and you have no other choice, use it. – Andrei Stefan, Apr 14 '17 at 16:09

score 0 · Answer 1 · answered Apr 14 '17 at 15:48

You could use an analyzer made of the keyword tokenizer and lowercase token filter.

PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_keyword": { 
          "type":      "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "my_field": {
          "type":     "text",
          "analyzer": "standard", 
          "fields": {
            "keyword": {
              "type":     "text",
              "analyzer": "my_keyword" 
            }
          }
        }
      }
    }
  }
}

answered Apr 14 '17 at 15:48

Val

We can't apply an analyzer to keyword fields, right? Also, we can't run aggregations on text fields. – Ishank Gulati Apr 14 '17 at 15:51
Ok, the other option is to lowercase your data before sending it to ES. Or upgrade to 5.3 – Val Apr 14 '17 at 15:59
Just lowercase the data before sending it. I don't see the difficulty in this. You have to reindex anyway. – Henley Apr 16 '17 at 14:16
For now I am using fielddata but will lowercase all these fields before deploying to prod. – Ishank Gulati Apr 16 '17 at 18:31
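
The client-side lowercasing suggested in the comments could look roughly like the sketch below. It is only an illustration: the helper name `lowercase_keywords` and the field list are hypothetical, and it assumes the affected fields are mapped as plain `keyword` so that aggregations need neither an analyzer nor fielddata.

```python
from typing import Any, Dict, Iterable

# Hypothetical list of fields to normalize; replace with the keyword
# fields you actually aggregate on.
KEYWORD_FIELDS = ("my_field",)


def lowercase_keywords(
    doc: Dict[str, Any], fields: Iterable[str] = KEYWORD_FIELDS
) -> Dict[str, Any]:
    """Return a shallow copy of doc with the given string fields lowercased."""
    out = dict(doc)
    for name in fields:
        value = out.get(name)
        if isinstance(value, str):
            out[name] = value.lower()
        elif isinstance(value, list):
            # Multi-valued fields: lowercase each string element, keep the rest.
            out[name] = [v.lower() if isinstance(v, str) else v for v in value]
    return out


# Normalize documents before handing them to whatever indexing client is used.
docs = [{"my_field": "Foo"}, {"my_field": ["BAR", "Baz"]}, {"other": "Keep"}]
prepared = [lowercase_keywords(d) for d in docs]
```

The same lowercasing would have to be applied to any term used in a query or filter against those fields, since a `keyword` field matches exact values only.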
