The default mappings provided by ES, which map a string field as both `text` and `keyword`, do that because it's convenient: the field can then be used in different contexts without having to think too hard about it. It's also a good way of bootstrapping new projects without worrying too much about that aspect until later in the project.
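For illustration, here is a minimal sketch of the shape that dynamic mapping typically produces for a new string field. The field name `title` and the helper `mapped_types` are hypothetical, and the dictionary only mirrors the mapping JSON; nothing here talks to a cluster.

```python
import json

# Shape of the mapping that dynamic mapping typically generates for a new
# string field. "title" is a hypothetical field name used for illustration.
default_string_mapping = {
    "title": {
        "type": "text",
        "fields": {
            "keyword": {"type": "keyword", "ignore_above": 256}
        },
    }
}


def mapped_types(field_mapping):
    """List the main type followed by the types of its sub-fields."""
    sub_fields = field_mapping.get("fields", {})
    return [field_mapping["type"]] + [sub["type"] for sub in sub_fields.values()]


print(json.dumps(default_string_mapping, indent=2))
print(mapped_types(default_string_mapping["title"]))
```

Every such field is indexed twice: once analyzed as `text`, and once verbatim as `title.keyword`.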
However, if you're truly serious about your mappings and the performance of your cluster, you should always give as much thought as possible to why you map a field in a certain way.
Here are a few basic rules (your mileage may vary, and the list is non-exhaustive):
- IDs, codes, keys, etc. that you usually use in exact searches can be mapped as `keyword` only (and/or `wildcard`, depending on your search use cases).
- If you have longer pieces of text closer to natural language that you might want to run full-text searches on, it's usually a good idea to map them as `text`.
- The corollary to the previous rule is that if you know that you'll never want to run full-text searches on some field, don't map it as `text`, as analyzing text fields at indexing time adds a non-negligible overhead.
- ...
As said, the above list is non-exhaustive, but it gives you some pointers. The bottom line is that you need to think hard about your data and what you want to do with it. Once you know the use cases you need to support, you'll know how to map your fields. I would never leave a default text/keyword mapping in place if there's no reason for it.
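The rules listed above could be sketched as an explicit mapping like the one below. All field names (`order_id`, `tracking_code`, `description`, `title`) and the helper `has_text_and_keyword` are assumptions made for illustration; a body of this shape is what you would send when creating the index, but the snippet itself only builds and inspects the dictionary.

```python
import json

# Hypothetical explicit mapping; field names are illustrative only.
explicit_mapping = {
    "mappings": {
        "properties": {
            # Exact-match identifier: keyword only, no analysis at index time.
            "order_id": {"type": "keyword"},
            # Code searched with patterns: the wildcard field type.
            "tracking_code": {"type": "wildcard"},
            # Natural-language content: text only, analyzed for full-text search.
            "description": {"type": "text"},
            # Deliberate exception: needs full-text search AND exact matching,
            # so the text/keyword multi-field is kept on purpose.
            "title": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            },
        }
    }
}


def has_text_and_keyword(spec):
    """True if a field is mapped as text with at least one keyword sub-field."""
    sub_fields = spec.get("fields", {})
    return spec.get("type") == "text" and any(
        sub.get("type") == "keyword" for sub in sub_fields.values()
    )


properties = explicit_mapping["mappings"]["properties"]
# Fields that still carry the text/keyword pair and therefore need a reason.
dual_mapped = sorted(
    name for name, spec in properties.items() if has_text_and_keyword(spec)
)

print(json.dumps(explicit_mapping, indent=2))
print(dual_mapped)
```

A check like `has_text_and_keyword` can be run over an existing mapping to list the fields that still use the default pair, so each one can be reviewed against its actual use cases.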