Elasticsearch only finding hits with ".keyword" appended
I'm having a terrible time querying an Elasticsearch 5 instance full of fluentd log entries that I imported from an older Elasticsearch instance running version 1.7. Queries through Kibana for even the simplest things frequently time out, and I'm completely in the dark about where to look to investigate potential performance issues. A sampling of the mappings for the index I'm querying looks like this:
=> {"@log_name"=>{"type"=>"text", "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}},
"@timestamp"=>{"type"=>"date"},
"@version"=>{"type"=>"text", "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}},
"action"=>{"type"=>"text", "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}},
"api"=>{"type"=>"boolean"},
"controller"=>{"type"=>"text", "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}},
"db"=>{"type"=>"float"},
"duration"=>{"type"=>"float"},
"error"=>{"type"=>"text", "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}},
"filtered_params"=>{"type"=>"text", "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}},
"user"=>
{"properties"=>
{"email"=>{"type"=>"text", "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}},
"snowflake_id"=>{"type"=>"text", "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}},
"snowflake_uid"=>{"type"=>"text", "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}},
"type"=>{"type"=>"text", "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}},
...
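(For reference, the same information can be pulled straight from the mapping API, assuming the same localhost:9200 endpoint used in the query below:)

# Dump the full mapping for the index; ?pretty makes the JSON readable
curl -s -XGET 'localhost:9200/logstash-2017.08.15/_mapping?pretty'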
With that in place, I can query the index with curl, using something like the following to return the total number of documents found:
curl -s -XGET 'localhost:9200/logstash-2017.08.15/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "user.email": "user@example.com"
          }
        }
      ]
    }
  }
}
' | jq ".hits.total | length"
0
Meaning that 0 documents were found. However, if I replace the user.email term with user.email.keyword, the query returns a total of 40:
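curl -s -XGET 'localhost:9200/logstash-2017.08.15/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "user.email.keyword": "user@example.com"
          }
        }
      ]
    }
  }
}
' | jq ".hits.total | length"
40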
I guess my main question is: How do I know if my mappings are correct for this data? (For the imported data, the mappings were created automatically at insert time, and I'm assuming that going forward they will continue to be created automatically.)
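In case it helps narrow things down, my best idea so far for verifying this is to compare the live mapping of a single field against whatever index template is in place (a sketch using the standard field-mapping and template endpoints, with the index and field names from above; I'm not sure this tells the whole story):

# Show the live mapping for just the user.email field
curl -s -XGET 'localhost:9200/logstash-2017.08.15/_mapping/field/user.email?pretty'

# Show the index templates that will shape mappings for future daily indices
curl -s -XGET 'localhost:9200/_template?pretty'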