I am looking for a query that will return the value which is missing in the documents from the given list of values. For example, there is a country field in the documents with values USA, Dubai, Singapore, Japan. Now I want to tell elastic search that I am giving you the list of countries(USA, Dubai, Russia), you give me the output which tells Russia is not part of any document. Is this possible?
1 Answers
You need to do a query like the one below that will only select documents with USA, Dubai and Russia and then aggregate the country
values.
{
"size": 0,
"query": {
"terms": {
"country": [
"USA",
"Dubai",
"Russia"
]
}
},
"aggs": {
"countries": {
"terms": {
"field": "country"
}
}
}
}
In the results, you're going to get buckets for all countries that are present (i.e. USA and Dubai) and no bucket for Russia.
You can then do a simple set arithmetics by subtracting the input array with the one you got from the aggregation results and you'll find what you need, i.e.:
[USA, Dubai, Russia] - [USA, Dubai] = [Russia]
UPDATE: If you want to do all the above in a single country you can leverage the scripted_metric
aggregation.
map_script
is going to run for each document on a shard and store all present countries in the temporary variable state.countries
.
reduce_script
is going to run on the coordinating node and receives the results of all shards. That script is simply comparing which countries in the params.countries
array are present and is going to only output the countries that are not present.
POST country/_search
{
"size": 0,
"query": {
"terms": {
"country": [
"USA",
"Dubai",
"Russia"
]
}
},
"aggs": {
"missing_countries": {
"scripted_metric": {
"init_script": "state.countries = [:]",
"map_script": """
def country = doc['country.keyword'].value;
if (!state.countries.containsKey(country)) {
state.countries[country] = 0;
}
state.countries[country]++;
""",
"combine_script": """
return state.countries;
""",
"reduce_script": """
// gather all present countries
def countries = new HashSet();
for (state in states) {
countries.addAll(state.keySet());
}
// figure out which country in params is not present in countries
def missing = [];
for (country in params.countries) {
if (!countries.contains(country)) {
missing.add(country);
}
}
return missing;
""",
"params": {
"countries": ["USA", "Dubai", "Russia"]
}
}
}
}
}
In this case, the output is going to be
"aggregations" : {
"missing_countries" : {
"value" : [
"Russia"
]
}
}

- 207,596
- 13
- 358
- 360
-
How do I do the set arithmetic? all I have is kibana query dashboard only. Therefore all I have to achieve using a single query only. Sorry, I am completely new to the Elasticsearch world. – vinay Nov 07 '20 at 09:51
-
it is giving syntax error at "map_script": """ . Says expected ' , ' instead of ' " ' – vinay Nov 10 '20 at 10:48
-
Where are you running this ? In Kibana Dev Tools? – Val Nov 10 '20 at 10:48
-
If not, you need to replace `"""` by `"` and inline the scripts on a single line between double quotes. Only Kibana "understands" the triple quotes `"""` – Val Nov 10 '20 at 10:49
-
I am using the extraction query editor inside kibana alerts. Not even " helping. Let me try some more ways. – vinay Nov 10 '20 at 11:52
-
I'm not sure what the "extraction query editor inside kibana alerts" is – Val Nov 10 '20 at 11:56
-
There is an alerting dashboard inside open distro kibana. Where you can create monitors and alerts to get notified for any abnormalities. – vinay Nov 10 '20 at 12:13
-
Ok, then it's definitely not a query you can reuse there I'm afraid... Your question didn't really mention that point – Val Nov 10 '20 at 12:13