I have the following Elasticsearch query:
    {
      "from": 0,
      "sort": [
        "_score"
      ],
      "fields": [
        "id",
        "title",
        "text"
      ],
      "query": {
        "query_string": {
          "fields": [
            "title",
            "text"
          ],
          "query": "(\"green socks\" OR \"red socks\") AND NOT (\"yellow\" OR \"blue\")"
        }
      },
      "size": 100
    }
This works fine and matches a set of around 80,000 documents.
I would like to calculate the following over this set of 80,000 documents (i.e. the set of documents that matches "query": "(\"green socks\" OR \"red socks\") AND NOT (\"yellow\" OR \"blue\")"):
- For "green socks", the number of documents within the 80,000 that contain "green socks" at least once.
- For "red socks", the number of documents within the 80,000 that contain "red socks" at least once.
- And so on, for every other word/phrase on the "left-hand" side of the above query string (the part before AND NOT).
- There are actually about 50 to 100 such words/phrases in each query string I'm running, so I need 50 to 100 counts like the "red socks" one, not just two.
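To make the desired output concrete, here is a toy sketch in plain Python of the counts I'm after. It is not Elasticsearch code: the documents are made up, the helper name `phrase_doc_counts` is hypothetical, and simple case-insensitive substring matching stands in for whatever analysis the real index applies.

```python
# Toy illustration only: made-up documents, plain Python, not Elasticsearch.
def phrase_doc_counts(docs, phrases, fields=("title", "text")):
    """For each phrase, count the documents that contain it at least once."""
    counts = {}
    for phrase in phrases:
        needle = phrase.lower()
        # A document counts once, however many times the phrase occurs in it.
        counts[phrase] = sum(
            1
            for doc in docs
            if any(needle in doc.get(field, "").lower() for field in fields)
        )
    return counts


# Hypothetical stand-in for the ~80,000 documents matched by the query.
matched_docs = [
    {"id": 1, "title": "Green socks", "text": "Warm green socks for winter."},
    {"id": 2, "title": "Sock sale", "text": "Red socks and green socks."},
    {"id": 3, "title": "Red socks", "text": "Red socks, red socks, red socks."},
    {"id": 4, "title": "Wool", "text": "Thick green socks."},
]

counts = phrase_doc_counts(matched_docs, ["green socks", "red socks"])
print(counts)  # {'green socks': 3, 'red socks': 2}
```

So the result I want is one document count per phrase, computed only over the documents the original query matched.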
This feels like an aggregation query, but I just can't see it.
Any help very gratefully received.
Thanks,
R