I'm trying to work with Elastic (5.6) and to find a way to retrieve the top documents per some category.
I have an index with the following kind of documents :
{
"@timestamp": "2018-03-22T00:31:00.004+01:00",
"statusInfo": {
"status": "OFFLINE",
"timestamp": 1521675034892
},
"name": "myServiceName",
"id": "xxxx",
"type": "Http",
"key": "key1",
"httpStatusCode": 200
}
}
What i'm trying to do with these, is retrieve the last document (@timestamp
-based) per name
(my categories), see if its statusInfo.status is OFFLINE
or UP
and fetch these results into the hits part of a response so I can put it in a Kibana count dashboard or somewhere else (a REST based tool I do not control and can't modify by myself).
Basically, I want to know how many of my services (name
) are OFFLINE (statusInfo.status
) in their last update (@timestamp
) for monitoring purposes.
I'm stuck at the "Get how many of my services" part.
My query so far:
GET actuator/_search
{
"size": 0,
"aggs": {
"name_agg": {
"terms": {
"field": "name.raw",
"size": 1000
},
"aggs": {
"last_document": {
"top_hits": {
"_source": ["@timestamp", "name", "statusInfo.status"],
"size": 1,
"sort": [
{
"@timestamp": {
"order": "desc"
}
}
]
}
}
}
}
},
"post_filter": {
"bool": {
"must_not": {
"term": {
"statusInfo.status.raw": "UP"
}
}
}
}
}
This provides the following response:
{
"all_the_meta":{...},
"hits": {
"total": 1234,
"max_score": 0,
"hits": []
},
"aggregations": {
"name_agg": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "myCategory1",
"doc_count": 225,
"last_document": {
"hits": {
"total": 225,
"max_score": null,
"hits": [
{
"_index": "myIndex",
"_type": "Http",
"_id": "dummy id",
"_score": null,
"_source": {
"@timestamp": "2018-04-06T00:06:00.005+02:00",
"statusInfo": {
"status": "UP"
},
"name": "myCategory1"
},
"sort": [
1522965960005
]
}
]
}
}
},
{other_buckets...}
]
}
}
}
Removing the size make the result contain ALL of the documents, which is not what I need, I only need each bucket content (every one contains one bucket). Removing the post filter does not appear to do much.
I think this would be feasible in ORACLE
SQL with a PARTITION BY OVER
clause, followed by a condition.
Does somebody know how this could be achieved ?