I want to create an alert in Kibana using an Elasticsearch query. I'm using the Open Distro alerting feature. I want to check whether all of the values of the cpu.pct field in the last 10 minutes are greater than 50, and raise an alert if they are.

{
"size": 500,
"query": {
    "bool": {
        "filter": [
            {
                "match_all": {
                    "boost": 1
                }
            },
            {
                "match_phrase": {
                    "client.id": {
                        "query": "42",
                        "slop": 0,
                        "zero_terms_query": "NONE",
                        "boost": 1
                    }
                }
            },
            {
                "range": {
                    "cpu.pct": {
                        "from": 10,
                        "to": null,
                        "include_lower": true,
                        "include_upper": true,
                        "boost": 1
                    }
                }
            },
            {
                "range": {
                    "@timestamp": {
                        "from": "{{period_end}}||-5m",
                        "to": "{{period_end}}",
                        "include_lower": true,
                        "include_upper": true,
                        "format": "epoch_millis",
                        "boost": 1
                    }
                }
            }
        ],
        "adjust_pure_negative": true,
        "boost": 1
    }
},
"aggregations": {
    "2": {
        "terms": {
            "field": "client.name.keyword",
            "size": 5,
            "min_doc_count": 1,
            "shard_min_doc_count": 0,
            "show_term_doc_count_error": false,
            "order": {
                "_key": "desc"
            }
        },
        "aggregations": {
            "3": {
                "terms": {
                    "field": "component.name",
                    "size": 1000,
                    "min_doc_count": 1,
                    "shard_min_doc_count": 0,
                    "show_term_doc_count_error": false,
                    "order": [
                        {
                            "1": "desc"
                        },
                        {
                            "_key": "asc"
                        }
                    ]
                },
                "aggregations": {
                    "1": {
                        "avg": {
                            "field": "cpu.pct"
                        }
                    }
                }
            }
        }
    }
}
}

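The query above calculates the average, but that gives the wrong result for my case: both of the cases below have the same average, yet only the second one should raise an alert.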

Negative Case : Values (100, 100, 100, 100, 100, 100, 0, 0, 0, 0) | Alert Raised : No (Avg : 60)

Positive Case : Values (60, 60, 60, 60, 60, 60, 60, 60, 60, 60) | Alert Raised : Yes (Avg : 60)

How can I check against all values?

amat_coder

1 Answer

I'm not sure which application you are using to trigger the alert. One way to solve your case is to use two filter aggregations:

  1. totalInLast10Min : the total number of docs indexed in the last 10 minutes.
  2. totalInLast10MinAboveTh : the number of docs indexed in the last 10 minutes whose field value is above the threshold.

If totalInLast10Min == totalInLast10MinAboveTh then trigger alert.
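To see why comparing the two counts is equivalent to checking every value, here is a minimal Python sketch run against the two cases from the question. The helper names `counts` and `should_alert` are made up for illustration, and the guard against an empty window is an assumption on my part rather than something the question asks for.

```python
def counts(values, threshold):
    """Return (total, above), mimicking what the two filter aggregations count."""
    total = len(values)
    above = sum(1 for v in values if v > threshold)
    return total, above


def should_alert(values, threshold):
    """Alert only when every value in the window is above the threshold."""
    total, above = counts(values, threshold)
    # Assumption: an empty window should not raise an alert.
    return total > 0 and total == above


# The two cases from the question; both have an average of 60.
negative = [100] * 6 + [0] * 4
positive = [60] * 10

print(should_alert(negative, 50))  # False: only 6 of 10 values are above 50
print(should_alert(positive, 50))  # True: all 10 values are above 50
```

The average cannot tell these two cases apart, while the count comparison can.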

For example:

Create index

PUT test
{
  "mappings": {
    "properties": {
      "timestamp": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  }
}

Add some docs

POST test/_doc
{"cpu":20,"timestamp":"2020-08-18 20:20:00"}

POST test/_doc
{"cpu":100,"timestamp":"2020-08-18 20:21:00"}

POST test/_doc
{"cpu":90,"timestamp":"2020-08-18 20:29:00"}

Query:

GET test/_search
{
  "size": 0,
  "aggs": {
    "totalInLast10Min": {
      "filter": {
        "range": {
          "timestamp": {
            "gte": "2020-08-18 20:20:00"
          }
        }
      }
    },
    "totalInLast10MinAboveTh": {
      "filter": {
        "bool": {
          "must": [
            {
              "range": {
                "timestamp": {
                  "gte": "2020-08-18 20:20:00"
                }
              }
            },
            {
              "range": {
                "cpu": {
                  "gte": 80
                }
              }
            }
          ]
        }
      }
    }
  }
}
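The query above uses a fixed timestamp so that it matches the sample docs. For a real alert you would probably want a relative window instead. Below is a rough Python sketch that builds the same request body with Elasticsearch date math (`now-10m`). The function name `build_all_above_query` and its parameters are hypothetical, and the field names and threshold must be adapted to your own index (for example `cpu.pct` and `@timestamp` in the question).

```python
import json


def build_all_above_query(value_field, threshold, time_field="timestamp", window="now-10m"):
    """Build a search body with the two filter aggregations described above."""
    # Docs inside the time window (date math: "now-10m" means the last 10 minutes).
    in_window = {"range": {time_field: {"gte": window}}}
    # Docs whose value is at or above the threshold; use "gt" for a strict comparison.
    above = {"range": {value_field: {"gte": threshold}}}
    return {
        "size": 0,
        "aggs": {
            "totalInLast10Min": {"filter": in_window},
            "totalInLast10MinAboveTh": {
                "filter": {"bool": {"must": [in_window, above]}}
            },
        },
    }


body = build_all_above_query("cpu", 80)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of the `_search` request.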

Sample result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "totalInLast10MinAboveTh" : {
      "meta" : { },
      "doc_count" : 2
    },
    "totalInLast10Min" : {
      "meta" : { },
      "doc_count" : 3
    }
  }
}

Based on the counts of the two aggregations, you can write the condition that decides when to trigger the alert.
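As an illustration of that condition, here is a small Python sketch that reads the two counts from a search response shaped like the sample result. The function name `should_trigger` is made up for this example. In Open Distro alerting the same comparison would go into the trigger condition, which as far as I know is a Painless script that reads the response from `ctx.results[0]`, so treat this only as a sketch of the logic.

```python
def should_trigger(response):
    """Return True when every doc in the window is above the threshold."""
    aggs = response["aggregations"]
    total = aggs["totalInLast10Min"]["doc_count"]
    above = aggs["totalInLast10MinAboveTh"]["doc_count"]
    # Assumption: do not trigger when the window contains no docs at all.
    return total > 0 and total == above


# Aggregation part of the sample result above: 3 docs in total, 2 above the threshold.
sample = {
    "aggregations": {
        "totalInLast10MinAboveTh": {"doc_count": 2},
        "totalInLast10Min": {"doc_count": 3},
    }
}

print(should_trigger(sample))  # False, because one doc (cpu 20) is below the threshold
```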

Nishant