0

I configured prometheus alert manager, but he is not alerting when the CPU of one of my server goes to 99% of usage. This is the alert :

- alert: HostHighCpuLoad
  expr: avg(irate(node_cpu_seconds_total{mode="idle"}[1m]) * 100) < 30
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "High usage on {{ $labels.instance }}"
    description: "{{ $labels.instance }} has a average CPU idle (current value: {{ $value }}s)"

It looks like my expression, take the global average of all my servers, but i need to monitor this measure for every single server.

Someone already got this problem ?

1 Answers1

0
avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[1m]) * 100) < 30
bjoster
  • 4,805
  • 5
  • 25
  • 33