I configured prometheus alert manager, but he is not alerting when the CPU of one of my server goes to 99% of usage. This is the alert :
- alert: HostHighCpuLoad
expr: avg(irate(node_cpu_seconds_total{mode="idle"}[1m]) * 100) < 30
for: 1m
labels:
severity: warning
annotations:
summary: "High usage on {{ $labels.instance }}"
description: "{{ $labels.instance }} has a average CPU idle (current value: {{ $value }}s)"
It looks like my expression, take the global average of all my servers, but i need to monitor this measure for every single server.
Someone already got this problem ?