I'm monitoring container CPU usage with cAdvisor, using the following expression in Prometheus:
(sum(rate(container_cpu_usage_seconds_total[3m])) BY (instance, name) * 100) > 80
This alert fires constantly for one of my containers, because it really is using over 80% of CPU, but only on a single core. My host has multiple cores, and I would like to divide this percentage by the number of cores. I can see that cAdvisor exports a metric called machine_cpu_cores, which I thought would help, but unfortunately I can't get it to work. I've tried:
(sum(rate(container_cpu_usage_seconds_total[3m])) BY (instance, name) / sum(machine_cpu_cores) * 100) > 0
Unfortunately, this returns an empty query result. Also, I don't have any resource limits set on the containers. What am I doing wrong here?
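
To make the goal concrete, here is a minimal sketch of the arithmetic I'm after, written in Python rather than PromQL. The function name and the numbers are hypothetical and only for illustration; they are not something cAdvisor or Prometheus provides. It assumes the per-container rate is expressed in CPU-seconds per second (so 1.0 means one fully busy core) and that the core count is what machine_cpu_cores reports for the host.

```python
def cpu_percent_of_host(cpu_seconds_per_second: float, cores: int) -> float:
    """Express a container's CPU rate as a percentage of total host capacity.

    cpu_seconds_per_second: per-container rate, where 1.0 equals one busy core
    cores: number of cores on the host (hypothetical input, e.g. from machine_cpu_cores)
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return cpu_seconds_per_second / cores * 100.0


if __name__ == "__main__":
    # Made-up example: a container using 0.9 of one core on a 4-core host.
    single_core_view = 0.9 * 100.0                 # what my current alert sees
    whole_host_view = cpu_percent_of_host(0.9, 4)  # what I would like to alert on
    print(single_core_view, whole_host_view)
```

With these made-up numbers the container crosses the 80% threshold when measured against one core, but sits at roughly 22.5% when measured against the whole host, which is the value I want the alert to compare against 80.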