
New to Prometheus Alerting!

I have a Prometheus Counter with multiple child metrics, each of which is incremented when its own specific condition occurs.

On a given day, this expression returns results like the ones shown further down:

Expression: floor(sum by (app_kubernetes_io_name, kubernetes_namespace, failures, owner_team) (increase(failure_stats_total[24h]))) > 0

This returns the increase of each child metric over the past 24 hours, summed per label combination and rounded down:


{app_kubernetes_io_name="consumer", failures="APP_FAILED", kubernetes_namespace="dev", owner_team="Team C"}
32
{app_kubernetes_io_name="consumer", failures="APP_TRANSFER_FAILED", kubernetes_namespace="dev", owner_team="Team C"}
10
{app_kubernetes_io_name="consumer", failures="DEVICE_FAILED", kubernetes_namespace="dev", owner_team="Team C"}
30
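
For context, here is a minimal sketch of how this expression might sit in an alerting rule file. The group name, alert name, severity label, and annotation text are hypothetical placeholders, and the `failures` label name is taken from the output above. As far as I understand, each series returned by the expression becomes its own alert instance, which is why I end up with several alerts instead of one summary:

```yaml
groups:
  - name: failure-summary            # hypothetical group name
    rules:
      - alert: DailyFailureSummary   # hypothetical alert name
        # Same expression as above, split across lines for readability
        expr: |
          floor(
            sum by (app_kubernetes_io_name, kubernetes_namespace, failures, owner_team) (
              increase(failure_stats_total[24h])
            )
          ) > 0
        labels:
          severity: info             # assumed label, not part of my real setup
        annotations:
          # One annotation per failure type; $value is the 24h count
          summary: "{{ $labels.failures }}: {{ $value }} failures in the last 24h"
```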

My question is: how do I fire a single Slack alert every 24 hours that summarizes all the failures that occurred over the past day, along with their respective counts?

I'm not sure whether group_by is the right choice here. Please advise.
