
I'm working with Prometheus alerts, and I would like to dynamically add a 'team' label to all of my alerts based on a regex pattern. I have an example alert:

expr: label_replace(label_replace(increase(kube_pod_container_status_restarts_total{job="kube-state-metrics",namespace=~".*",pod!~"app-test-.*"}[30m]) > 2, "team", "data", "container", ".*test.*"), "team", "data", "pod", ".*test.*")

This example alert adds the 'team' label with the value 'data' for metrics matching the regex pattern `.*test.*` in the 'container' and 'pod' labels.

However, I want to apply this logic to all of my alerts, not just this specific one. Is there a way to do this dynamically in Prometheus or Alertmanager? Any guidance would be appreciated.

I tried using the label_replace function in the alert expression, and it worked as expected for the specific alert above. I was hoping to find a way to apply this label addition to all of my alerts without modifying each alert expression individually.


TomerA
  • Are you using this to later route those alerts based on introduced labels? – markalex Apr 09 '23 at 08:10
  • Yes, I'm using the introduced labels to route alerts to the appropriate teams based on the team tag. I want to ensure that the correct team is notified when a specific alert is triggered. I'm looking for a solution that allows me to dynamically add the team tag to all alerts using regex. – TomerA Apr 09 '23 at 08:11
  • My plan is to use the labels with PagerDuty to help route the alerts to the appropriate teams. I would like to dynamically add the 'team' label using regex so that alerts can be correctly routed to the respective teams based on the label. The main goal is to have the 'team' label added to all alerts to facilitate this process. – TomerA Apr 09 '23 at 08:13
  • Well, in that case you shouldn't add labels, but rather use the matching capabilities of Alertmanager to route alerts to the appropriate channel (in your case, team). – markalex Apr 09 '23 at 08:16
  • @markalex Thank you for your suggestion. I understand that Alertmanager can be used to route alerts based on matching capabilities. In our case, we want to send the alerts from Alertmanager to PagerDuty, and then route them within PagerDuty based on the 'team' label. Is there a way to dynamically add the 'team' label to all alerts using regex before sending them to PagerDuty, or would you recommend another approach? – TomerA Apr 09 '23 at 08:21
  • I'm not sure about routing in PagerDuty, but if it's capable of routing based on labels (which your comment implies), then you can use the same approach as in my answer, but in PagerDuty. – markalex Apr 09 '23 at 08:41

1 Answer


AFAIK, there is no way to add labels to your alerts based on a condition without rewriting all of the rules.

The best solution for your exact question is to create separate alerts for each environment/team/condition and just add static labels.

Something along the lines of

  - alert: many_restarts_data
    expr: increase(kube_pod_container_status_restarts_total{job="kube-state-metrics",namespace=~".*",pod!~"app-test-.*", container=~".*test.*"}[30m]) > 2
    labels:
      team: data
    
  - alert: many_restarts_sre
    expr: increase(kube_pod_container_status_restarts_total{job="kube-state-metrics",namespace=~".*",pod!~"app-test-.*", container=~".*prod.*"}[30m]) > 2
    labels:
      team: sre

But it will require multiplying the number of alerts by the number of teams.

I would argue a far easier solution is to use the routing capabilities of Alertmanager (or of PagerDuty, if it provides similar functionality). This way you specify, in the Alertmanager configuration, which alerts with which labels should be routed to which team, and it works independently from the alert creation side.

    routes:
    - matchers:
        - container =~ ".*test.*"
        - severity =~ ".*test.*"
        - alertname =~ "my_alert_1|my_alert_2"
      receiver: team-data

    - matchers:
        - container =~ ".*prod.*"
        - severity =~ ".*prod.*"
        - alertname =~ "my_alert_1|my_alert_2"
      receiver: team-sre
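
The receivers referenced above would then point at each team's notification target. A minimal sketch, assuming PagerDuty Events API v2 integration keys (the `routing_key` values are placeholders for your own keys):

    receivers:
      - name: team-data
        pagerduty_configs:
          # placeholder: the data team's PagerDuty integration key
          - routing_key: <data-team-integration-key>
      - name: team-sre
        pagerduty_configs:
          # placeholder: the SRE team's PagerDuty integration key
          - routing_key: <sre-team-integration-key>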
markalex
  • Thanks for your suggestions. I'm curious if there's a way to add the team label at the Kubernetes level, so it's already associated with relevant metrics before processing by Alertmanager or PagerDuty. Can we include the team label within Kubernetes objects like pods or containers, so it's automatically included in metrics? This could help avoid duplicating alerts or relying on routing rules. Any insights are appreciated. – TomerA Apr 09 '23 at 08:44
  • I'm not aware of such a possibility through Kubernetes, but you can utilize [metrics relabeling](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config) to add a corresponding label to all (or any) metrics. Or maybe you could split your targets into different jobs and use just a job selector. @TomerA – markalex Apr 09 '23 at 08:57
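
A minimal sketch of that metric relabeling idea, assuming a kube-state-metrics scrape job (the job name, target address, and team value are placeholders):

    scrape_configs:
      - job_name: kube-state-metrics
        static_configs:
          - targets: ['kube-state-metrics:8080']   # placeholder target
        metric_relabel_configs:
          # With the default action (replace), team="data" is set only on
          # series whose pod label matches the regex; other series are untouched
          - source_labels: [pod]
            regex: '.*test.*'
            target_label: team
            replacement: data

Every alert built on these series then carries the team label automatically, so no per-rule label_replace is needed.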