6

We have Prometheus running on a Windows Server box, and the WMI exporter on a separate box (the client). We are able to read the client's metrics in Prometheus. The requirement now is: the moment disk space usage reaches >= 90%, send an email alert, so that we can free up space with an automated or manual job.

Could you please help with how to configure an alert for disk space > 90%?


  • Are you asking how to set up alerts in general, or do you just need suggestions for what query to use as the basis for the alerting rule? – wbh1 Oct 17 '18 at 13:01

4 Answers

5

You might want to alert based on if it's going to fill up, not based on how full it is:

- name: node.rules
  rules:
  - alert: DiskWillFillIn4Hours
    expr: predict_linear(node_filesystem_free{job="node"}[1h], 4 * 3600) < 0
    for: 5m
    labels:
      severity: page

https://www.robustperception.io/reduce-noise-from-disk-space-alerts
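The idea behind `predict_linear()` is a least-squares line fitted over the samples in the range, extrapolated forward. A rough Python sketch of that math (the sample data below is made up, and note that real Prometheus extrapolates from the evaluation timestamp, which this sketch approximates with the last sample's timestamp):

```python
# Sketch of the extrapolation predict_linear() performs: fit a
# least-squares line to (timestamp, free_bytes) samples, then
# project the line forward by the requested number of seconds.

def predict_linear(samples, seconds_ahead):
    """samples: list of (t_seconds, value); returns the projected value."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = cov / var
    intercept = mean_v - slope * mean_t
    # Approximation: project from the last sample's timestamp.
    t_eval = samples[-1][0]
    return slope * (t_eval + seconds_ahead) + intercept

# Made-up disk losing 1 GiB every 600 s, 10 GiB free at the last sample:
samples = [(i * 600, (20 - i) * 2**30) for i in range(11)]
print(predict_linear(samples, 4 * 3600) < 0)  # True: disk empty within 4h
```

The alert fires because the projected free space four hours out is negative, even though the disk is nowhere near 90% full yet.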

ptman

  • Use of `for: 5m` is good practice, as it avoids false positives where disk usage spikes but then levels off, e.g. backup creation/clear-down – Brom558 Aug 25 '20 at 09:05
3

Assuming you are using https://github.com/martinlindhe/wmi_exporter/blob/master/docs/collector.logical_disk.md, you could use something along these lines for > 90% usage:

  - alert: DiskSpaceUsage
    expr: 100.0 - 100 * (wmi_logical_disk_free_bytes / wmi_logical_disk_size_bytes) > 90
    for: 10m
    labels:
      severity: high
    annotations:
      summary: "Disk Space Usage (instance {{ $labels.instance }})"
      description: "Disk space on drive is used more than 90%\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

There are other examples in the wmi_exporter repo. For the default node_exporter metrics (not sure if those are available on Windows) it would be:

- alert: DiskSpace10PercentFree
  expr: 100 - (100 * node_filesystem_avail_bytes / node_filesystem_size_bytes) > 90
  labels:
    severity: moderate
  annotations:
    summary: "Instance {{ $labels.instance }} is low on disk space"
    description: "Disk space on {{ $labels.instance }} is over {{ $value }}% used."

(Note that alert names must match `[a-zA-Z_:][a-zA-Z0-9_:]*`, so a `%` in the name will fail to load.)
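For rules like the ones above to take effect, the rule file must be referenced from `prometheus.yml`, and Prometheus must be pointed at an Alertmanager. A minimal sketch; the file name and target address below are placeholders:

```yaml
# prometheus.yml (fragment) -- file name and target are placeholders
rule_files:
  - "disk_alerts.yml"        # the file containing the alert rules above

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]   # default Alertmanager port
```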
Karsten
2

To send email notifications based on alerts, you need to set up Alertmanager alongside Prometheus. Here is the guide on how to do that: https://github.com/prometheus/alertmanager
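For the email part specifically, Alertmanager needs an email receiver. A minimal sketch, assuming an SMTP relay is available; every address and credential below is a placeholder:

```yaml
# alertmanager.yml (sketch) -- all values here are placeholders
route:
  receiver: email-ops

receivers:
  - name: email-ops
    email_configs:
      - to: "ops-team@example.com"
        from: "alertmanager@example.com"
        smarthost: "smtp.example.com:587"
        auth_username: "alertmanager@example.com"
        auth_password: "changeme"
```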

You can also configure the alert rules. I am using node_exporter to fetch node metrics, with the following rule:

- alert: DiskSpace10PercentFree
  expr: node_exporter:node_filesystem_free:fs_used_percents >= 90
  labels:
    severity: moderate
  annotations:
    summary: "Instance {{ $labels.instance }} is low on disk space"
    description: "Disk on {{ $labels.instance }} is {{ $value }}% used."

You can configure the above rule according to WMI exporter and you will be good to go. Hope this helps.

Prafull Ladha

  • Presumably you are using some sort of custom metric here; `node_exporter:node_filesystem_free:fs_used_percents` is not a core metric, and you give no explanation of it. Perhaps you could? – David Jan 31 '20 at 15:27
0

https://docs.leanxcale.com/leanxcale/1.5/installation_admin/monitoring/index.html#alerting-rules-recording-rules

groups:
  - name: recording_rules
    interval: 5s
    rules:
      - record: node_exporter:node_filesystem_free:fs_used_percents
        expr: 100 - 100 * ( node_filesystem_free{mountpoint="/"} / node_filesystem_size{mountpoint="/"} )

  - name: alerting_rules
    rules:
      - alert: DiskSpace10PercentFree
        expr: node_exporter:node_filesystem_free:fs_used_percents >= 90
        # Note that this expression evaluates the metric defined in the recording rule above.
        labels:
          severity: moderate
        annotations:
          summary: "Instance {{ $labels.instance }} is low on disk space"
          description: "Disk on {{ $labels.instance }} is {{ $value }}% used."
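A quick sanity check of the percent-used arithmetic in the recording rule (`100 - 100 * free / size`), with made-up byte counts:

```python
# Verify the percent-used expression from the recording rule above.
def fs_used_percent(free_bytes, size_bytes):
    return 100 - 100 * free_bytes / size_bytes

print(fs_used_percent(5 * 2**30, 100 * 2**30))   # 95.0 -> alert fires (>= 90)
print(fs_used_percent(50 * 2**30, 100 * 2**30))  # 50.0 -> no alert
```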
  • Your answer could be improved by adding more information on what the code does and how it helps the OP. – Tyler2P Apr 22 '22 at 13:47