I have a server running backups from various servers concurrently and get a lot of "Write IO Wait time" warnings. Is it possible to silence the warning of a single plugin on a single host?

I think it is for diskstats_latency.avgwrwait.

Dax

1 Answer

You don't say how you have munin reporting, which makes this difficult to answer. I'll assume you have it sending emails directly, with something like this in munin.conf:

contact.dax.command mail -s "Munin notification" dax@example.com

If this is so, the only way I know to silence a single alert is to tell munin that it's not a problem, by raising the limits for that plugin on that host to values it will never reach, e.g.

[host.example.com]
    diskstats_latency.avgwrwait.warning    100000000000
    diskstats_latency.avgwrwait.critical   200000000000

The underlying problem is that munin is great at quantitative monitoring, but rather poor at notification handling. It lacks controls to temporarily silence particular alerts, notify through certain channels only at certain times of day, schedule downtime periods, or escalate to higher-tier contacts if problems continue. Most people I know who run munin (including me) have it reporting into NAGIOS, which has a vastly more sophisticated notification engine that can do all of the above.

If you have this setup, you can acknowledge the error in NAGIOS (silence notifications until the next time it returns to normal), or have your backup script schedule a period of downtime for the service that lasts about the length of the backups, or even have the script start by disabling notifications for that service, and re-enabling them as it finishes.
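As a rough sketch of the last two options, a backup script could write NAGIOS external commands to the command file. The command names (SCHEDULE_SVC_DOWNTIME, DISABLE_SVC_NOTIFICATIONS, ENABLE_SVC_NOTIFICATIONS) are standard NAGIOS external commands, but the command file path, the helper names and any host or service names you pass in are assumptions for illustration; check `command_file` in your nagios.cfg and the service description your munin checks actually use.

```python
import time

# Assumed location of the NAGIOS external command file (a named pipe);
# the real path is whatever command_file is set to in nagios.cfg.
NAGIOS_CMD_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def format_command(name, *args, now=None):
    """Build one external command line: "[timestamp] NAME;arg1;arg2...\\n"."""
    ts = int(time.time() if now is None else now)
    return "[{}] {}\n".format(ts, ";".join([name, *map(str, args)]))


def schedule_service_downtime(host, service, duration_s, author, comment, now=None):
    """Fixed downtime starting now and lasting duration_s seconds."""
    start = int(time.time() if now is None else now)
    end = start + duration_s
    # Arguments: host; service; start; end; fixed=1; trigger_id=0; duration; author; comment
    return format_command("SCHEDULE_SVC_DOWNTIME", host, service,
                          start, end, 1, 0, duration_s, author, comment, now=start)


def disable_service_notifications(host, service, now=None):
    return format_command("DISABLE_SVC_NOTIFICATIONS", host, service, now=now)


def enable_service_notifications(host, service, now=None):
    return format_command("ENABLE_SVC_NOTIFICATIONS", host, service, now=now)


def send(line, cmd_file=NAGIOS_CMD_FILE):
    """Write a command line to the command file; needs write access to it."""
    with open(cmd_file, "w") as f:
        f.write(line)
```

The backup script would then call something like `send(schedule_service_downtime("host.example.com", "<service description>", 3 * 3600, "backup", "nightly backups"))` before it starts, or wrap the backup in a disable/enable pair. Scheduled downtime has the advantage that notifications come back on their own even if the script dies partway through.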

MadHatter