I've built a bunch of Munin plugins that monitor various back-end services. If the metrics dip below thresholds set in munin.conf, we're notified by email. However, if one of these services goes down completely, the plugin will fail and nobody gets notified!
I've followed the module-writing guide and added an exit code and message:
sys.stderr.write('Error connecting to %s: %s\n' % (name, e))
sys.exit(2)
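For context, here is a minimal sketch of how that failure path sits inside a plugin. The names `run_plugin`, `report_failure` and `fetch` are placeholders I made up for illustration, not the real plugin code, and I'm assuming the connection failure surfaces as an `OSError`:

```python
import sys


def report_failure(name, error, stream=sys.stderr):
    """Write the connection error to the given stream and return the exit code."""
    stream.write('Error connecting to %s: %s\n' % (name, error))
    return 2


def run_plugin(name, fetch, out=sys.stdout, err=sys.stderr):
    """Fetch one metric and print it in Munin's '<field>.value <n>' form.

    `fetch` is a hypothetical callable that talks to the back-end service
    and raises OSError when the service cannot be reached (an assumption).
    """
    try:
        value = fetch()
    except OSError as e:
        # Service is down: no value is printed, only the error message.
        return report_failure(name, e, err)
    out.write('%s.value %s\n' % (name, value))
    return 0
```

The plugin script would end with something like `sys.exit(run_plugin('backend', fetch_backend_metric))`, where `fetch_backend_metric` is again a made-up name. On failure no `.value` line is printed at all, so there is nothing for the thresholds to compare against.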
But that message only appears in the log, and nobody is watching the log.
Is there a way to make Munin alert on a complete plugin failure?