Configuring nagios notification settings to be very frequent

Question

I've set up a Proxmox VE cluster with three nodes. Each node has a number of VMs running on it. I'm using the PVE Monitor Plugin to set up the hosts and services, which works fine.

My issue is that Nagios's email-sending behavior is odd. Ideally, I would like a check once per minute for both the nodes and all services running on each node.

My configuration file looks like this:

# Define the cluster itself as a host.
# The command check_pve_cluster_nodes gives us info
# on the members' cluster state.
define host {
        host_name pve-cluster
        max_check_attempts 10
        check_command check_pve_cluster_nodes
        check_interval 1
        contact_groups admins
        notifications_enabled 1
}

# Define OpenVZ, Qemu and storages as services of the cluster
define service {
        use generic-service
        host_name pve-cluster
        service_description OpenVZ VMs
        check_command check_pve_cluster_openvz
        check_interval 1
        contact_groups admins
        notifications_enabled 1
}

define service {
        use generic-service
        host_name pve-cluster
        service_description Qemu VMs
        check_command check_pve_cluster_qemu
        check_interval 1
        contact_groups admins
        notifications_enabled 1
}

define service {
        use generic-service
        host_name pve-cluster
        service_description Storages
        check_command check_pve_cluster_storage
        check_interval 1
        contact_groups admins
        notifications_enabled 1
}

I haven't changed the time unit settings, so those should be once-per-minute checks. The Nagios Web UI is showing that a host is offline, but email notifications are sent only a couple of minutes later. Furthermore, the email content is missing the most important piece of information - which node/service exactly is in critical state:

Node down

***** Nagios *****

Notification Type: PROBLEM
Host: pve-cluster
State: DOWN
Address: pve-cluster
Info: NODES CRITICAL  2 / 3 working nodes

Date/Time: Fri Mar 6 10:48:25 CET 2015

VM down

***** Nagios *****

Notification Type: PROBLEM

Service: Qemu VMs
Host: pve-cluster
Address: pve-cluster
State: CRITICAL

Date/Time: Fri Mar 6 10:40:44 CET 2015

Additional Info:

QEMU CRITICAL 2 / 3 working VMs

How can I set up the configuration so that hosts and services (i.e. VMs) are checked at one-minute intervals? Ideally, repeat notifications for a problem that persists should then be sent at 15-minute intervals.

Is this even the best workflow? Or is there a better way to schedule notifications and acknowledge them?

Taz · Accepted Answer · 2015-03-06T10:22:59.560

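Nagios only sends emails once a host or service has entered a 'hard' state. To answer your question at a basic level: a hard state is reached once the host or service has returned a non-OK result for the number of consecutive checks specified by max_check_attempts. While it is still in a 'soft' state, re-checks are scheduled at retry_interval rather than check_interval. Your host definition sets max_check_attempts to 10, so the cluster host has to fail ten checks in a row before the first email goes out. The services take their value from the generic-service template, where it is often 3 or 4.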

Info on soft/hard states: http://nagios.sourceforge.net/docs/3_0/statetypes.html

Info on max_check_attempts: http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html
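To get one-minute checks, a quick transition to a hard state, and a reminder every 15 minutes, the relevant directives are check_interval, retry_interval, max_check_attempts and notification_interval. The following is a minimal sketch, assuming the default interval_length of 60 seconds. The values 3 and 15 are illustrative choices, not something taken from your setup, and any other directives your host needs are assumed to be defined elsewhere:

```cfg
# Sketch only: merge these directives into your existing definitions.
define host {
        host_name               pve-cluster
        check_command           check_pve_cluster_nodes
        contact_groups          admins
        check_interval          1     ; regular check every minute
        retry_interval          1     ; re-check every minute while in a soft state
        max_check_attempts      3     ; hard state (first email) on the 3rd failed check
        notification_interval   15    ; repeat the email every 15 minutes
        notifications_enabled   1
}

define service {
        use                     generic-service
        host_name               pve-cluster
        service_description     Qemu VMs
        check_command           check_pve_cluster_qemu
        contact_groups          admins
        check_interval          1
        retry_interval          1
        max_check_attempts      3
        notification_interval   15
        notifications_enabled   1
}
```

With these values the first email should go out roughly two minutes after the first failed check, then repeat every 15 minutes until the problem is resolved or acknowledged in the web UI. The same four directives would apply to the OpenVZ and Storages services.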

It looks like the plugin is definitely intended to return those details, but for whatever reason it isn't doing so. Unfortunately I don't have an environment to test this in, so I may have to leave that part of the question open.

Relevant sections of the Perl:

print "NODES $rstatus{$statusScore}  $workingNodes / " .
          scalar(@monitoredNodes) . " working nodes" . $br . $reportSummary;

print "STORAGE $rstatus{$statusScore} $workingStorages / " .
          scalar(@monitoredStorages) . " working storages" . $br . $reportSummary;

print "OPENVZ $rstatus{$statusScore} $workingVms / " .
          scalar(@monitoredOpenvz) . " working VMs" . $br . $reportSummary;

print "QEMU $rstatus{$statusScore} $workingVms / " .
          scalar(@monitoredQemus) . " working VMs" . $br .
          $reportSummary;

$reportSummary is populated with details of the problem in sections higher up in the code, but it doesn't seem to be returned in your case.
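One possible explanation, which is an assumption and not something I have verified: if $br expands to a newline, $reportSummary ends up on the second and later lines of the plugin output. Nagios 3 exposes only the first line as $HOSTOUTPUT$ / $SERVICEOUTPUT$ and puts the remaining lines into $LONGHOSTOUTPUT$ / $LONGSERVICEOUTPUT$, which the stock email commands do not include. Below is a sketch of a service notification command that adds the long output. The command name notify-service-by-email-long is hypothetical, and the paths to printf and mail may differ on your system:

```cfg
# Sketch only: a variant of the stock notify-service-by-email command
# that appends $LONGSERVICEOUTPUT$ after the first line of plugin output.
define command {
        command_name    notify-service-by-email-long
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n$LONGSERVICEOUTPUT$\n" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
}
```

To try it, reference the command from the contact's service_notification_commands directive. A host variant would use $HOSTOUTPUT$ and $LONGHOSTOUTPUT$ in the same way.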
