I'm monitoring several hosts using Nagios. This works fine when I use "normal" checks that are executed on the monitoring host (say, check_http). However, I'm having trouble with NRPE-based checks, which are executed through the NRPE service on the monitored host instead.
I have declared my custom commands in the NRPE configuration of the monitored hosts, e.g.:
command[check_memory]=/usr/lib/nagios/plugins/check_memory -w 20% -c 10% -u G
I've then created the corresponding Nagios commands in the Nagios configuration on the monitoring host:
define command {
    command_name my_check_nrpe
    command_line /usr/lib/nagios/plugins/check_nrpe -H '$HOSTALIAS$' -c '$ARG1$'
}
define service {
    use my-service
    service_description Free memory
    check_command my_check_nrpe!check_memory
    check_interval 15
}
These checks work fine when I run them manually on the monitoring host as the nagios user (which the nagios service runs under):
nagios@monitor:~$ /usr/lib/nagios/plugins/check_nrpe -H 'my.target.host' -c 'check_memory'
MEMORY OK - 0G free | free=956080128b;419844915.2:;209922457.6:
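The manual run above only shows the plugin's text output, whereas Nagios derives the service state from the plugin's exit status (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN by the usual plugin convention). A minimal sketch of a wrapper that prints both might look like this; `state_name` and `run_check` are hypothetical helper names, not part of Nagios or NRPE:

```shell
# Hypothetical helper: map a plugin exit status to the conventional state name.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "INVALID($1)" ;;
    esac
}

# Hypothetical helper: run a plugin command, then print the derived state
# followed by whatever the plugin wrote to stdout/stderr.
run_check() {
    output=$("$@" 2>&1)
    status=$?
    printf '%s: %s\n' "$(state_name "$status")" "$output"
    return "$status"
}

# Usage on the monitoring host, as the nagios user:
# run_check /usr/lib/nagios/plugins/check_nrpe -H 'my.target.host' -c 'check_memory'
```

Since the command definition passes `$HOSTALIAS$` to `-H`, it may be worth repeating the run with whatever value that macro expands to for this host, in case it differs from `my.target.host`.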
However, I continuously receive email warnings from Nagios about the service:
***** Nagios *****
Notification Type: PROBLEM
Service: Free memory
Host: my.target.host
Address: XXX.XXX.XXX.XXX
State: WARNING
Date/Time: $
Additional Info:
$
I haven't managed to get any more details about the warnings. The Nagios logs on the monitoring host only show that the warnings were sent:
[1500623961] SERVICE NOTIFICATION: my-mailbox;my.target.host;Free memory;WARNING;notify-by-email;(null)
[1500627561] SERVICE NOTIFICATION: my-mailbox;my.target.host;Free memory;WARNING;notify-by-email;(null)
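To make the log entries above easier to read, here is a small sketch that splits a SERVICE NOTIFICATION line into labelled fields. It assumes the usual field order (contact; host; service; state; notification command; plugin output), and `parse_notification` is a hypothetical helper name. If that assumption holds, the trailing `(null)` is the plugin output field, which would suggest Nagios had no output recorded for the check.

```shell
# Hypothetical helper: split one SERVICE NOTIFICATION log line into fields.
# Assumes the field order contact;host;service;state;command;output and
# that none of the first five fields contains a ';'.
parse_notification() {
    printf '%s\n' "$1" |
        sed 's/^\[[0-9]*\] SERVICE NOTIFICATION: //' |
        awk -F';' '{
            printf "contact=%s host=%s service=%s state=%s command=%s output=%s\n",
                   $1, $2, $3, $4, $5, $6
        }'
}

# Usage:
# parse_notification '[1500623961] SERVICE NOTIFICATION: my-mailbox;my.target.host;Free memory;WARNING;notify-by-email;(null)'
```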
I've also activated maximum debugging output for Nagios:
debug_level=-1
debug_verbosity=2
However, /var/lib/nagios3/nagios.debug doesn't contain anything of interest:
[1500630464.420189] [064.1] [pid=21171] Making callbacks (type 9)...
[1500630464.420243] [064.1] [pid=21171] Making callbacks (type 9)...
[1500630464.420308] [064.1] [pid=21171] Making callbacks (type 9)...
[1500630464.420389] [064.1] [pid=21171] Making callbacks (type 9)...
[1500630464.421086] [064.1] [pid=21171] Making callbacks (type 7)...
[1500630464.421767] [064.1] [pid=21174] Making callbacks (type 9)...
Similarly, I've enabled debugging output for the NRPE service on the monitored hosts (debug=1), but the NRPE logs only tell me that my check_memory command has been added successfully.
I'm running NRPE 3.0.1-3 and Nagios 3.5.1.
How can I solve this issue or gather more information about the problem?