
I wrote a small bash script to use with nagios to check if nrpe is running.

The check works locally when run as root, but fails from the monitoring host.

On the host I'm trying to monitor, I have this line in my nrpe.cfg:

 command[check_nrpe]=/usr/lib64/nagios/plugins/check_nrpe.sh

And made sure the check script is owned by the nagios user:

[root@ops:~] #ls -l /usr/lib64/nagios/plugins/check_nrpe.sh
-rwxr-xr-x. 1 nagios nagios 203 Jun  9 20:29 /usr/lib64/nagios/plugins/check_nrpe.sh

And if I run the script as the root user I get the correct result:

[root@ops:~] #/usr/lib64/nagios/plugins/check_nrpe.sh
OK: NRPE is running with pid: 24538
24538

But when I run it from the nagios host the check produces the opposite result:

[root@monitor1:~] #/usr/local/nagios/libexec/check_nrpe -H ops.mydomain.com -c check_nrpe
CRITICAL: NRPE is **NOT** Running

If I go back to the host I'm trying to monitor and become the nagios user I get the same incorrect result as I do on the nagios host.

[root@ops:~] #su - nagios
Last login: Tue Jun  9 20:43:42 UTC 2015 on pts/3

-bash-4.2$ /usr/lib64/nagios/plugins/check_nrpe.sh
CRITICAL: NRPE is **NOT** Running

If I give the nagios user sudo access to that script, I can get the correct result as the nagios user on the local host.

In /etc/sudoers I gave the nagios user access to the command and tried to disable the tty requirement with:

    nagios ALL=(ALL)    NOPASSWD: /usr/lib64/nagios/plugins/check_nrpe.sh    !requiretty

And now if I become the nagios user on the local host and use sudo the check produces the correct result.

[root@ops:~] #su - nagios
Last login: Tue Jun  9 23:37:09 UTC 2015 on pts/0

-bash-4.2$ sudo /usr/lib64/nagios/plugins/check_nrpe.sh
OK: NRPE is running with pid: 24538
24538

I then edited the NRPE config file on the local host to put sudo in front of the command. nrpe.cfg now contains:

[root@ops:~] #grep check_nrpe /etc/nagios/nrpe.cfg
command[check_nrpe]=/bin/sudo /usr/lib64/nagios/plugins/check_nrpe.sh

And restarted the nrpe service:

[root@ops:~] #systemctl restart nrpe
[root@ops:~] #lsof -i :5666
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
nrpe    6137 nrpe    4u  IPv4 493404      0t0  TCP *:5666 (LISTEN)
nrpe    6137 nrpe    5u  IPv6 493405      0t0  TCP *:5666 (LISTEN)
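
A side note on the `lsof` output above: the listener is owned by the `nrpe` account, not `nagios`, so a sudoers rule written for `nagios` may never be consulted when the daemon itself calls `/bin/sudo`. Below is a minimal sketch of a rule for that account, assuming the daemon really does run its commands as `nrpe` (worth confirming against `nrpe_user` in nrpe.cfg). `write_nrpe_sudoers` is a hypothetical helper name, and the `Defaults:` line is where sudoers normally expects `requiretty` to be toggled.

```shell
#!/bin/bash
# Hypothetical helper: write a sudoers drop-in for the account NRPE runs as.
# Assumption: that account is "nrpe" (taken from the USER column of lsof).
write_nrpe_sudoers() {
  local target=$1 user=${2:-nrpe}
  cat > "$target" <<EOF
# Let the NRPE account run the check without a tty or a password.
Defaults:${user} !requiretty
${user} ALL=(root) NOPASSWD: /usr/lib64/nagios/plugins/check_nrpe.sh
EOF
  # sudo expects restrictive permissions on files it reads.
  chmod 0440 "$target"
}
```

As root this might be used as `write_nrpe_sudoers /etc/sudoers.d/nrpe && visudo -cf /etc/sudoers.d/nrpe`, so that a syntax error is caught before sudo ever reads the file.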

But when I go back to the nagios host and run the check again, I get an output error:

[root@monitor1:~] #/usr/local/nagios/libexec/check_nrpe -H ops.jokefire.com -c check_nrpe
NRPE: Unable to read output

This is the contents of my check nrpe script:

[root@ops:~] #cat /usr/lib64/nagios/plugins/check_nrpe.sh
#!/bin/bash

pid=$(lsof -i :5666 | awk '{print $2}' | grep -i -v pid)

if [[ $pid ]]
then
  echo "OK: NRPE is running with pid: $pid"
  exit 0
else
  echo "CRITICAL: NRPE is **NOT** Running"
  exit 2
fi
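
One likely reason the script above only works as root: without root, `lsof` on Linux can generally only report sockets that belong to the calling user's own processes, so `$pid` comes back empty for the nagios user. A rough sketch of a variant that avoids needing root is shown here, assuming `ss` from iproute2 is installed. The function names `port_is_listening` and `check_nrpe_port` are made up for illustration, and this version reports the port rather than the pid.

```shell
#!/bin/bash
# Sketch: decide NRPE status from the list of listening TCP sockets.
# Assumption: `ss -ltn` is available and readable by an unprivileged user.

# Reads an `ss -ltn` style listing on stdin; succeeds if some LISTEN
# socket has a local address ending in ":<port>".
port_is_listening() {
  local port=$1
  awk -v port="$port" '
    $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit !found }
  '
}

# Prints a Nagios-style status line and returns the matching exit code.
check_nrpe_port() {
  local port=${1:-5666}
  if ss -ltn | port_is_listening "$port"; then
    echo "OK: NRPE is listening on port $port"
    return 0
  else
    echo "CRITICAL: NRPE is NOT listening on port $port"
    return 2
  fi
}
```

A plugin file built on this would end with something like `check_nrpe_port 5666; exit $?` so that NRPE sees the 0 or 2 status.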

HELP!! How do I get this check to return the correct result from the nagios host?

Thanks

user99201
  • check the path on sudo, or just use `sudo`, mine is `/usr/bin/sudo` – Grizly Jun 10 '15 at 02:20
  • I can't see anything obviously wrong with the script. Look in the nrpe log file on the monitored machine for errors. However, isn't the check pointless? If nrpe isn't running then the nagios server won't be able to contact the client to run the check. – Paul Haldane Jun 10 '15 at 06:26
  • why would you do this? check_nrpe with no -c arg... is how you check if NRPE is running – Keith Jun 10 '15 at 15:43
  • This is probably the sudo requiretty setting. But the whole question is pointless anyway... – Keith Jun 10 '15 at 15:44
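
To make the suggestion in the comments concrete: calling `check_nrpe` with only `-H` asks the remote daemon to identify itself, so a reply already shows that NRPE is up. A small sketch follows, where `nrpe_reachable` and the `CHECK_NRPE` override are hypothetical names and the default plugin path is the one used on monitor1 in the question.

```shell
#!/bin/bash
# Hypothetical wrapper: succeed if the NRPE daemon on a host answers at all.
# CHECK_NRPE can point at a different check_nrpe binary (assumed override).
nrpe_reachable() {
  local host=$1
  local check=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}
  # No -c argument: the daemon just reports itself instead of running a command.
  "$check" -H "$host"
}
```

For example, `nrpe_reachable ops.mydomain.com` would pass the plugin's own exit status straight through.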
