
After some trial and error I managed to get the check_hwinfo plugin only partially working. By partially I mean that it works only when I run it manually.

In my '/usr/lib64/nagios/plugins' directory I have the 'check_nrpe_hwinfo.sh' script with the correct permissions:

[root@localhost plugins]# ls -lah | grep hwinfo
-rwxr-xr-x. 1 root root    419 Dec  8 15:35 check_nrpe_hwinfo.sh

In my 'conf.d' directory I have a 'check-hwinfo.cfg' file with the necessary declarations:

define command{
        command_name    check_hwinfo
        command_line    $USER1$/check_nrpe_hwinfo.sh $HOSTNAME$ $HOSTADDRESS$
}



define service{
        use                     generic-service
        hostgroup_name          1st-floor-windows-nrpe-hosts,2nd-floor-windows-nrpe-hosts
        service_description     HW Info
        notification_options    none
        normal_check_interval   240
        notification_interval   240
        retry_check_interval    2
        max_check_attempts      120
        check_command           check_hwinfo
}

On my Windows hosts I have the supplied 'check_hwinfo.wsf' file in 'C:\NSClient++\scripts'. When I double-click it, the script runs correctly and displays the info in a popup window. I have also modified the 'nsclient-full.ini' file like this:

[/settings/external scripts/scripts]
check_hwinfo=c:\windows\system32\cscript.exe //NoLogo //T:30 scripts\check_hwinfo.wsf
check_hwinfo_csv=c:\windows\system32\cscript.exe //NoLogo //T:30 scripts\check_hwinfo.wsf /sep:csv

When I run this command on my Nagios server, from the '/usr/lib64/nagios/plugins/' directory:

./check_nrpe -H 192.168.10.13 -c check_hwinfo

I get the correct output.

The check is supposed to run automatically, but in the Nagios WebUI I get this error on the line corresponding to check_hwinfo:

(Return code of 126 is out of bounds - plugin may not be executable) 

After some experimentation with Nagios I think this is just a generic error.

So... Any ideas why the check executes and returns properly when run manually but not when run automatically?

UPDATE 1:

The 'check_nrpe_hwinfo.sh' file looks exactly like this:

#!/bin/bash

ARG_HOSTNAME=${NAGIOS_HOSTNAME:-$1}
ARG_HOSTADDRESS=${NAGIOS_HOSTADDRESS:-$2}

PATH=${PATH}:/usr/lib64/nagios/plugins

HWINFO="`check_nrpe -H $ARG_HOSTNAME -c check_hwinfo_csv`"
RESULT=$?
ARG_HOSTNAME_CLEAN=`echo $ARG_HOSTNAME | tr -cd '0-9a-zA-Z._-'`

if [ "$RESULT" == 0 ]; then
        echo "\"$ARG_HOSTADDRESS\",$HWINFO" > /var/www/html/hwinfo/$ARG_HOSTNAME_CLEAN
fi
echo "$HWINFO"
exit $RESULT

UPDATE 2:

[root@localhost plugins]# ./check_nrpe -H 192.168.10.13 -c check_hwinfo_csv
"Gigabyte Technology Co., Ltd.","P55A-UD3","","1","Intel(R) Core(TM) i7 CPU         870  @ 2.93GHz","2927 MHz","8192 KB","133 MHz","8192M","Non-ECC","4096M/2048M/2048M/0","932 G / 932 G","WDC WD10EALS-002BA0 ATA Device / WDC WD10EZRX-00A8LB0 ATA Device","Microsoft Windows 7 Ultimate "
dlyk1988
  • what does check_nrpe_hwinfo.sh look like? – Mike Dec 08 '14 at 15:04
  • @Mike Please see update. – dlyk1988 Dec 08 '14 at 15:07
  • I realize that the check is being performed on a Windows machine, but I've had issues with Nagios checks in the past related to permissions on various Linux binaries that result in similar errors, for example with /usr/bin/ping. At issue is the fact that when you run it manually, you're running as root. But the Nagios user runs under a different user which doesn't have the same permissions. – David W Dec 08 '14 at 15:07
  • You have a different `-c` parameter when you invoke `check_nrpe` via the `check_nrpe_hwinfo.sh` script than when you do the manual test. – MadHatter Dec 08 '14 at 15:08
  • @MadHatter Indeed I do. But then again, the difference is handled on the client side (Windows machine), and the different command name is interpreted as an added argument. Also, when I manually run './check_nrpe -H 192.168.10.13 -c check_hwinfo_csv', again I get correct output. – dlyk1988 Dec 08 '14 at 15:15
  • My point is that your argument ("*it runs fine manually but not automatically*") is faulty, because you have no idea whether it runs fine manually. *Something else* runs fine manually; so what? Before blaming this on the manual-vs.-automatic difference, make sure you're checking the **right thing** manually. Show us what happens when from the NAGIOS server you manually run *exactly* the same command that you are asking NAGIOS to run. You say above you get the same output, but the command you show above **still** isn't the same. – MadHatter Dec 08 '14 at 15:18
  • @MadHatter Point taken! Anyway, it is amended now, and I am sure (almost) that what runs manually does not run automatically. – dlyk1988 Dec 08 '14 at 15:21
  • @MadHatter Also, see new update. – dlyk1988 Dec 08 '14 at 15:24
  • dsljanus, you keep running commands that aren't what you ask the NAGIOS server to do, and assuming that the difference is unimportant. Please, on the server **manually run the command that you're asking NAGIOS to run**: `check_nrpe_hwinfo.sh $HOSTNAME$ $HOSTADDRESS$`, substituting appropriate values for the variables, **and show us the output of that**. – MadHatter Dec 08 '14 at 15:26
  • What @DavidW said. If the nagios user doesn't have permission to execute the script on the linux side, you'll get a generic error. Double-check the permissions of the script on the linux side. Also check /var/log/messages and the nagios log for any useful troubleshooting information. – Katherine Villyard Dec 08 '14 at 15:26
  • What does the output of `ls -l` look like on check_nrpe_hwinfo.sh? – Katherine Villyard Dec 08 '14 at 15:30
  • @KatherineVillyard I have posted this, but here it is: -rwxr-xr-x. 1 root root 419 Dec 8 15:35 check_nrpe_hwinfo.sh – dlyk1988 Dec 08 '14 at 15:32
  • My bad. Can you make the permissions match any other plugins you're running? (I'm pretty sure nagios isn't running as root.) I'm pretty sure that either that specific script or something it calls is something that nagios can't run. – Katherine Villyard Dec 08 '14 at 15:34
  • @KatherineVillyard The permissions match those of 'check_nrpe', and that is definitely running. – dlyk1988 Dec 08 '14 at 15:35
  • @MadHatter You were right. When running './check_nrpe_hwinfo.sh 192.168.10.13' I get '-bash: ./check_nrpe_hwinfo.sh: /bin/bash^M: bad interpreter: No such file or directory'. That is strange. – dlyk1988 Dec 08 '14 at 15:36

2 Answers


You have misled yourself by not comparing apples with apples. The command you are running manually is not the command that you're asking NAGIOS to run automatically. When you run the actual command manually

check_nrpe_hwinfo.sh $HOSTNAME$ $HOSTADDRESS$

with appropriate substitutions, the problem comes to light. The file appears to have been transferred from a Windows box and has DOS-style (CRLF) line endings. The carriage return at the end of the shebang line becomes part of the interpreter name, so you are asking the system to launch an interpreter called bash^M, which does not exist. Run the file through dos2unix, or remove the trailing ^Ms with vi or another binary-capable editor, and all should be well.
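If dos2unix is not installed, the same check and fix can be sketched with standard tools. This is only an illustration under a few assumptions: `has_crlf` and `strip_crlf` are hypothetical helper names, `sed -i` with `\r` assumes GNU sed, and the demonstration works on a throwaway temp file rather than the real plugin.

```shell
#!/bin/bash
# Sketch only: has_crlf and strip_crlf are hypothetical helper names,
# and sed -i with \r assumes GNU sed (the default on most Linux systems).

# A literal carriage return, kept in a variable for portability.
cr=$(printf '\r')

# Succeeds (exit 0) if the file contains at least one carriage return.
has_crlf() {
    grep -q "$cr" "$1"
}

# Removes a trailing carriage return from every line, editing in place.
strip_crlf() {
    sed -i 's/\r$//' "$1"
}

# Demonstration on a throwaway file, so the real plugin is not touched.
demo=$(mktemp)
printf '#!/bin/bash\r\necho ok\r\n' > "$demo"

if has_crlf "$demo"; then
    echo "CRLF line endings found, converting"
    strip_crlf "$demo"
fi

# Show the first line byte by byte; no \r should remain before the \n.
head -n 1 "$demo" | od -c
```

To apply this to the real script, call `strip_crlf` on '/usr/lib64/nagios/plugins/check_nrpe_hwinfo.sh' and then re-run the exact command Nagios uses, ideally as the user Nagios runs under (often, but not necessarily, an account named nagios).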

MadHatter
  • Will check first thing tomorrow morning and report back ASAP. – dlyk1988 Dec 09 '14 at 00:06
  • Turns out I had this issue with the file, and some more. Fixed it and it runs from the server as you pointed out. But... I am now getting a "No handler..." even though there evidently is a handler since the check executes manually. – dlyk1988 Dec 09 '14 at 14:05

An awesome tool for debugging Nagios checks is PyNag: https://github.com/pynag/pynag/wiki

Depending on your distro, you can get it from packages or from GitHub.

# cd to folder with nagios.cfg
cd /etc/nagios/

# run pynag to see the actual command that will be executed
# Usage: pynag execute <host_name> [service_description]

pynag execute my_windows_host1 "HW Info"
eject