I recently started to use Nagios to monitor about 25 servers (mainly virtual, with some standalone). The majority of the servers (including the Nagios host itself) are running Ubuntu 14.04 LTS, with a few running 12.04 LTS. Thus, I thought I could just utilize NRPE and be done with it.
Configuring NRPE has proven to be rather complex for me. For instance, for a simple check_disk command, I had to manually specify which partition to check by excluding every other partition/filesystem, as shown below:
command[check_disk]=/usr/lib/nagios/plugins/check_disk -w 57% -x /dev -x /run -x /run/lock -x /run/shm -x /run/user -x /sys/fs/cgroup
Otherwise, my warning and critical thresholds were immediately tripped by sysfs, proc, or other pseudo-filesystems.
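For comparison, here is a minimal sketch of the opposite approach: naming the one mount point to check with `-p` instead of excluding everything else. The command name `check_root_disk` is a hypothetical label, and the 20%/10% thresholds are simply borrowed from the localhost definition further down. I have not confirmed that this behaves identically on both 12.04 and 14.04.

```
# nrpe.cfg on the monitored host (sketch; command name and thresholds are assumptions)
# -p restricts check_disk to the given mount point, so no -x exclusions are needed
command[check_root_disk]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
```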
Then I took a look at the base service checks that the Nagios host performs on itself. They are listed in /usr/local/nagios/etc/localhost.cfg, which contains the following:
define service{
    use                     local-service   ; Name of service template to use
    host_name               localhost
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    }

define service{
    use                     local-service   ; Name of service template to use
    host_name               localhost
    service_description     Root Partition
    check_command           check_local_disk!20%!10%!/
    }

define service{
    use                     local-service   ; Name of service template to use
    host_name               localhost
    service_description     Current Users
    check_command           check_local_users!20!50
    }

define service{
    use                     local-service   ; Name of service template to use
    host_name               localhost
    service_description     Total Processes
    check_command           check_local_procs!250!400!RSZDT
    }

define service{
    use                     local-service   ; Name of service template to use
    host_name               localhost
    service_description     Current Load
    check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
    }

define service{
    use                     local-service   ; Name of service template to use
    host_name               localhost
    service_description     Swap Usage
    check_command           check_local_swap!20!10
    }

define service{
    use                     local-service   ; Name of service template to use
    host_name               localhost
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   0
    }

define service{
    use                     local-service   ; Name of service template to use
    host_name               localhost
    service_description     HTTP
    check_command           check_http
    notifications_enabled   0
    }
Which results in this on the dashboard:
This is PERFECT for me. This is exactly what I want every single host I add to show. Rather than messing around with custom commands, how exactly should I "copy" this to each host through the NRPE conf file so that I see all these specific services for each host I add? It's clear this is already here and already functions on the localhost. I'm struggling to wrap my head around the organization needed to make this happen.
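To make the question concrete, this is roughly the shape I imagine the answer taking. It is only a sketch under several assumptions: that the `check_local_*` commands are thin wrappers around the standard plugins (`check_disk`, `check_users`, `check_procs`, `check_load`), that a `check_nrpe` command is not yet defined on my Nagios server, and that my remote hosts would be members of a hostgroup, here called `linux-servers`. The NRPE command names and the `generic-service` template are placeholders for illustration, not something I have working.

```
# --- On each monitored host: nrpe.cfg (sketch; command names are hypothetical) ---
# Thresholds mirror the localhost service definitions above.
command[check_root_disk]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
command[check_users]=/usr/lib/nagios/plugins/check_users -w 20 -c 50
command[check_total_procs]=/usr/lib/nagios/plugins/check_procs -w 250 -c 400 -s RSZDT
command[check_load]=/usr/lib/nagios/plugins/check_load -w 5.0,4.0,3.0 -c 10.0,6.0,4.0

# --- On the Nagios server: object configuration (sketch) ---
# Generic wrapper: ask the NRPE daemon on the target host to run the named command.
define command{
    command_name            check_nrpe
    command_line            $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }

# One service definition per check, applied to every host in the hostgroup
# (assumes a hostgroup named linux-servers and a template named generic-service).
define service{
    use                     generic-service
    hostgroup_name          linux-servers
    service_description     Root Partition
    check_command           check_nrpe!check_root_disk
    }

define service{
    use                     generic-service
    hostgroup_name          linux-servers
    service_description     Current Users
    check_command           check_nrpe!check_users
    }

define service{
    use                     generic-service
    hostgroup_name          linux-servers
    service_description     Total Processes
    check_command           check_nrpe!check_total_procs
    }

define service{
    use                     generic-service
    hostgroup_name          linux-servers
    service_description     Current Load
    check_command           check_nrpe!check_load
    }
```

If that is the right idea, adding a new host would only mean installing NRPE with the same nrpe.cfg and adding the host to the hostgroup. PING, SSH, and HTTP presumably would not need NRPE at all, since the Nagios server can run those checks against the remote address directly.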
Thank you for any and all advice.