I have a local NTP server running on the subnet so that the other nodes stay in sync without each one syncing against upstream servers. While setting up the check_ntp_time plugin for Nagios, however, I keep hitting a frustrating issue: Nagios reports a critical error for every local node that syncs against the local NTP server.
Here is the NTP config on the local NTP server. Note the upstream server entries and the restrict entry for the local subnet; according to my research, this is what qualifies the node as an NTP server that local nodes can sync against.
driftfile /var/lib/ntp/drift
# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default kod limited nomodify notrap nopeer noquery
restrict -6 default kod limited nomodify notrap nopeer noquery
# Permit all access over the loopback interface. This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
restrict 127.0.0.1
restrict -6 ::1
# Makes me able to answer requests from local nodes
restrict 10.0.0.0 mask 255.255.192.0 nomodify notrap
# My source
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org
server 2.centos.pool.ntp.org
logfile /var/log/ntp/server.log
statistics loopstats
statsdir /var/log/ntp/
filegen peerstats file peers type day link enable
filegen loopstats file loops type day link enable
On the local non-NTP-server nodes, everything is the same except that the restrict entry is removed and the server entries reference only the local NTP server: server ntp.example.com iburst. Every local node can resolve ntp.example.com.
The problem shows up when I run the following command from the Nagios server:
/usr/lib64/nagios/plugins/check_ntp_time -H node-a-1 -v
And the output:
sending request to peer 0
response from peer 0: offset -0.002921819687
sending request to peer 0
response from peer 0: offset -0.0001939535141
sending request to peer 0
re-sending request to peer 0
re-sending request to peer 0
re-sending request to peer 0
re-sending request to peer 0
re-sending request to peer 0
re-sending request to peer 0
discarding peer 0: stratum=0
overall average offset: 0
NTP CRITICAL: Offset unknown|
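The plugin output above only says that the peer was discarded with stratum=0. To see what a node actually puts in its reply, one option is to send a bare client request and decode the first header bytes. The following is a minimal sketch, not how check_ntp_time itself is implemented; the names `parse_ntp_header` and `query` are hypothetical helpers, and the field layout (leap indicator, version, mode, stratum, poll, precision) is taken from the NTPv4 packet format. A reply with leap indicator 3 means the responding clock reports itself as unsynchronized, and stratum 0 is not a valid synchronized stratum.

```python
import socket
import struct

NTP_PORT = 123
# 48-byte client request: first byte 0x23 = LI 0, version 4, mode 3 (client).
CLIENT_REQUEST = b"\x23" + 47 * b"\x00"


def parse_ntp_header(packet):
    """Decode the first four bytes of an NTP packet into named fields."""
    if len(packet) < 48:
        raise ValueError("NTP packet must be at least 48 bytes")
    # flags and stratum are unsigned; poll and precision are signed bytes.
    flags, stratum, poll, precision = struct.unpack("!BBbb", packet[:4])
    return {
        "leap": flags >> 6,             # 3 = clock unsynchronized (alarm)
        "version": (flags >> 3) & 0x7,
        "mode": flags & 0x7,            # 4 = server reply
        "stratum": stratum,             # 0 = unspecified / invalid
        "poll": poll,
        "precision": precision,
    }


def query(host, timeout=2.0):
    """Send one client request to host (hypothetical helper) and decode the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(CLIENT_REQUEST, (host, NTP_PORT))
        data, _ = sock.recvfrom(512)
    return parse_ntp_header(data)

# Example use from the Nagios server (requires network access to the node):
#   print(query("node-a-1"))
```

Running `query` against both the local NTP server and one of the failing nodes would show whether the two return different leap or stratum values to the Nagios host.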
This happens for all the nodes except the local NTP server, which references upstream servers. At first I thought it was an iptables issue, but I have the port pinholed on every local NTP node (to allow Nagios to check the time difference):
ACCEPT udp -- eth0 * 10.0.0.0/18 0.0.0.0/0 multiport dports 123 /* 777 allow ntp access */ state NEW
Versions:
nagios-plugins-ntp: 1.4.16
ntp: 4.2.6p5-1.el6.centos
Any help is greatly appreciated. I can't submit the Nagios work until this is resolved, and as you know, keeping server times in sync is priority 1.
-- Edit --
Per the comments, here are the results of ntpq -p on various nodes:
# Actual NTP Server (10.0.0.2)
==============================================================================
+propjet.latt.ne 241.199.164.101 2 u 105 128 337 14.578 12.954 7.138
+x2la01.hostigat 63.145.169.2 3 u 21 128 377 16.037 13.546 4.090
*pacific.latt.ne 241.199.164.101 2 u 72 128 377 15.148 24.434 7.403
# Local node 1
==============================================================================
*service-a-1.sn1 204.2.134.163 3 u 9 128 377 0.228 5.217 1.296
# Local node 2
==============================================================================
*service-a-1.sn1 204.2.134.163 3 u 91 128 377 0.200 3.608 1.167
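For reading the listings above: the value printed as 337 or 377 is the reach field, which ntpq prints in octal as an 8-bit shift register of the most recent polls. A small sketch for decoding it follows; `decode_reach` is a hypothetical helper name, not part of any NTP tool.

```python
def decode_reach(reach_octal):
    """Return 8 booleans for an ntpq reach value; index 0 is the most recent poll."""
    value = int(reach_octal, 8)
    # Each bit records whether a poll got a valid reply; bit 0 is the newest.
    return [bool((value >> i) & 1) for i in range(8)]


# 377 means all of the last eight polls were answered.
print(decode_reach("377"))
# 337 means one poll, five intervals before the most recent one, went unanswered.
print(decode_reach("337"))
```

By that reading, both local nodes have received replies from the local NTP server for all of their last eight polls.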