I'm troubleshooting a problem on one of my NTP servers. This server seems to defy logic, and I'm at my wits' end. Roughly every 162 seconds, the system clock reverts to (current time - 112 seconds), even though the hardware clock is accurate. This happens whether ntpd is running or not. When the system clock jumps, the hardware clock still reports accurate time (until some later point, when the hardware clock is synced back to the system clock).

I put together a small one-liner to demonstrate the problem:

$ date ; sudo /sbin/service ntpd stop ; date; sudo ntpdate -u time.nist.gov ; sudo /sbin/hwclock --systohc  --utc ; sudo /sbin/hwclock --hctosys --utc ; i=0 ; ss=$(/bin/date +%s) ; while [ $i -lt 240 ] ; do date ; ts=$(($(/bin/date +%s)-$ss)) ; /sbin/hwclock --show --utc ; echo "seconds since last sync: $ts" ; sleep 1 ; ((i++)) ; done
Tue Jan  6 03:44:41 UTC 2015
Shutting down ntpd:                                        [  OK  ]
Tue Jan  6 03:44:41 UTC 2015
 6 Jan 03:46:34 ntpdate[13092]: step time server 24.56.178.140 offset 112.261660 sec
Tue Jan  6 03:46:37 UTC 2015
Tue 06 Jan 2015 03:46:38 AM UTC  -0.994306 seconds
seconds since last sync: 0
Tue Jan  6 03:46:39 UTC 2015
Tue 06 Jan 2015 03:46:40 AM UTC  -0.995661 seconds
seconds since last sync: 2
Tue Jan  6 03:46:41 UTC 2015
Tue 06 Jan 2015 03:46:42 AM UTC  -0.995526 seconds
seconds since last sync: 4
Tue Jan  6 03:46:43 UTC 2015
Tue 06 Jan 2015 03:46:44 AM UTC  -0.995515 seconds
seconds since last sync: 6
Tue Jan  6 03:46:45 UTC 2015
Tue 06 Jan 2015 03:46:46 AM UTC  -0.995465 seconds
seconds since last sync: 8
Tue Jan  6 03:46:47 UTC 2015
Tue 06 Jan 2015 03:46:48 AM UTC  -0.995293 seconds
seconds since last sync: 10
Tue Jan  6 03:46:49 UTC 2015
Tue 06 Jan 2015 03:46:50 AM UTC  -0.995207 seconds

This goes on for a little bit, but eventually the system clock jumps backwards 112 seconds:

Tue Jan  6 03:47:07 UTC 2015
Tue 06 Jan 2015 03:47:08 AM UTC  -0.995297 seconds
seconds since last sync: 30
Tue Jan  6 03:45:16 UTC 2015
Tue 06 Jan 2015 03:47:10 AM UTC  -0.995259 seconds
seconds since last sync: -81
Tue Jan  6 03:45:18 UTC 2015
Tue 06 Jan 2015 03:47:12 AM UTC  -0.996067 seconds
seconds since last sync: -79
Tue Jan  6 03:45:20 UTC 2015
Tue 06 Jan 2015 03:47:14 AM UTC  -0.996148 seconds
seconds since last sync: -77
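
A step like the one above can also be caught without trusting the wall clock at all, by comparing it against the kernel's uptime counter, which setting the system time does not change. The following is a minimal sketch, assuming bash on Linux with `/proc/uptime` available; the function names `clock_step`, `uptime_secs`, and `watch_clock` are hypothetical helpers, not part of any existing tool.

```shell
#!/bin/bash
# Hypothetical helper: size of a wall-clock step between two samples.
# Arguments: previous wall seconds, previous uptime seconds,
#            current wall seconds,  current uptime seconds.
# Prints (wall elapsed - uptime elapsed). Near 0 means no step;
# a large negative value means the wall clock jumped backwards.
clock_step() {
    echo $(( ($3 - $1) - ($4 - $2) ))
}

# Read whole seconds of uptime from /proc/uptime (Linux-specific assumption).
uptime_secs() {
    local up
    read -r up _ < /proc/uptime
    echo "${up%%.*}"
}

# Sample once per second and report any step larger than the threshold (seconds).
watch_clock() {
    local threshold=${1:-5} pw pu cw cu step
    pw=$(date +%s); pu=$(uptime_secs)
    while sleep 1; do
        cw=$(date +%s); cu=$(uptime_secs)
        step=$(clock_step "$pw" "$pu" "$cw" "$cu")
        # ${step#-} strips a leading minus sign, giving the absolute value.
        if [ "${step#-}" -gt "$threshold" ]; then
            echo "uptime=${cu}s: wall clock stepped by ${step}s"
        fi
        pw=$cw; pu=$cu
    done
}
```

Running `watch_clock 5` in a spare terminal would print one line per step, stamped with the uptime at which it happened, which can then be lined up against process activity or logs from the same moment.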

Some minor details: this system is running CentOS 5.11 on bare metal (not a VM), uptime is 23 days, and the problem started a couple of months ago. I haven't had time to look at it until now, so I'm not sure whether it started around the time of some update or other activity on this server.

So my question is: what else besides ntpd could be updating the system clock? I have verified that no scheduled cron jobs touch the clock, and as far as I can tell no running daemon should be touching it either.

ben
  • Does anything appear in /var/log/messages when the time jump happens? Can you identify when this started, and if so, do your /var/log/yum.log* files go back far enough to show any packages added or changed around that time? That's not the only route to system changes, but it's the obvious one. – Paul Haldane Jan 06 '15 at 08:09
  • If you leave the system in this state does it stay at that offset (112 seconds from reality)? If it does then that gives a strong steer that something is synchronising to an external source (which is 112 seconds slow) since hardware clock is (roughly) correct. I can't think of any other mechanism for this unless something really odd is happening. – Paul Haldane Jan 06 '15 at 08:17

1 Answer

It turns out the cause was time drift on the Active Directory Domain Controllers this host was joined to. I fixed the time on the DCs and set them to sync to the NTP server. The version of Likewise the hosts were using didn't write any log entries when it changed the time, which is what made this so hard to troubleshoot.
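
For anyone chasing a similar symptom, one way to spot a drifting source is to query every host the machine could be taking time from and flag the ones that disagree with the local clock. This is only a sketch: `extract_offset`, `offset_exceeds`, and `check_sources` are hypothetical helpers, the host names in the usage note are placeholders, and it assumes `ntpdate -q` prints a summary line in the same "offset N sec" format as the ntpdate output shown in the question.

```shell
#!/bin/bash
# Pull the offset (seconds) out of an ntpdate summary line of the form
# "... step time server 24.56.178.140 offset 112.261660 sec".
extract_offset() {
    awk '{ for (i = 1; i < NF; i++)
               if ($i == "offset" && $(i + 2) == "sec") print $(i + 1) }'
}

# Succeed if the absolute value of offset $1 is greater than limit $2 (seconds).
offset_exceeds() {
    awk -v o="$1" -v t="$2" 'BEGIN { if (o < 0) o = -o; exit !(o > t) }'
}

# Query each host without setting the clock (-q) and report large offsets.
# Usage: check_sources LIMIT_SECONDS HOST...
check_sources() {
    local limit=$1 host off
    shift
    for host in "$@"; do
        off=$(ntpdate -q -u "$host" 2>&1 | extract_offset | tail -n 1)
        if [ -n "$off" ] && offset_exceeds "$off" "$limit"; then
            echo "$host disagrees with the local clock by ${off}s"
        fi
    done
}
```

A call such as `check_sources 1 time.nist.gov dc1.example.com dc2.example.com` (the DC names being placeholders) would list every source more than one second away from the local clock. A source that differs from all the others by the same amount as the observed jump is the likely culprit.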

ben
  • Ben: +1 from me on both question (a textbook well-written one, for my money) and for coming back to post your own answer when you'd run the problem to ground. This may help any number of people in the future - *thank you*! – MadHatter Jan 31 '15 at 07:24