
I've noticed that Munin graphs a few bits of information about timing/kernel statistics that I've never quite understood. Most of my servers seem to stay close to 0, which I presume is good, but one of them is slowly but steadily getting more and more negative on one of the graphs.

Munin graphs the following statistics over time:

  • NTP kernel PLL estimated error (secs)
  • NTP kernel PLL frequency (ppm + 0)
  • NTP kernel PLL offset (secs)
  • NTP timing statistics for system peer

Here's an example from munin's docs: http://demo.munin-monitoring.org/time-year.html

Searching around the web for a concise, understandable definition of NTP turns up nothing (except for a bunch of Nagios and Munin graphs), and searching Server Fault turns up a ton of answers that presume the reader knows something about NTP already.

Stack Overflow defines it thusly:

NTP stands for Network Time Protocol, and it is an Internet protocol used to synchronize the clocks of computers to some time reference.

But that seems a little vague. Does this affect, say, a web server, encryption, or database synchronization?

What is NTP, and why should I care? Are there any stats in particular I should make sure don't get out of control?

geerlingguy
  • Which one is steadily decreasing? If PLL frequency, it's just the crystal aging (or temperature changes). – David Schwartz Sep 14 '12 at 04:05
  • That's the one. It used to go + and - a lot, now it is just really slowly going into the negative territory. Currently at '-20.0'. – geerlingguy Sep 14 '12 at 17:55

1 Answer


NTP is a protocol that synchronizes the system clock (usually there is a daemon running on *nix boxes). In short, it makes sure that the time on the server is correct. There are many reasons it is important to have accurate time:

  • Some authentication schemes (such as Kerberos and AD auth) depend on the system having the correct time
  • When you troubleshoot things, having accurate timestamps in the logs can be vital
  • Many applications that run on a server use the system time to generate information they show to the user. Depending on the application, time can be critical (for example, knowing when a financial transaction happened)

I'm sure there are others, but keeping system time accurate is a standard responsibility of a system administrator. NTP does a lot of sophisticated things to this end (accounting and correcting for clock drift, etc.), so those detailed statistics can help you troubleshoot any issues that arise in fulfilling this role.
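To give a rough feel for the numbers, here is a minimal Python sketch (not what ntpd or Munin actually run). The first function converts a frequency error in ppm, such as the '-20.0' mentioned in the comments, into the drift a clock would accumulate per day if nothing corrected it. The second applies the standard NTP offset and round-trip delay formulas to the four timestamps of a single client/server exchange. The function names and sample timestamps are made up for illustration, and the sign convention of the graphed value is an assumption.

```python
SECONDS_PER_DAY = 86_400


def ppm_to_seconds_per_day(ppm: float) -> float:
    """Drift per day of an uncorrected clock that is off by `ppm` parts per million."""
    return ppm * 1e-6 * SECONDS_PER_DAY


def ntp_offset_and_delay(t0: float, t1: float, t2: float, t3: float) -> tuple[float, float]:
    """Estimate clock offset and round-trip delay from one NTP exchange.

    t0: client clock when the request was sent
    t1: server clock when the request was received
    t2: server clock when the reply was sent
    t3: client clock when the reply was received
    """
    # Offset: how far the server clock is ahead of the client clock,
    # assuming the network delay is the same in both directions.
    offset = ((t1 - t0) + (t2 - t3)) / 2
    # Delay: total time on the wire, excluding server processing time.
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


if __name__ == "__main__":
    # A frequency error of -20 ppm, taken from the comment thread above.
    print(f"{ppm_to_seconds_per_day(-20.0):.3f} s/day")

    # Hypothetical timestamps in seconds, chosen only for illustration.
    offset, delay = ntp_offset_and_delay(0, 10, 11, 3)
    print(f"offset={offset} s, delay={delay} s")
```

At 20 ppm a clock left uncorrected would be off by roughly 1.7 seconds per day, which is exactly the kind of drift the daemon is there to compensate for.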

Kyle Brandt
  • Especially poignant after reading http://www.theregister.co.uk/2012/07/02/leap_second_crashes_airlines/ - time is hard to get right... even on the microsecond level! – geerlingguy Sep 14 '12 at 17:57