
I want to know where I can configure the clock skew limit for my Ceph monitors. Also, how does Ceph raise this clock skew error? Specifically, which file does it come from, and where can I find that file so that I can edit it?

I am already running NTP and everything's working fine and I am not getting any skew errors.

I want to write a shell script that sends me a mail when this skew error occurs or when the skew reaches a threshold of my choosing.

0xF2

1 Answer


To cite the documentation:

This value is configurable via the mon-clock-drift-allowed option, and although you CAN it doesn’t mean you SHOULD.

So keep the default. Really. It is advised to use chrony for time synchronisation. You have to get the time synchronisation right, because it is vital for the monitors.

Within our cluster, a rebooted monitor reports clock skew for about 30 seconds, until chrony has synchronised the server's time. The cluster then returns to HEALTH_OK.

Every Ceph node should run chrony, even the non-monitor nodes, if only to keep log timestamps consistent. If chrony can't reach upstream time sources, you can use it to establish your own synchronised time source.

More details on the monitors:

The monitors keep track of the cluster state: the PGs and the placement of the objects within the PGs. All of this data is kept in sync across the monitors. The synchronisation process is most probably very time-sensitive, since Ceph is trying to provide low-latency network storage. I can't point to the specific bad things that could happen with a larger time drift, but I sense trouble.
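
For the alerting part of the question, here is a minimal sketch in Python rather than shell. It assumes that `ceph time-sync-status` is available on your release and that its JSON output contains a `time_skew_status` map keyed by monitor name, with a numeric `skew` field in seconds for each monitor; check this against the output on your own cluster first. The threshold, the mail addresses and the local SMTP relay are hypothetical placeholders. Ceph's own default for the allowed drift is 0.05 seconds, so an alert threshold below that would warn you before the cluster itself complains.

```python
import json
import smtplib
import subprocess
from email.message import EmailMessage

# Hypothetical alert threshold in seconds, set below Ceph's 0.05 s default.
THRESHOLD_SECONDS = 0.03


def skewed_monitors(status, threshold):
    """Return {monitor_name: skew} for monitors whose |skew| exceeds threshold.

    `status` is the parsed JSON of `ceph time-sync-status`; the
    `time_skew_status` / `skew` layout is an assumption about that output.
    """
    offenders = {}
    for name, info in status.get("time_skew_status", {}).items():
        skew = float(info.get("skew", 0.0))
        if abs(skew) > threshold:
            offenders[name] = skew
    return offenders


def fetch_time_sync_status():
    """Run the ceph CLI and parse its JSON output."""
    result = subprocess.run(
        ["ceph", "time-sync-status", "--format", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def send_alert(offenders, sender, recipient, smtp_host="localhost"):
    """Mail the list of skewed monitors via an SMTP relay (assumed local)."""
    msg = EmailMessage()
    msg["Subject"] = "Ceph monitor clock skew above threshold"
    msg["From"] = sender
    msg["To"] = recipient
    lines = [f"{name}: {skew:+.6f} s" for name, skew in sorted(offenders.items())]
    msg.set_content("\n".join(lines))
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)


def main():
    offenders = skewed_monitors(fetch_time_sync_status(), THRESHOLD_SECONDS)
    if offenders:
        # Placeholder addresses; replace with your own.
        send_alert(offenders, "ceph@example.com", "admin@example.com")
```

Calling `main()` from a cron job every minute or so would be one way to run it. Since a freshly rebooted monitor can show skew for a short while, you may want to alert only when the same monitor exceeds the threshold on several consecutive runs.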

itsafire
  • I know about mon-clock-drift-allowed. What I am asking is where the drift value is passed in to perform the check, for example between mon1 and mon2. –  Zeeshan Haris Dec 13 '19 at 11:30