
When using a distributed lock service such as Consul to implement leader election, how do we handle a deposed leader that does not know it has been deposed?

The setup: there are 2 nodes consuming messages, but only 1 of them may acquire the lease on the distributed lock, become leader, and actually process messages. The other node must remain passive and become active only if the currently active node dies or becomes unhealthy or unresponsive. The scenario: the current leader is deposed in Consul, the lock it held is released, and the backup/passive node becomes the new leader. But for some reason the previous leader is not aware it has been deposed and still thinks it is the leader.

So we now have a situation where there are 2 leaders. This may happen, for example, because of a slow network or a garbage collection pause: the previous leader is slow and thinks its lease on the distributed lock has not expired, whereas in Consul the TTL on the lock has expired and the lock has therefore been released.
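To make the failure concrete, here is a minimal simulation of the scenario. It is only a sketch: `LockStore` and `Node` are hypothetical in-memory stand-ins I made up for illustration, not the Consul API, and time is passed in by hand instead of read from a clock.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LockStore:
    """Hypothetical in-memory stand-in for the lock service (not the Consul API)."""
    ttl: float
    holder: Optional[str] = None
    expires_at: float = 0.0

    def acquire(self, node: str, now: float) -> bool:
        # The lock is free if nobody holds it or the holder's TTL has lapsed.
        if self.holder is None or now >= self.expires_at:
            self.holder = node
            self.expires_at = now + self.ttl
            return True
        return False


class Node:
    """Hypothetical node that caches its own belief about leadership."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.believes_leader = False

    def try_lead(self, store: LockStore, now: float) -> None:
        if store.acquire(self.name, now):
            self.believes_leader = True


store = LockStore(ttl=10.0)
a, b = Node("A"), Node("B")

a.try_lead(store, now=0.0)   # A acquires the lock
b.try_lead(store, now=1.0)   # B fails; the lock is still held by A

# A stalls (e.g. a long GC pause) from t=1 to t=15 and never renews.
b.try_lead(store, now=12.0)  # the TTL has lapsed, so B acquires the lock

# At t=15 A resumes; nothing has updated its local flag yet.
leaders = [n.name for n in (a, b) if n.believes_leader]
print(leaders)  # ['A', 'B']
```

The store only ever records one holder (`B`), but both nodes' local flags say "leader", which is exactly the situation I am asking about.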

Consul does have the concept of a watch handler, which is basically a callback function within each node that is called whenever a new leader is elected. But if the callback is delayed, or for some reason never happens, there could still be 2 leaders.

One way could be to check that a node that thinks it is the leader is indeed the leader before it is permitted to do tasks that must be done only by a leader. Is this check very expensive to do? Are there any other ways of ensuring that there is only 1 leader at any point in time?
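Roughly, the check I have in mind looks like the sketch below. Again, `LockStore`, `current_holder`, and `run_if_leader` are hypothetical names for illustration; with a real lock service the lookup would be a network call rather than an attribute read.

```python
from typing import Callable, List, Optional


class LockStore:
    """Hypothetical in-memory stand-in for the lock service (not the Consul API)."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None

    def current_holder(self) -> Optional[str]:
        # With a real lock service this would be a network round trip.
        return self.holder


def run_if_leader(store: LockStore, node_id: str, task: Callable[[], None]) -> bool:
    """Run task only if the store still lists node_id as the lock holder."""
    if store.current_holder() != node_id:
        return False  # deposed: skip the leader-only work
    # Caveat: a pause right here can still let the lock change hands
    # before task() runs, so the check narrows the window, not closes it.
    task()
    return True


store = LockStore()
store.holder = "B"  # B is the holder on record; A still believes it leads

processed: List[str] = []
ran_a = run_if_leader(store, "A", lambda: processed.append("from A"))
ran_b = run_if_leader(store, "B", lambda: processed.append("from B"))
print(ran_a, ran_b, processed)  # False True ['from B']
```

As written, this costs one extra read against the lock service per guarded task, and it still leaves a gap between the check and the task itself.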
