2

Can someone answer? what is the failover time if a node goes down, I did see it take 2 to 3 seconds but do we have control to make it faster than this? if yes how much. Typically, we cannot take this hit on the application queue it is that sensitive.

sai
  • 165
  • 1
  • 10

1 Answers1

2

In YugabyteDB, tables are shared into tablets, and tablets are replicated among nodes using Raft (a distributed consensus protocol). Raft is also used to elect, for each tablet, one of the tablet's peers as the leader.

In a typical situation, a node will have many tablets-- some in follower role, and some in leader role. When a node fails, tablets for which this node is a leader can have a small amount of unavailability until new leaders are elected for those tablets. (Note: YugabyteDB is a CP database). This leader (re)election is triggered when the followers of a tablet have not heard from their leader for a certain number of heartbeats. The knobs that govern this, and hence determine the failover time, are the following gflags:

raft_heartbeat_interval_ms (default 500ms)

leader_failure_max_missed_heartbeat_periods (6)

In other words, by default, if a follower doesn't hear for 6 heartbeats from the leader, then after 6 * ~500 ms (the default heartbeat interval), i.e. about 3 seconds, new leaders will get elected.

It is possible to override the above settings to reduce the failover time. However, care must be exercised to not make this too aggressive, as that can cause leaders to unnecessarily ping/pong even for small network hiccups.