I'm learning Raft from the paper's extended version. In section 5.2 (Leader Election) of the paper, it says:

If a follower receives no communication over a period of time called the election timeout, then it assumes there is no viable leader and begins an election to choose a new leader.

At the same time, the paper says in some cases an RPC can be rejected, for example when it contains a smaller term number.

My question is: when should a follower recognize an RPC as a valid "communication" and record it to prevent itself from timing out?


Edit:

My current implementation is as follows:

  • RequestVote resets the timeout only when the server grants vote
  • AppendEntries resets the timeout if its term is no smaller than the server's

This works fine in most cases, but sometimes causes a long election. Consider a Raft cluster with 2 servers, both followers. Server #1 has a more up-to-date log, but server #2 has a larger term.

In this setting, server #1 has to start two elections in a row to become leader, which (intuitively) happens with <50% probability. If server #2 starts an election and times out, its term increases and the next election started by server #1 will fail again. In practice this can make the whole election last several seconds even when there are only a few servers. Are there approaches that solve this problem, or is it in fact not a problem?

IcicleF

1 Answer

A Raft node that is serving as a Follower responds to two types of requests:

  • AppendEntries from the Leader
  • RequestVote from a Candidate

If a Follower receives an AppendEntries from the current Leader, it should run all the checks (i.e. the term from the request and log matching) and, if they all pass, append the entries from the request. The Follower should also reset its election timeout when it receives an AppendEntries from the current Leader, because AppendEntries also serves as a heartbeat: Leaders periodically send AppendEntries requests with no log entries to prevent Followers from timing out and starting a new election.

If a Follower receives a RequestVote RPC, and if the Follower decides to grant its vote to that Candidate, the Follower will also reset its election timeout.
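The two rules above can be sketched roughly as follows. This is a minimal illustration, not a complete Raft implementation: the `Follower` class, its field names, the injected `clock`, and the representation of the log as a list of entry terms are all assumptions made for the example. Appending entries, persistence, and the RPC layer are left out.

```python
import time


class Follower:
    """Hypothetical follower state; only what the timer rules need."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock            # injected so the timer can be tested
        self.current_term = 0
        self.voted_for = None
        self.log_terms = []           # term of each log entry, index 1 first
        self.last_heard = clock()     # time of the last valid communication

    def _reset_election_timer(self):
        self.last_heard = self.clock()

    def _observe_term(self, term):
        # A higher term always updates our own term and clears our vote.
        if term > self.current_term:
            self.current_term = term
            self.voted_for = None

    def on_append_entries(self, term, prev_index, prev_term):
        """Return True if the consistency check passes."""
        if term < self.current_term:
            return False              # stale leader: timer is NOT reset
        self._observe_term(term)
        # The sender is the current leader, so this counts as a heartbeat
        # even if the log-matching check below fails.
        self._reset_election_timer()
        if prev_index > len(self.log_terms):
            return False
        if prev_index > 0 and self.log_terms[prev_index - 1] != prev_term:
            return False
        return True                   # (appending entries is omitted here)

    def on_request_vote(self, term, candidate_id, last_log_index, last_log_term):
        """Return True if the vote is granted."""
        if term < self.current_term:
            return False
        self._observe_term(term)
        my_last_term = self.log_terms[-1] if self.log_terms else 0
        my_last_index = len(self.log_terms)
        up_to_date = (last_log_term, last_log_index) >= (my_last_term, my_last_index)
        if self.voted_for in (None, candidate_id) and up_to_date:
            self.voted_for = candidate_id
            self._reset_election_timer()   # reset only when the vote is granted
            return True
        return False                  # rejected vote: timer is NOT reset
```

Note that in this sketch a rejected RequestVote with a higher term still updates `current_term`, but it does not touch the timer.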

msantl
  • Thanks for the answer. This is very similar to my implementation, but I am still confused about some points. The first is whether we should reset the timeout if we reject an AppendEntries only because the previous log entry does not match. The second is I think sometimes this implementation can result in a long election process (please see my Edit). – IcicleF Apr 06 '21 at 02:43
  • Good point, my answer is not clear about that. A Follower should reset the election timeout on an AppendEntries from the current Leader before checking the log matching property. Even if the Follower then rejects that AppendEntries RPC, the request still came from the current Leader, and the Leader will use the rejection to decrement the `nextIndex` value in order to handle inconsistencies by forcing the Followers' logs to duplicate its own. I've edited the answer to be more clear about that. – msantl Apr 06 '21 at 10:57
  • In the Raft paper, Section 5.6 (Timing and availability) describes the election timeout implementation and how to shorten the leader election phase by choosing election timeouts randomly from a fixed interval (e.g., 150–300ms). You can try tuning the interval for the election timeouts in your setup. Ideally you would never run a Raft cluster of only 2 nodes, so the cluster would converge faster. – msantl Apr 06 '21 at 11:04
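
As an illustration of the randomized timeout mentioned in the last comment, here is a minimal sketch. The function name `random_election_timeout` and its parameters are hypothetical; the 150–300 ms defaults are the example interval quoted above, not a recommendation for any particular deployment.

```python
import random


def random_election_timeout(low_ms=150, high_ms=300, rng=random):
    """Pick a fresh election timeout, in seconds, from [low_ms, high_ms]."""
    return rng.uniform(low_ms, high_ms) / 1000.0
```

Drawing a new value each time the timer is reset (rather than once at startup) makes it less likely that two servers keep timing out together.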