How does Raft deal with delayed replies in AppendEntries RPC?

Question

A question came up while I was reading the Raft paper. The scenario is as follows. There are 3 newly started Raft instances, R1, R2, and R3. R1 is elected leader in term 1, with nextIndex {1, 1, 1} and matchIndex {0, 0, 0}. It then receives command 1 from the client, and the logs of the instances are as follows:

R1: [index 0, command 0], [index 1, command 1]
R2: [index 0, command 0]
R3: [index 0, command 0]

What if the network is unreliable? If R1 sends this entry to R2 but the AppendEntries RPC times out, R1 has to resend [index 1, command 1]. It may then receive the reply {term: 1, success: true} twice.

The paper says:

If last log index ≥ nextIndex for a follower: send AppendEntries RPC with log entries starting at nextIndex
• If successful: update nextIndex and matchIndex for follower (§5.3)
• If AppendEntries fails because of log inconsistency: decrement nextIndex and retry (§5.3)

So the leader R1 will increase nextIndex and matchIndex twice, giving nextIndex {1, 3, 1} and matchIndex {0, 2, 0}, which is incorrect. When the leader sends the next AppendEntries RPC (a heartbeat or log replication), it can fix nextIndex, but matchIndex will never have a chance to be fixed.

My solution is to add a sequence number to both the AppendEntries arguments and results for every RPC call. However, I was wondering whether there is a way to solve this problem using only the arguments given in the paper, that is, without the sequence number.

Any advice will be appreciated and thank you in advance.

score 4 · Accepted Answer · answered Jun 20 '19 at 01:10

The protocol assumes that there’s some context with respect to which AppendEntries RPC a follower is responding to. So, at some level there does need to be a sequence number (or more accurately a correlation ID), whether that be at the protocol layer, messaging layer, or in the application itself. The leader has to have some way to correlate the request with the response to determine which index a follower is acknowledging.
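As a rough illustration of that correlation, here is a minimal Python sketch in which the reply handler is given the original request alongside the reply, as many RPC layers allow. The class and method names (`Leader`, `handle_reply`, and so on) are made up for this example and do not come from the paper. The leader derives the acknowledged index from the request it sent rather than incrementing a counter, so a duplicate reply has no further effect. Stepping down on a higher term and backing off nextIndex on failure are left out.

```python
from dataclasses import dataclass, field


@dataclass
class AppendEntriesArgs:
    term: int
    prev_log_index: int
    entries: list


@dataclass
class AppendEntriesReply:
    term: int
    success: bool


@dataclass
class Leader:
    term: int
    next_index: dict = field(default_factory=dict)
    match_index: dict = field(default_factory=dict)

    def handle_reply(self, follower, args, reply):
        # Only act on replies that belong to the current term.
        # (Stepping down on a higher term is omitted in this sketch.)
        if reply.term != self.term or args.term != self.term:
            return
        if not reply.success:
            return  # nextIndex backoff on log inconsistency omitted
        # Derive the acknowledged index from the request that was sent,
        # instead of incrementing by one per reply.
        acked = args.prev_log_index + len(args.entries)
        # max() keeps matchIndex from moving backwards on stale replies.
        self.match_index[follower] = max(self.match_index[follower], acked)
        self.next_index[follower] = self.match_index[follower] + 1


# The scenario from the question: the same entry is acknowledged twice.
leader = Leader(term=1, next_index={"R2": 1}, match_index={"R2": 0})
args = AppendEntriesArgs(term=1, prev_log_index=0, entries=["command 1"])
reply = AppendEntriesReply(term=1, success=True)
leader.handle_reply("R2", args, reply)
leader.handle_reply("R2", args, reply)  # delayed duplicate
print(leader.next_index, leader.match_index)
```

With this approach the request itself acts as the correlation context, so the two replies leave R2 at nextIndex 2 and matchIndex 1 rather than 3 and 2.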

But there’s actually an alternative to this that’s not often discussed. Some modifications of the Raft protocol have the followers send their last log index in responses. You could also use that last log index to determine which indexes have been persisted on the follower.
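A small Python sketch of this alternative follows, under some loud assumptions. The `last_log_index` reply field and the function names are hypothetical, log entries are plain strings, and comparing entries for equality stands in for the paper's term comparison. In this sketch the follower reports the last index covered by the request it just processed, so that any stale trailing entries in its log are not counted.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    term: int
    success: bool
    last_log_index: int  # hypothetical extra field, not in the paper's RPC


def follower_append(log, term, prev_log_index, entries):
    """Apply an AppendEntries request to `log` and report the acked index."""
    # Simplified consistency check: the previous entry must exist.
    if prev_log_index >= len(log):
        return Reply(term, False, len(log) - 1)
    for offset, entry in enumerate(entries):
        index = prev_log_index + 1 + offset
        if index < len(log):
            if log[index] != entry:
                # Conflict: drop this entry and everything after it.
                del log[index:]
                log.append(entry)
        else:
            log.append(entry)
    # Report the last index known to match the leader for this request.
    return Reply(term, True, prev_log_index + len(entries))


def leader_on_reply(match_index, next_index, follower, reply):
    """Update leader state from the index the follower reported."""
    if not reply.success:
        return
    # Assigning (not incrementing) makes duplicate replies harmless.
    match_index[follower] = max(match_index[follower], reply.last_log_index)
    next_index[follower] = match_index[follower] + 1


# The scenario from the question: R2 receives the same request twice.
r2_log = ["command 0"]
match_index, next_index = {"R2": 0}, {"R2": 1}
for _ in range(2):
    reply = follower_append(r2_log, 1, 0, ["command 1"])
    leader_on_reply(match_index, next_index, "R2", reply)
print(r2_log, next_index, match_index)
```

Because the reply carries the index itself, the leader does not need to remember which request a reply belongs to, and the retried request leaves both the follower's log and the leader's bookkeeping unchanged.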

thank you so much! both hidden context and alternation are helpful to me! — pikatao, Jun 20 '19 at 20:35
