3

I am using the following PDF as a reference.

It says that lastApplied is the highest log entry applied to state machine, but how is that any different than the commitIndex?

Also is the matchIndex on leader just the commitIndex on followers? If not what is the difference?

Jal

3 Answers

4

Your observation is reasonable: most of the time, nextIndex equals matchIndex + 1, but it is not always the case.

For example, when a leader is first elected, matchIndex is initialized to 0, while nextIndex is initialized to the leader's last log index + 1.

The two fields differ because they serve different purposes. matchIndex is an accurate value: the highest index up to which the leader's and the follower's logs are known to match. nextIndex, by contrast, is only an optimistic "guess" at which index the leader should try in its next AppendEntries call. It can be a good guess (i.e. it equals matchIndex + 1), in which case AppendEntries succeeds, or a bad guess (e.g. right after a leader is elected), in which case AppendEntries fails and the leader decrements nextIndex and retries.
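The bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not code from any Raft implementation: the class `LeaderState`, the method `on_append_entries_reply`, and the parameter `last_sent_index` are hypothetical names, and the sketch assumes log indices start at 1.

```python
from dataclasses import dataclass, field


@dataclass
class LeaderState:
    """Hypothetical per-leader bookkeeping; names are illustrative only."""
    last_log_index: int
    peers: list
    next_index: dict = field(default_factory=dict)
    match_index: dict = field(default_factory=dict)

    def __post_init__(self):
        for peer in self.peers:
            # Optimistic guess: assume the peer already has the whole log.
            self.next_index[peer] = self.last_log_index + 1
            # Accurate value: nothing is confirmed as replicated yet.
            self.match_index[peer] = 0

    def on_append_entries_reply(self, peer, success, last_sent_index):
        if success:
            # The follower confirmed everything up to last_sent_index.
            self.match_index[peer] = max(self.match_index[peer], last_sent_index)
            self.next_index[peer] = self.match_index[peer] + 1
        else:
            # Bad guess: back off by one entry and retry later.
            self.next_index[peer] = max(1, self.next_index[peer] - 1)
```

Right after election, `next_index` and `match_index` disagree; they only line up as `match_index + 1 == next_index` once the follower has acknowledged a successful AppendEntries.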

As for lastApplied, it's simply another accurate value: the index up to which a server has applied log entries to its underlying state machine. It's similar to matchIndex in that both are accurate values rather than heuristic "guesses", but they mean different things and serve different purposes.

Lifu Huang
2

... lastApplied is the highest log entry applied to state machine, but how is that any different than the commitIndex?

These are different in a practical system because the component that commits the data in the log is typically separate from the component that applies it to the replicated state machine or database. The commitIndex is typically just nanoseconds, or maybe a few milliseconds, more up-to-date than lastApplied.
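A minimal sketch of that separation, assuming a toy key/value state machine and log indices that start at 1 (the `Server` class and `apply_committed` method are hypothetical names for illustration):

```python
class Server:
    """Toy server: the log holds (key, value) commands, indexed from 1."""

    def __init__(self):
        self.log = [None]       # slot 0 is unused so indices start at 1
        self.commit_index = 0   # highest entry known to be committed
        self.last_applied = 0   # highest entry applied to the state machine
        self.state = {}         # the state machine itself

    def apply_committed(self):
        # The applier lags behind the committer and catches up here.
        while self.last_applied < self.commit_index:
            self.last_applied += 1
            key, value = self.log[self.last_applied]
            self.state[key] = value
```

Between the moment `commit_index` is advanced and the moment `apply_committed` runs, `last_applied` is strictly smaller than `commit_index`; afterwards the two are equal.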

Is the matchIndex on leader just the commitIndex on followers? If not what is the difference?

They are different. There is a period of time when the data is on a server and not yet committed, such as during the replication itself.

The leader keeps track of the highest log index known to be replicated on each of its peers, so it only needs to send the entries that peer is still missing (those after matchIndex[peer]) instead of the whole log. This is especially useful if a peer is significantly behind the leader, because the leader can update it with a series of small AppendEntries calls, incrementally bringing the peer up to date.
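To make the gap between matchIndex and commitIndex concrete, here is a hedged sketch of how a leader might derive a new commitIndex from its matchIndex table: an index counts as committed once a majority of servers hold it and the entry is from the leader's current term. The function `advance_commit_index` and its parameters are hypothetical names; `log_terms[n]` is assumed to hold the term of entry `n`, with slot 0 unused.

```python
def advance_commit_index(commit_index, match_index, leader_last_index,
                         log_terms, current_term):
    """Return the highest index that can be considered committed."""
    cluster_size = len(match_index) + 1  # followers plus the leader
    for n in range(leader_last_index, commit_index, -1):
        # The leader always holds its own entries, hence the leading 1.
        replicated = 1 + sum(1 for m in match_index.values() if m >= n)
        if replicated * 2 > cluster_size and log_terms[n] == current_term:
            return n
    return commit_index


match_index = {"a": 5, "b": 3, "c": 2, "d": 1}
log_terms = [0, 1, 1, 2, 2, 2]
new_commit = advance_commit_index(1, match_index, 5, log_terms, current_term=2)
```

In this example follower `a` has replicated up to index 5, but only three of five servers hold index 3 and fewer hold anything higher, so `matchIndex["a"]` is ahead of the resulting commitIndex.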

Michael Deardeuff
  • Big thanks for the answer. Another quick question: is the information in `matchIndex` already captured in `nextIndex`, since `nextIndex - 1` should be the `matchIndex`? – Jal Sep 29 '17 at 21:11
-1
  1. Committed does not mean already applied; there is a time lag between the two, but eventually lastApplied will catch up to commitIndex.
  2. matchIndex[i], which is stored on the leader, is equal to follower i's commitIndex, and both are trying to catch up to nextIndex.
Aelous
  • No, `matchIndex[i]` is often **not** equal to follower `i`'s `commitIndex`. `matchIndex[i]` is the top index that has been replicated from leader to follower `i`. But there may be a majority of followers who haven't replicated that index yet, which means it's not yet committed. `commitIndex` (on leader, follower `i`, or both) may be much lower than `matchIndex[i]`. – jcsahnwaldt Reinstate Monica Dec 04 '22 at 06:15