Questions tagged [raft]

A distributed consensus protocol designed to be easy to understand. It's equivalent to Paxos in fault-tolerance and performance.

Raft is a consensus algorithm that is designed to be easy to understand. It's equivalent to Paxos in fault-tolerance and performance. The difference is that it's decomposed into relatively independent subproblems, and it cleanly addresses all major pieces needed for practical systems. We hope Raft will make consensus available to a wider audience, and that this wider audience will be able to develop a variety of higher quality consensus-based systems than are available today.

Consensus is a fundamental problem in fault-tolerant distributed systems. Consensus involves multiple servers agreeing on values. Once they reach a decision on a value, that decision is final. Typical consensus algorithms make progress when any majority of their servers are available; for example, a cluster of 5 servers can continue to operate even if 2 servers fail. If more servers fail, they stop making progress (but will never return an incorrect result).
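The majority arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function names `majority`, `tolerated_failures`, and `can_make_progress` are hypothetical helpers, not part of any Raft library.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many servers may fail while a majority is still available."""
    return cluster_size - majority(cluster_size)


def can_make_progress(cluster_size: int, available: int) -> bool:
    """True if the available servers still form a majority."""
    return available >= majority(cluster_size)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"cluster={n} majority={majority(n)} tolerates={tolerated_failures(n)}")
    # A 5-server cluster keeps working with 3 servers up, but not with 2.
    print(can_make_progress(5, 3), can_make_progress(5, 2))
```

Note that a 4-server cluster tolerates only one failure, the same as a 3-server cluster, which is one reason odd cluster sizes are the usual choice.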

Consensus typically arises in the context of replicated state machines, a general approach to building fault-tolerant systems. Each server has a state machine and a log. The state machine is the component that we want to make fault-tolerant, such as a hash table. It will appear to clients that they are interacting with a single, reliable state machine, even if a minority of the servers in the cluster fail. Each state machine takes as input commands from its log. In our hash table example, the log would include commands like set x to 3. A consensus algorithm is used to agree on the commands in the servers' logs. The consensus algorithm must ensure that if any state machine applies set x to 3 as the nth command, no other state machine will ever apply a different nth command. As a result, each state machine processes the same series of commands and thus produces the same series of results and arrives at the same series of states.
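As a rough sketch of the hash table example, the snippet below replays one agreed log on two independent state machines and checks that they end in the same state. It assumes consensus has already produced the log and does not implement Raft itself; `SetCommand`, `HashTableStateMachine`, and `replay` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SetCommand:
    """A log entry of the form 'set key to value'."""
    key: str
    value: int


@dataclass
class HashTableStateMachine:
    """The component being made fault-tolerant: a plain hash table."""
    table: dict = field(default_factory=dict)
    last_applied: int = 0  # number of log entries applied so far

    def apply(self, command: SetCommand) -> int:
        # Applying is deterministic: same command in the same state gives the same result.
        self.table[command.key] = command.value
        self.last_applied += 1
        return command.value


def replay(log):
    """Apply every command in log order to a fresh state machine."""
    machine = HashTableStateMachine()
    results = [machine.apply(command) for command in log]
    return machine, results


# Assumed to be the log that the consensus algorithm has agreed on.
agreed_log = [SetCommand("x", 3), SetCommand("y", 7), SetCommand("x", 5)]

server_a, results_a = replay(agreed_log)
server_b, results_b = replay(agreed_log)
print(server_a.table == server_b.table, results_a == results_b)
```

Because every server applies the same nth command, the tables and the per-command results match on every server that has applied the same prefix of the log.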

259 questions

1 answer

How does Raft deal with delayed replies in AppendEntries RPC?

I came up with a question while reading the Raft paper. The scenario is as follows. There are 3 newly started Raft instances, R1, R2, R3. R1 is elected as leader, with nextIndex {1, 1, 1}, matchIndex {0, 0, 0} and term 1. Now it receives command 1…

asked Jun 20 '19 at 01:00

pikatao

2 answers

Can RAFT as a protocol support only leader election?

I need to execute certain jobs at regular intervals (say every minute). If a single node does this, we have a single point of failure. To avoid this, I was thinking of the following scheme: 1. Nodes form a Raft cluster, with leader election 2. Only the…

distributed-system raft

asked Jan 16 '18 at 01:31

coder_bro

3 answers

What are lastApplied and matchIndex in the Raft protocol's volatile server state?

I am using the following pdf as reference. It says that lastApplied is the highest log entry applied to the state machine, but how is that any different from the commitIndex? Also, is the matchIndex on the leader just the commitIndex on followers? If not…

algorithm distributed-system consensus raft

asked Sep 23 '17 at 05:14

Jal

2 answers

Raft nodes count

The Raft leader node sends AppendEntries RPCs to all followers. Obviously we increase network usage when we add a new follower, so my question is how many nodes we can add to a cluster. In the Raft paper and in other places I read that 5 nodes in cluster…

algorithm consensus raft

asked Feb 23 '17 at 17:09

gigovich

1 answer

Raft: How to solve the performance bottleneck of leader node?

In Raft, all operation requests are forwarded to the leader node, and the leader then sends log entries to all followers. So under heavy load, the leader node becomes a bottleneck. How can this be solved?

algorithm distributed consensus raft

asked Oct 17 '16 at 03:36

huron

1 answer

Raft: some questions about read-only queries

Chapter 6.4 of the Raft thesis gives steps to bypass the Raft log for read-only queries and still preserve linearizability: If the leader has not yet marked an entry from its current term committed, it waits until it has done so.…

raft

asked May 13 '16 at 10:31

kingluo

2 answers

What will happen to replicated but uncommitted logs in the Raft protocol?

Suppose a 3-member Raft cluster a[master], b, c. The client sends the log to a, a replicates it to b and c, a applies the log to the state machine and responds to the client. Then a crashes before b and c have a chance to replicate the committed state to b…

consensus raft

asked Jan 08 '16 at 08:15

bigwesthorse

0 answers

6.824 raft applyCh deadlock

There is a bug in my 6.824 Raft implementation: it fails TestSnapshotAllCrash2D, but passes all the earlier tests. The program gets stuck while trying to apply, like this: ... [Term 1 Server 2] starting new command in 11 for…

go raft

asked Apr 05 '23 at 09:53

sakamoto

1 answer

How does Raft guarantee that a leader can always be elected?

The Raft paper says: Raft uses the voting process to prevent a candidate from winning an election unless its log contains all committed entries. A candidate must contact a majority of the cluster in order to be elected, which means that every…

replication distributed-computing distributed consensus raft

asked Apr 08 '22 at 18:57

user9845

1 answer

How does Raft handle a prolonged network partition?

Consider that we are running Raft on 3 machines: A, B, C and let A be the leader. There is a network partition that splits C from A and B. Call the current term t. A and B remain on term t, with no additional messages besides periodic heartbeats. At…

networking distributed consensus raft

asked Jan 12 '22 at 00:06

Andy

1 answer

How to set up Hashicorp Vault with Consul on OpenShift

I implemented Hashicorp Vault with Raft, but my organization now wants to change from Raft to Consul, i.e. remove the present Vault cluster and re-install it with Consul, but I found in the official Hashicorp documentation as the given…

openshift consul hashicorp-vault raft

asked Oct 27 '21 at 15:22

Shaik Nasrulla Sharif

1 answer

If a peer's ledger is tampered with in a Hyperledger Fabric network, how can the previous state of the ledger be restored?

My Hyperledger Fabric network consists of 1 orderer, 1 organization and 3 peers. I tampered with the ledger of the first peer and then tried to do another transaction on the same peer, and the following error was thrown: "Error: deliver completed with status…

hyperledger-fabric blockchain hyperledger-chaincode raft tampering

asked Aug 03 '21 at 15:45

Kushal Mahajan

0 answers

How to run TLC checker on Raft's TLA+?

I want to run Raft's TLA+ implementation, so I built a new module and set it up like the following: However, TLC generates lots of states, and it seems that it will never stop. It occurs to me that maybe I should limit the length of messages and…

raft tla+

asked Jul 28 '21 at 09:29

calvin

2 answers

Only Clients can write to RAFT leader -- choke point

I don't see a huge advantage of RAFT in implementing distributed DBs. If the clients can only write to the leader, then that leader still becomes the choke point, or the single point of failure. Ideally, I would want a way in which multiple clients…

database distributed-computing distributed-system raft

asked Jun 14 '21 at 03:19

Abhishek

1 answer

What happens if I add a new node to a CockroachDB cluster with more storage than existing nodes?

Having a 3x 1TB CockroachDB cluster, what happens if I add a single 4 TB node? Presumably only some of the 4TB can be used as not all can be replicated? If I add 3 new nodes with 4TB each, can all disk space be replicated/used?

distributed-computing cockroachdb raft

asked Mar 10 '21 at 11:22

Tobias Mühl
