Questions tagged [consensus]

Consensus is the problem of reaching agreement among the members of a group. In computing terms, it is agreement on a value that the nodes participating in a cluster need during a computation. Reaching consensus in a distributed environment is challenging and, under certain conditions, not even possible. Consensus algorithms are often used to replicate a state machine as a general approach to enhancing fault tolerance.

In the context of distributed computing, consensus is a fundamental problem that has received intense research attention, because problems such as atomic compare-and-swap (CAS) registers or atomic transaction commit can be reduced to it. Furthermore, consensus is the foundation for applications such as leader election, clock synchronisation and state machine replication. Although the problem can be stated in simple words, solving it is subtle and under certain circumstances not even possible.

In general, consensus is the problem of reaching agreement among the members of a group. In computing terms, it is agreement on a value that the nodes participating in a cluster need during a computation. This may not sound complicated, but it becomes so because a distributed system is subject to different types of unpredictable partial failures like:

loss of packets
duplication/reordering of packets
arbitrarily delayed packet delivery
pause or even crash of a node

can and will happen, all of which make the problem harder to solve. A consensus algorithm therefore needs to satisfy certain properties:

Agreement: All non-faulty nodes that decide must decide on the same value.
Integrity: No node decides twice.
Validity: If a node decides on value v, then v was proposed by some node.
Termination: Every non-faulty node eventually decides on a value.
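
The four properties can be written as checks over the record of a finished run. The following is only a minimal sketch: the `Run` record and the `check_properties` function are hypothetical names made up for illustration and are not taken from any library. Since termination is a liveness property, "eventually" is read here as "by the end of the recorded run", which is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """Record of one finished execution of a hypothetical consensus protocol."""
    proposed: dict   # node id -> value that node proposed
    decisions: dict  # node id -> list of values that node decided, in order
    faulty: set = field(default_factory=set)  # ids of nodes that failed


def check_properties(run):
    """Return which of the four consensus properties hold for this run."""
    correct = [n for n in run.proposed if n not in run.faulty]
    decided_by_correct = [v for n in correct for v in run.decisions.get(n, [])]
    all_decided = [v for values in run.decisions.values() for v in values]
    return {
        # All non-faulty nodes that decided chose the same value.
        "agreement": len(set(decided_by_correct)) <= 1,
        # No node decided more than once.
        "integrity": all(len(values) <= 1 for values in run.decisions.values()),
        # Every decided value was proposed by some node.
        "validity": all(v in run.proposed.values() for v in all_decided),
        # Every non-faulty node decided by the end of the recorded run.
        "termination": all(len(run.decisions.get(n, [])) >= 1 for n in correct),
    }
```

For example, a run in which every node decides on a value that nobody proposed satisfies agreement, integrity and termination but fails validity.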

Termination is a liveness property and requires the algorithm to be fault-tolerant. Without this property, an algorithm could simply use a predefined leader that decides on values (as in 2PC). But if that leader failed, the algorithm would make no further progress, hence the termination requirement. Agreement and integrity are safety properties and form the core of a consensus algorithm. Validity rules out the trivial solution in which nodes always decide on the same predefined value and thereby reach consensus by definition.

Whether a solution to the consensus problem, that is, an actual algorithm, exists depends on the system model. One therefore has to distinguish between synchronous and asynchronous system models¹. Since sending messages mostly relies on a shared communication channel, at least the communication is asynchronous, so real-world applications fit the asynchronous model better. Unfortunately, for this kind of environment there is an impossibility proof by Fischer et al. (1), known as the FLP impossibility result, which shows that in a fully asynchronous model no algorithm always reaches consensus if even a single node may be faulty (not even considering Byzantine failures). By proving that every possible algorithm has an execution that never terminates, they showed that no algorithm can guarantee that agreement is always reached.

Nevertheless, real solutions to this problem exist. Note that the impossibility result applies to fully asynchronous environments. Although real-world applications fit that model better, not all of its assumptions hold in practice (e.g. processors do maintain an internal clock). Making less stringent assumptions about the model allows the FLP impossibility result to be circumvented. For example, assuming an upper bound on message delivery, used as an unreliable failure detector, turns the communication channel into a partially synchronous one, a system model in which the consensus problem is solvable. (2)
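
A timeout-based failure detector of this kind can be sketched as follows. This is an illustrative sketch only: the class name `TimeoutFailureDetector`, its methods and the doubling rule are assumptions chosen for the example, not part of any particular consensus implementation. Time is passed in explicitly as integer ticks so that the behaviour is deterministic. The detector is unreliable in the sense that a slow node may be suspected wrongly; when that happens, the sketch withdraws the suspicion and becomes more patient with that node.

```python
class TimeoutFailureDetector:
    """Suspects a node when no heartbeat has arrived within its timeout."""

    def __init__(self, nodes, initial_timeout):
        self.timeout = {n: initial_timeout for n in nodes}  # per-node bound, in ticks
        self.last_seen = {n: 0 for n in nodes}              # tick of the last heartbeat
        self.suspected = set()

    def heartbeat(self, node, now):
        """Record a heartbeat from `node` received at tick `now`."""
        if node in self.suspected:
            # The suspicion was wrong: the node was slow, not crashed.
            # Withdraw it and double the timeout assumed for this node.
            self.suspected.discard(node)
            self.timeout[node] *= 2
        self.last_seen[node] = now

    def check(self, now):
        """Return the set of nodes suspected at tick `now`."""
        for node, seen in self.last_seen.items():
            if now - seen > self.timeout[node]:
                self.suspected.add(node)
        return set(self.suspected)
```

A node that really has crashed never sends another heartbeat and therefore stays suspected, while a node that was merely slow is trusted again as soon as its next heartbeat arrives.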

Algorithms that solve the consensus problem include Paxos, Raft, Viewstamped Replication and Zab.

¹ In a synchronous model there are known bounds on message delivery and process execution time. In an asynchronous model processors do not even maintain an internal clock, and hence no such bounds exist.

303 questions

3 answers

RAFT consensus protocol - Should entries be durable before commiting

I have the following query about implementation RAFT: Consider the following scenario\implementation: RAFT leader receives a command entry, it appends the entry to an in-memory array It then sends the entries to followers (with the heartbeat) The…

consensus raft

asked Apr 29 '14 at 07:31

coder_bro


0 answers

paxos when acceptors change its value

In paxos algorithm,there is a description in wiki: Phase 2a: Accept Request If a Proposer receives enough promises from a Quorum of Acceptors, it needs to set a value to its proposal. If any Acceptors had previously accepted any proposal, then…

algorithm distributed paxos consensus leader

asked Feb 19 '14 at 07:32

user1957040

2 answers

What is "value of the highest-numbered proposal" in the Paxos algorithm?

In Paxos made simple Lamport describes Phase 2 (a) of the algorithm as following: If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors…

algorithm distributed paxos consensus

asked Nov 07 '13 at 13:33

daniero

1 answer

How to calculate a minimum exchange time among multiple nodes?

This is about developing a distributed consensus. Let's assume we have N nodes ( N1, N2, ..., Nn ),each of the nodes has a different value( A1, A2, ..., An ).These nodes can communicate with each other, and replace their value if it's bigger then…

algorithm graph-theory distributed-computing theory consensus

asked Mar 15 '22 at 08:19

raindust

2 answers

Is it feasible to implement a Raft/PBFT like (leader election between authorized nodes) consensus algorithm in the substrate?

I'm new to the substrate. I'm trying to improve the availability of block generation of permissioned networks by substituting Aura with Raft/PBFT like consensus algorithm. I'm referring to an algorithm that solves the "who can generate blocks"…

substrate consensus polkadot

asked Nov 16 '21 at 17:51

Lavoris

1 answer

What if log replication out-of-order of etcd raft?

I'm the newbie in etcd and have some confusion points about log replication: For example, leader send out {term:2,index:3} and then {term:2,index:4}, the majority respond in order too. But due to network delay, leader receive the responses out of…

etcd consensus raft

asked Mar 30 '21 at 06:56

TAKCHI CHAN

1 answer

Paxos understanding

I have read the paper Paxos made simple. And after hard thinking, I come to this conclusion: The Paxos protocol always guarantees the majority of servers accept the same value at the same turn of the proposal and so that finally, we can derive the…

distributed-computing distributed distributed-system consensus paxos

asked Feb 18 '21 at 12:12

梁雨生

1 answer

What is the purpose of Chubby Sequencers

While reading article from google about chubby, I didn't really understand the purpose of sequencers Assume we have 4 entities : Chubby cell Client 1 Client 2 Service we want to use and where we will send the requests (for which we need the…

apache-zookeeper distributed-system consensus paxos

asked Nov 16 '19 at 13:32

newbie

2 answers

how to prove a consensus implementation like multipaxos is right?

I want to prove that my implementation of multi-paxos is right. Are there any valid examples for me to test on? Or there can be some other ways to convince others that my implementation is right. I tried to find some paper that contained the…

consensus paxos proof-of-correctness

asked Aug 12 '19 at 13:29

biscuittown

1 answer

How does Raft deals with delayed replies in AppendEntries RPC?

I came up with one question when reading the Raft paper. The scenario is followed. There are 3 newly started Raft instances, R1, R2, R3. R1 is elected as leader, with nextIndex {1, 1, 1}, matchIndex {0, 0, 0} and term 1. Now it receives command 1…

distributed distributed-system consensus raft

asked Jun 20 '19 at 01:00

pikatao

1 answer

Is Paxos Strongly Consistent?

Consider a distributed system with 3 nodes- n1, n2, n3. There is a shared data, x, among the nodes. Paxos is running on the nodes. In the beginning, x is equal to 4. A client sends an update request to n1 to change the value of x to 5. n1 and n2…

concurrency distributed-computing distributed-system consensus paxos

asked Jun 08 '19 at 17:30

H.H

2 answers

Commit Failure in Paxos

I am new to the Distributed System and Consensus Algorithm. I understand how it works but I am confused by some corner cases: when the acceptors received an ACCEPT for an instance but never heard back about what the final consensus or decision is,…

distributed-computing distributed distributed-system consensus paxos

asked Oct 11 '18 at 02:36

zzqtunaive

1 answer

Do we need PBFT algorithm support in permissioned Block chain networks?

I am new to BCT. My question is why do we need a consensus algorithm such as PBFT in a permission based Block chain network where the nodes are trusted nodes. Is it only to find a way when nodes fail or is there any other use case. Can anyone…

blockchain fault-tolerance consensus

asked Apr 03 '18 at 09:46

Satya Narayana

1 answer

How to use PBFT as consensus protocol in Hyperledger fabric 1.0?

How to use PBFT as consensus protocol in Hyperledger fabric 1.0? What are the configurations required while setting up hyperledger fabric blockchain so that it uses PBFT as consensus algorithm?

hyperledger-fabric hyperledger consensus

asked Nov 29 '17 at 07:08

pj2494

0 answers

How to create a distributed lock using redis cluster

background I realize a redis client(support cluster) and an issue is raised for the support of distributed lock made by redis cluster. I've read post of redlock algorithm and related debate problems Actually it is impossible to make one key hashed…

redis distributed-computing consensus

asked Sep 27 '17 at 03:06

Chen MIng
