Questions tagged [consensus]

Consensus is the problem of reaching agreement among members of a group. In computing terms, it is agreement on a value that the nodes participating in a cluster need during a computation. Reaching consensus in a distributed environment is challenging and, under certain conditions, not even possible. Consensus algorithms are often used to replicate a state machine, a general approach for enhancing fault tolerance.

In the context of distributed computing, consensus is a fundamental problem that has received intense research attention, because problems like atomic compare-and-swap (CAS) registers or atomic transaction commit can be reduced to it. Furthermore, consensus provides a foundation for applications like leader election, clock synchronisation and state machine replication. Although the problem can be explained in simple words, solving it is far more subtle and under certain circumstances not even possible.

In general, consensus is the problem of reaching agreement among members of a group. In computing terms, it is agreement on a value that the nodes participating in a cluster need during a computation. This may not sound complicated, but it becomes so because in a distributed system different types of unpredictable partial failures like:

  • loss of packets
  • duplication/reordering of packets
  • arbitrarily delayed packet delivery
  • pause or even crash of a node

may happen and will happen, which makes the problem harder to solve. Therefore a consensus algorithm needs to satisfy certain properties:

  • Agreement: All decisions by non-faulty nodes must be the same.
  • Integrity: No node decides twice.
  • Validity: If a node decides on value v, then v was proposed by some node.
  • Termination: Every non-faulty node needs to eventually decide on a value.
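
The four properties can be read as checks over a recorded run. The following is a minimal sketch in Python, assuming a hypothetical trace format that is not taken from any real library: `proposals` maps each node to the value it proposed, `decisions` is the list of `(node, value)` decide events, and `faulty` is the set of nodes that failed. Validity is checked in its common form (every decided value was proposed), and termination can only be checked here for a run that has already finished.

```python
def check_run(proposals, decisions, faulty):
    """Report which consensus properties hold for one finished run.

    proposals: dict node -> proposed value (hypothetical trace format)
    decisions: list of (node, value) decide events
    faulty:    set of nodes that failed during the run
    """
    correct = set(proposals) - set(faulty)

    # Group the decide events by node.
    decided_by = {}
    for node, value in decisions:
        decided_by.setdefault(node, []).append(value)

    # Agreement: all values decided by non-faulty nodes are the same.
    correct_values = {
        v for node, values in decided_by.items() if node in correct for v in values
    }
    agreement = len(correct_values) <= 1

    # Integrity: no node decides more than once.
    integrity = all(len(values) == 1 for values in decided_by.values())

    # Validity: every decided value was proposed by some node.
    proposed = set(proposals.values())
    validity = all(value in proposed for _, value in decisions)

    # Termination: every non-faulty node has decided by the end of the run.
    termination = correct <= set(decided_by)

    return {
        "agreement": agreement,
        "integrity": integrity,
        "validity": validity,
        "termination": termination,
    }
```

A checker like this is only useful for testing recorded executions; it says nothing about runs that were not observed.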

Termination is a liveness property and requires the algorithm to be fault-tolerant. Without this property, an algorithm could simply use a predefined leader that decides on values (as in two-phase commit, 2PC). But if that leader failed, the algorithm could no longer make progress, hence the termination requirement. Agreement and integrity are safety properties and form the core of a consensus algorithm. Validity rules out the trivial solution in which nodes always decide on the same predetermined value and thus reach consensus by definition.
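
The predefined-leader scheme mentioned above can be sketched as a toy, in-memory model. All names here (`fixed_leader_round`, `crashed`) are hypothetical and there is no networking. Agreement and integrity hold trivially, but once the leader crashes nobody is allowed to decide, so termination is violated.

```python
def fixed_leader_round(proposals, leader, crashed=frozenset()):
    """Toy model of a predefined leader that decides for everyone.

    proposals: dict node -> proposed value
    leader:    the single node allowed to pick the value
    crashed:   nodes that have failed before the round starts

    Returns {node: decided value, or None if the node never decides}
    for every node that has not crashed.
    """
    alive = [node for node in proposals if node not in crashed]
    if leader in crashed:
        # No other node may choose a value, so all live nodes wait forever.
        return {node: None for node in alive}
    value = proposals[leader]
    # Every live node adopts the leader's value exactly once.
    return {node: value for node in alive}
```

For example, `fixed_leader_round({"a": 1, "b": 2, "c": 3}, leader="a")` lets every node decide the leader's value, while passing `crashed={"a"}` leaves `b` and `c` undecided.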

Whether a solution, that is an actual algorithm, exists for the consensus problem depends on the system model. One therefore has to differentiate between a synchronous and an asynchronous system model¹. Because sending messages usually relies on a shared communication channel, at least the communication is asynchronous, so real-world applications fit that model better. Unfortunately, for such an environment there is an impossibility proof by Fischer et al. (1), known as the FLP impossibility result, which shows that in a fully asynchronous model no algorithm exists that always reaches consensus, even if just one node is faulty (not even considering Byzantine failures). By proving that every possible algorithm has an execution that never terminates, they showed that no algorithm can always reach agreement.

Nevertheless, real solutions to this problem exist. Note that the impossibility result applies to fully asynchronous environments. Although real-world applications fit that model better, not all of its assumptions hold in practice (e.g. processors do maintain an internal clock). Making less stringent assumptions about the model allows circumventing the FLP result. For example, assuming an upper bound on message delivery time and using it as an unreliable failure detector can turn the communication channel into a partially synchronous one, a system model in which the consensus problem is solvable. (2)
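
As an illustration of an unreliable failure detector built from a timeout, here is a minimal sketch. The class `TimeoutFailureDetector` and its methods are hypothetical, and time is passed in explicitly so that the example stays deterministic. A node is suspected when no heartbeat arrived within the assumed bound; the suspicion may be wrong (the node was only slow), in which case the node is trusted again and its timeout is doubled, which is one common refinement rather than the only option.

```python
class TimeoutFailureDetector:
    """Suspects nodes whose last heartbeat is older than a per-node timeout."""

    def __init__(self, nodes, timeout):
        self.timeout = {node: timeout for node in nodes}
        self.last_seen = {node: 0.0 for node in nodes}
        self.suspected = set()

    def heartbeat(self, node, now):
        """Record a heartbeat from `node` received at time `now`."""
        self.last_seen[node] = now
        if node in self.suspected:
            # False suspicion: the node was slow, not crashed.
            # Trust it again and be more patient with it from now on.
            self.suspected.discard(node)
            self.timeout[node] *= 2

    def check(self, now):
        """Return the set of nodes suspected at time `now`."""
        for node, seen in self.last_seen.items():
            if now - seen > self.timeout[node]:
                self.suspected.add(node)
        return set(self.suspected)
```

A consensus algorithm would typically use such suspicions only to trigger a leader change, never to decide a value, so that a wrong suspicion costs time but not safety.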

Algorithms that solve the consensus problem include Paxos, Raft, Viewstamped Replication and Zab.


¹ In synchronous models there are known bounds for message delivery and process execution. In an asynchronous model, processors do not even maintain an internal clock, and hence no such bounds exist.

303 questions
3 votes, 3 answers

What is lastApplied and matchIndex in raft protocol for volatile state in server?

I am using the following pdf as reference. It says that lastApplied is the highest log entry applied to state machine, but how is that any different than the commitIndex? Also is the matchIndex on leader just the commitIndex on followers? If not…
Jal
3 votes, 1 answer

Why do Docker overlay networks require consensus?

Just been reading up on Docker overlay networks, very cool stuff. I just can't seem to find an answer to one thing. According to the docs: If you install and use Docker Swarm, you get overlay networks across your manager/worker hosts automagically,…
smeeb
3 votes, 3 answers

What does each definition in configtx.yaml means in Hyperledger fabric v1.0?

This is related to Hyperledger fabric v1.0 network topology. From the example, configtx.yaml contains following definitions: Profiles: TwoOrgsOrdererGenesis: Orderer: <<: *OrdererDefaults Organizations: …
3 votes, 2 answers

Raft nodes count

The Raft leader node sends AppendEntries RPCs to all followers. Obviously network usage increases when we add a new follower, so my question is how many nodes we can add to a cluster. In the Raft paper and in other places I read that 5 nodes in a cluster…
gigovich
3 votes, 1 answer

Can the Hyperledger Fabric Consensus Service be distributed?

I've read the fabric proposal on consensus architecture with interest, and I have a question about the consensus service. It seems to me that this is effectively a single service that guarantees all peers receive blocks in an order it decides. As…
Rich N
3 votes, 2 answers

R NMF package: How to extract sample classifications?

In the NMF R-package one can use consensusmap() to visualise outputs. The plots show which samples belong to which clusters in the "consensus" track. I would like to extract this sample classification such that I get a data frame like this: Sample …
Esben Eickhardt
3 votes, 1 answer

Raft: How to solve the performance bottleneck of leader node?

In Raft, all operation requests are forwarded to the leader node, and the leader then sends log entries to all followers. So in a heavily loaded environment, the leader node becomes a bottleneck. How can this be solved?
huron
3 votes, 1 answer

Counting differences from the consensus in each row via Pandas

I have a DataFrame that looks like this: import pandas as pd df = pd.DataFrame({'A':['a','b','c','d'],'B':['a','b','c','x'],'C':['y','b','c','d']}) df A B C 0 a a y 1 b b b 2 c c c 3 d x d I want to identify the most common…
iayork
3 votes, 2 answers

What will happen to replicated but uncommited logs in raft protocol

Suppose a 3-member Raft cluster a[master],b,c. The client sends the log to a, a replicates it to b and c, a applies the log to the state machine and responds to the client. Then a crashes before b and c have a chance to replicate the committed state to b…
bigwesthorse
3 votes, 2 answers

Paxos and Discovery

Suppose I throw some machines in an elastic cluster and want to run some consensus algorithm on them (say, Paxos). Suppose they know the initial size of the network, say, 8 machines. So, they'll run a consensus algorithm, and the quorum is 5. Now,…
Luís Guilherme
3 votes, 1 answer

Implementing a consensus protocol using a FIFO queue and peek() method

I need to implement a consensus protocol that makes use of a queue with a peek() method in order to show that consensus can be reached for any number of threads, i.e. the queue with a peek() method has an infinite consensus number. This is my…
Keaton Pennells
3 votes, 2 answers

Cassandra's lightweight transactions & Paxos consensus algorithm

I have a very particular question regarding the Paxos algorithm, which is implemented in Cassandra's lightweight transactions: What happens if two nodes issue the same proposal at the same time? Do they both get '[applied]: true'? For example, consider…
AlonL
2 votes, 1 answer

Will split the block into chunks with erasure coding increasing the network throughput performance

Assume we have a network with n nodes and an elected coordinator that sends commands to the nodes. Let's further assume that the coordinator has horrible bandwidth (upload speed) and wants to send a large 10 GB file to the nodes in o(n)…
2 votes, 2 answers

Consensus algorithm check list

I wrote a new consensus algorithm. Is there a self-evaluation checklist I can run to see if it meets the basic requirements? Like, is it resistant to double-spend attacks? Or how does it scale?
Ilya Gazman
2 votes, 1 answer

How does Raft guarantee that a leader can always be elected?

The Raft paper says: Raft uses the voting process to prevent a candidate from winning an election unless its log contains all committed entries. A candidate must contact a majority of the cluster in order to be elected, which means that every…