
Suppose I throw some machines into an elastic cluster and want to run some consensus algorithm on them (say, Paxos). Suppose they know the initial size of the network, say, 8 machines.

So, they'll run a consensus algorithm, and the quorum is 5.

Now, consider these cases:

  1. I see that CPU is too low, and I reduce the cluster size in half, to 4 machines.
  2. There is a partition split, and each split gets 4 machines.

If I use the current cluster size to compute quorums, I'm vulnerable to partition splits, since to the underlying cluster, situations (1) and (2) look exactly the same. However, if I use a fixed number, I can't scale the cluster down (and I'm subject to inconsistencies due to partitions if I scale it up).

There is a third alternative: informing all the machines of the new cluster size when scaling. But a partition could happen right before a scale-up, for instance, and the partitioned group, never receiving the new size, could still gather a quorum under the old size.
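To make the dilemma concrete, here is a minimal sketch (plain Python, hypothetical names, not from any real Paxos library) of the two quorum policies and how they behave in cases (1) and (2):

```python
def majority(n):
    """Smallest quorum size that guarantees any two quorums intersect."""
    return n // 2 + 1

# Policy A: quorum derived from the members a node can currently see.
# After either a scale-down to 4 or a partition into 4 + 4, a node sees 4 peers,
# so both halves of a partition could each form a "majority" of 3 -- split brain.
visible_members = 4
print(majority(visible_members))   # 3

# Policy B: quorum fixed at the original size of 8.
# Safe under partition (only one side can ever gather 5 votes), but after a
# genuine scale-down to 4 machines no quorum of 5 can ever be formed again.
original_size = 8
print(majority(original_size))     # 5
```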

Is Paxos (and any other safe consensus algorithms) unusable in an elastic environment?

Luís Guilherme
  • It's a bad idea to have an even number of acceptors split across two sites; a partition is highly likely. Better to designate only some nodes as acceptors and keep the other nodes as learner-only spares at each site. Ensure an odd number of acceptors at any one time, with a majority in your 'preferred' data centre. The system can self-reconfigure: if an acceptor goes offline for a long time, the system can automatically run Paxos rounds to change the cluster configuration, demoting the crashed acceptor to a learner and promoting one spare learner at the same site to be an acceptor. – simbo1905 Jan 21 '16 at 13:49

2 Answers


Quorum-based consensus protocols fundamentally require quorums in order to operate. Both Multi-Paxos and Raft can be used in environments with dynamically changing cluster and quorum sizes, but it must be done in a controlled manner that always maintains a consistent quorum. If, for example, you were currently using a cluster size of 8 and wanted to reduce it to 4, you could do so. However, the decision to reduce the cluster size to 4 must itself be a consensual decision agreed upon by the original 8.
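To illustrate, here is a rough sketch (hypothetical API and names, not any real Paxos or Raft library) of what such a controlled reconfiguration looks like: the new membership is committed through the old configuration's quorum before anyone starts using the new quorum size.

```python
OLD_MEMBERS = {"n1", "n2", "n3", "n4", "n5", "n6", "n7", "n8"}
NEW_MEMBERS = {"n1", "n2", "n3", "n4"}

def majority(members):
    return len(members) // 2 + 1

def propose_reconfiguration(cluster, old_members, new_members):
    # The config change is proposed through the normal consensus log
    # (cluster.replicate / cluster.install_config are hypothetical)...
    acks = cluster.replicate(
        {"type": "config-change", "members": sorted(new_members)},
        to=old_members,
    )
    # ...and only once a majority of the OLD membership (5 of 8 here) has
    # accepted it does the new quorum rule (3 of 4) come into force.
    if len(acks) >= majority(old_members):
        cluster.install_config(new_members)
        return True
    return False
```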

Your question is a little unclear, but it sounded like you were asking if you could safely reduce your cluster size to 4 as a recovery mechanism in the event that some kind of network partition renders your original cluster of 8 inoperable. The answer to that is effectively no, since the decision to do so could not be consensual, and attempting to go behind the back of the consensus algorithm is virtually guaranteed to result in inconsistencies. How would the new set of 4 be defined? How would you guarantee that all peers reached the same conclusion? How do you ensure they all make the same decision at the same time?

You could, of course, make all of these decisions manually and force the system to recover by shutting the consensus service down on each node and reconfiguring its quorum definition by hand. Assuming you don't screw up (which is an overwhelmingly big assumption for any real-world deployment), this would be safe. A better approach, though, would be to design the system so that one or two network partitions won't halt it (lots of sites), or to use an eventual-consistency model that gracefully handles the occasional network partition. There's no magic bullet for getting around CAP restrictions.

Rakis

Paxos and friends can scale in an elastic way (somewhat). Instead of changing the quorum size, though, just add learners. Learners are nodes that don't participate in consensus, but get all the decisions. Just like acceptors, learners accept reads and forward writes to the leader.

There are two styles of learner. The first listens to all events from the acceptors; in the second, the leader forwards all committed transitions to the learners.
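As a rough sketch (hypothetical class and handler names, not from any real implementation), the two styles might look like this:

```python
class AcceptorFedLearner:
    """Style 1: listens to 'accepted' messages from every acceptor and treats
    a value as chosen once a majority of acceptors have accepted it."""
    def __init__(self, num_acceptors):
        self.num_acceptors = num_acceptors
        self.accepted = {}   # slot -> {acceptor_id: value}
        self.learned = {}    # slot -> chosen value

    def on_accepted(self, slot, acceptor_id, value):
        votes = self.accepted.setdefault(slot, {})
        votes[acceptor_id] = value
        if sum(1 for v in votes.values() if v == value) > self.num_acceptors // 2:
            self.learned[slot] = value


class LeaderFedLearner:
    """Style 2: the leader forwards each committed decision, so the learner
    simply records it."""
    def __init__(self):
        self.learned = {}    # slot -> chosen value

    def on_commit(self, slot, value):
        self.learned[slot] = value
```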

Michael Deardeuff