10

The Paxos algorithm can tolerate up to F failures when using 2F + 1 processors. As far as I understand, the algorithm works only with a fixed number of processors. Is it possible to use it in a dynamic environment, where nodes can be added and removed dynamically?

Evgeny Lazin

4 Answers

5

Yes, it is possible, and there are even some papers on it. From what I remember, one way to do it is described here: http://research.microsoft.com/pubs/64634/web-dsn-submission.pdf Look for "dynamic paxos". Hope that's what you were asking about.

Mateusz Dymczyk
  • Dynamic Paxos is kinda scary :) – Evgeny Lazin Aug 24 '11 at 05:28
  • @Lazin actually not! I use dynamic Paxos and it has been extremely stable. The idea is that you have two state machines. The first is what you typically think about: the state Paxos is to keep in sync. The other state machine is the membership list of the nodes. Any instance of Paxos must use a snapshot of the membership state machine. – Michael Deardeuff Apr 14 '12 at 06:58
  • @MichaelDeardeuff Dynamic Paxos becomes a bit scary when you introduce Multi-paxos. It's fairly trivial when you use it with single instance Paxos. – Jon Bringhurst Jun 08 '13 at 21:00
  • @JonBringhurst Again, I disagree: state machines take all the scariness away. The configuration state machine moves in tandem with the application state machine. The configuration state tells you exactly which acceptors are in the next Paxos instance. It doesn't matter if it is basic Paxos, multi-Paxos, or fast-cheap-multi-Paxos. – Michael Deardeuff Jun 09 '13 at 07:19
  • I am struggling with the alpha in dynamic Paxos. It seems to be some arbitrary distance in instance numbers after which you know that every node has learnt the new cluster state. Can someone elaborate on how it is determined or how it works in practice? – simbo1905 Mar 19 '15 at 21:10
  • Aha, I think I see it now. It's basically just some delay until the instance which changes the cluster membership is chosen. That isn't fixed in advance; it's just whenever it happens. Once it is chosen, it's fixed, and any new leader after a crash will recover that value. So it's actually very simple, as @Michael was saying. – simbo1905 Mar 19 '15 at 21:17
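
The two-state-machine idea from the comments above can be sketched roughly as follows. This is only an illustration, assuming the common reading of alpha as a fixed window: a membership change chosen at instance i takes effect for instance i + alpha and later. The names `MembershipLog`, `record_chosen`, `members_for`, and the value `ALPHA = 3` are hypothetical and not taken from any of the linked papers or from a real library.

```python
from dataclasses import dataclass, field

ALPHA = 3  # assumed window size; a real system would choose its own value


@dataclass
class MembershipLog:
    """Hypothetical membership state machine kept alongside the app state."""
    initial: frozenset
    alpha: int = ALPHA
    # instance number -> ("add" | "remove", node), filled in as values are chosen
    changes: dict = field(default_factory=dict)

    def record_chosen(self, instance, op, node):
        """Record a membership command that Paxos chose at `instance`."""
        self.changes[instance] = (op, node)

    def members_for(self, instance):
        """Snapshot of acceptors for `instance`: only changes chosen at
        least `alpha` instances earlier are applied."""
        members = set(self.initial)
        for i in sorted(self.changes):
            if i + self.alpha <= instance:
                op, node = self.changes[i]
                if op == "add":
                    members.add(node)
                else:
                    members.discard(node)
        return frozenset(members)

    def quorum_size(self, instance):
        """Majority of the snapshot used by `instance`."""
        return len(self.members_for(instance)) // 2 + 1


log = MembershipLog(frozenset({"a", "b", "c"}))
log.record_chosen(5, "add", "d")   # "add d" was chosen at instance 5
print(sorted(log.members_for(7)))  # still the old configuration
print(sorted(log.members_for(8)))  # new configuration from instance 5 + ALPHA
```

Every instance reads its acceptor set from a snapshot that is already decided, so a new leader recovering after a crash derives the same configuration from the same chosen values.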
3

The Stoppable Paxos paper is a bit easier to understand and permits safe reconfiguration (addition and subtraction of nodes): http://research.microsoft.com/apps/pubs/default.aspx?id=101826

Robert Newson
1

If you have an absolute maximum number of nodes, then it should still work. But you can end up in a situation where your dynamic node count is 6 and your maximum is 11, so if 1 node fails you're out of luck (the non-existent nodes count as failed by default). If you're removing and adding nodes, you could restore the state of a node you removed onto a node you add, so that it isn't counted as a failure.
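
The arithmetic behind that example can be checked with a small sketch. It assumes a simple majority quorum computed over the fixed maximum, with nodes that do not exist counted as failed; the function names are hypothetical.

```python
def quorum_size(max_nodes: int) -> int:
    """Majority quorum over the fixed maximum cluster size."""
    return max_nodes // 2 + 1


def tolerable_failures(max_nodes: int, live_nodes: int) -> int:
    """How many more live nodes may fail before a quorum is unreachable.

    A negative result means a quorum is already out of reach.
    """
    return live_nodes - quorum_size(max_nodes)


print(quorum_size(11))            # 6
print(tolerable_failures(11, 6))  # 0: losing one more node blocks progress
print(tolerable_failures(11, 11)) # 5: the usual F for 2F + 1 = 11
```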

Louis Ricci
-1

Yes. Gryadka is a JavaScript Paxos implementation supporting dynamic reconfiguration in 500 lines. It is based on ideas from Vertical Paxos and Raft.

rystsov
  • The belief that we need to extend Paxos for cluster membership is on very shaky ground. The Microsoft paper version of Dynamic Paxos, as discussed in @Mateusz's answer, is sufficient. All practical implementations of Paxos I am aware of update distributed state in a consistent manner. That shared state can trivially be both the app state and the cluster membership. So rather than needing to read, comprehend and implement a new approach, all practical implementations can use "mainstream Paxos" and "eat their own dog food" to handle cluster membership changes. Paxos was designed for this. – simbo1905 Jul 29 '15 at 07:31