
I am aware that my ZooKeeper ensemble will keep working as long as there is a quorum. But does the missing server have any notable impact on cluster performance?
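For reference, the quorum arithmetic behind that statement can be sketched as follows. This is a minimal illustration that assumes the default majority quorums; the function names are made up for this example and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble (assumes majority quorums)."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


def has_quorum(ensemble_size: int, live_servers: int) -> bool:
    """True if the live servers still form a majority."""
    return live_servers >= quorum_size(ensemble_size)


# Three servers need two for a quorum, so one crash is survivable,
# but a second crash before the replacement joins would stop the ensemble.
print(quorum_size(3), tolerated_failures(3))
print(has_quorum(3, 2), has_quorum(3, 1))
```

With one of three servers gone, the ensemble still has a quorum but no remaining fault tolerance until the third server is back.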

Let's suppose 1 of 3 servers crashes and its hard disk is destroyed. I guess I can join a new (clean) server without downtime, as long as it has the same server ID as the old one and the other two servers can connect to it (i.e., it has the same IP / hostname as the old one in their configs)?

What is the impact of the third server "resyncing" (i.e., will it affect the speed at which consensus on new writes is reached)? How long does that operation usually take, relative to the amount of data in ZooKeeper?

Can (or should) I just copy data and datalog from one of the existing servers? Snapshots are probably safe to be copied as-is, but transaction logs might need a "point-in-time copy" (I have btrfs CoW, so this is no problem)?

Or, to be more specific, I also wonder whether the data on all nodes is equivalent (besides the latest writes) and interchangeable. Or is server-ID-specific data stored somewhere inside?

Dave M
fiction
  • What kind of data do you plan to keep in ZooKeeper? And how much data are we talking about? – Jakov Sosic Feb 28 '16 at 14:37
  • I am aware of the 1 MB limit per node. It is basically for storing configuration, but in some parts of the system it has been abused a bit (and there are also a few blobs inside). We might also use it for master election. The latest snapshot is 17 MB. – fiction Feb 28 '16 at 15:59
  • Note that besides the practical part of my question, I am also interested in the theory. Perhaps I should read the whole [ZAB paper](http://web.stanford.edu/class/cs347/reading/zab.pdf), but I was hoping for a less academic explanation of what happens when an amnesic follower arrives. Will it just download all data from the leader? Is that replication asynchronous? – fiction Feb 28 '16 at 16:11

1 Answer


Rejoining with the same server ID but no data will break the quorum. You first need to remove the old server ID from all remaining servers, and then add the new server under a new ID.

skyde
  • Do you have a reference to the docs to confirm this? I have never heard of this limitation, but correct me if I am wrong. – SynthC Jan 02 '17 at 14:44
  • This is not explicitly written in the ZooKeeper documentation. But the ZAB algorithm would break if a node rejoined with amnesia, because it would be able to vote for a proposal that it should not be allowed to vote for. – skyde Jan 04 '17 at 00:02
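
The concern raised in the last comment can be illustrated with a toy quorum-intersection check. This is only a sketch of the general argument, under loud assumptions: three hypothetical servers `A`, `B`, `C`, majority quorums, and a "log" modelled as a plain set of transaction ids. It does not model ZAB's epochs, leader election, or synchronization phase, so it shows why amnesia is dangerous in principle, not what a real ZooKeeper ensemble would do.

```python
from itertools import combinations

# Hypothetical three-server ensemble with majority quorums.
SERVERS = ("A", "B", "C")
QUORUM = len(SERVERS) // 2 + 1


def quorums(servers):
    """All subsets of the servers that contain at least a majority."""
    return [
        set(members)
        for size in range(QUORUM, len(servers) + 1)
        for members in combinations(servers, size)
    ]


def quorums_missing_txn(logs, txn):
    """Quorums in which no member has the transaction in its log."""
    return [q for q in quorums(SERVERS) if all(txn not in logs[s] for s in q)]


# Transaction 1 was acknowledged by the quorum {A, B}; C had not received it yet.
logs = {"A": {1}, "B": {1}, "C": set()}
before = quorums_missing_txn(logs, 1)

# B loses its disk and rejoins under the same id with empty state.
logs["B"] = set()
after = quorums_missing_txn(logs, 1)

print("quorums unaware of txn 1 before amnesia:", before)
print("quorums unaware of txn 1 after amnesia:", after)
```

Before the data loss, every possible quorum contains at least one server that remembers the acknowledged transaction. Afterwards, `B` and `C` together form a majority in which nobody remembers it, which is the situation the comment warns about.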