
I am trying to understand exactly what types of problems Apache ZooKeeper ("ZK") solves, and perhaps their Recipes page is the best place to start.

First off, I am making the following assumptions:

  • The ZooKeeper API (available in both Java and C) exposes these 7 simple methods which then allow you to build up your own usage patterns, known as "ZK Recipes"
  • It is then up to you to use these ZK Recipes to solve problems in distributed programming yourself
  • Or, instead of building up your own ZK Recipes, you could just use the ones that ship with Apache Curator
  • So either way, you're using ZK Recipes (again, homegrown or provided by Curator) to solve distributed computing problems

I believe Apache Kafka is an example of this, where Kafka uses ZK to create a distributed Queue (which is one of the listed ZK Recipes). So if my assumptions are correct, ZK exposes those API methods, and the creators of Apache Kafka either used ZK directly or used Curator to implement the "Queue" ZK Recipe.

If any of my above assumptions are wrong, please begin by correcting me! Assuming I'm more or less on track:

Looking at the list of ZK Recipes, I see the following (non-exhaustive):

  • Barriers
  • Locks
  • Leader Election

In order for me to appreciate these recipes and the solutions they present, I first need to appreciate the problem that they solve! I understand what a lock is from basic Java concurrency, but I'm just not seeing the use case for when a "distributed Lock" would ever be necessary. For leader election, all I can think of - as a use case for needing it in the first place - would be if you were building an application that you wanted to ship with a built-in master/slave or primary/secondary capability. Perhaps in that case you would use ZK to implement your own "Leader Election" recipe, or perhaps just use Curator's Leader Latch out of the box. As for Barriers, I don't see how those are any different from Locks. So I ask:

  • Is my master/slave or primary/secondary problem an accurate use case for ZK's Leader Election recipe?
  • What would be an example of a distributed Lock? What problem(s) does it solve?
  • Ditto for Barriers: and what's the difference between Locks and Barriers?
Martin Serrano
DirtyMikeAndTheBoys

1 Answer

  1. Yes. Your master/slave use case for ZK's Leader Election recipe is a correct one. In general, if a recipe already exists, why rewrite it?

Quoting Zookeeper documentation:

ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.

  2. Regarding distributed locks: let's say you have a distributed system where all configuration is saved in ZooKeeper, and more than one entity is responsible for updating a certain configuration. In such a case you would want the configuration updates to be synchronized, so that only one entity updates it at a time.

  3. Regarding barriers, I have personally never used them. With a lock, you need to acquire the lock to actually do something on the node; with a barrier, you wait until it is free, but you do not necessarily need to set the barrier once it is free.
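
Since the question mentions basic Java concurrency, here is an in-process analogy of the difference. This is only a sketch using plain `java.util.concurrent`, not ZooKeeper or Curator: threads stand in for the separate machines, and the class and method names (`LockVsBarrierDemo`, `updateUnderLock`, `arrivalsSeenAfterBarrier`) are made up for illustration. The distributed recipes aim for the same semantics across processes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: threads play the role of distributed nodes.
final class LockVsBarrierDemo {

    /** Lock: many workers update one shared value, but only one at a time. */
    static int updateUnderLock(int workers, int updatesPerWorker) {
        ReentrantLock lock = new ReentrantLock();
        int[] sharedValue = {0}; // stands in for a shared configuration value
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            threads.add(new Thread(() -> {
                for (int j = 0; j < updatesPerWorker; j++) {
                    lock.lock(); // wait until nobody else holds the lock
                    try {
                        sharedValue[0]++; // read-modify-write, safe only under the lock
                    } finally {
                        lock.unlock();
                    }
                }
            }));
        }
        runAll(threads);
        return sharedValue[0];
    }

    /** Barrier: nobody proceeds to phase 2 until every worker has finished phase 1. */
    static int arrivalsSeenAfterBarrier(int workers) {
        CyclicBarrier barrier = new CyclicBarrier(workers);
        AtomicInteger arrived = new AtomicInteger();
        AtomicInteger minSeen = new AtomicInteger(Integer.MAX_VALUE);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            threads.add(new Thread(() -> {
                arrived.incrementAndGet(); // phase 1 done
                try {
                    barrier.await(); // block until all workers have arrived
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
                // Phase 2: record the smallest arrival count any worker observed.
                minSeen.accumulateAndGet(arrived.get(), Math::min);
            }));
        }
        runAll(threads);
        return minSeen.get();
    }

    /** Starts every thread and waits for all of them to finish. */
    private static void runAll(List<Thread> threads) {
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("value after locked updates: " + updateUnderLock(4, 1000));
        System.out.println("min arrivals seen after barrier: " + arrivalsSeenAfterBarrier(4));
    }
}
```

With the lock, no update is lost because the workers take turns. With the barrier, nobody takes turns at all: every worker simply waits until the whole group has arrived, so each one sees all arrivals before continuing.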

Uri Shalit
  • Thank you - when you say "*Let's say you have a distributed system where all configurations are saved on ZooKeeper...*", I'm not understanding what you mean by "*configuration*". Do you mean "configuration files"? For instance, if `myapp.war` usually takes a `myapp.conf` config file at runtime, are you saying that ZK can be used to store `myapp.conf` for all nodes running `myapp.war`? Or do you mean something else? – DirtyMikeAndTheBoys Jun 17 '15 at 17:31
  • Instead of using myapp.conf you can use ZK, or alternatively, when your app starts, save all myapp.conf configurations in ZK, and then they are available to your entire system. This way you can also update them at run time. If, for instance, a user of your app can change the configuration of the system, it can be saved in ZK, which is persistent, instead of only in memory. – Uri Shalit Jun 17 '15 at 17:34
  • Thanks again @Uri Shalit (+1 again) - I'm sorry but I have to ask one more followup here: how exactly would I save `myapp.conf` in ZK? Does ZK have a database that it uses (if so, what is the DB)? What API methods would I call to save/update the `myapp.conf` configurations? For instance, say one config is `appPort` which in `myapp.conf` is stored as `appPort = 9200`. What's an example API call I could make to update `appPort` to be, say, 9300? Thanks again! – DirtyMikeAndTheBoys Jun 17 '15 at 17:46
  • If you are using `Curator`, you initialize a Curator framework object and then, for instance, create the path with 9200: `curator.getCuratorFramework().create().creatingParentsIfNeeded().forPath("/appPort", Integer.toString(9200).getBytes());` and then, when updating it: `curator.getCuratorFramework().setData().forPath("/appPort", Integer.toString(9300).getBytes());` – Uri Shalit Jun 17 '15 at 17:51
  • That is very, very cool (green check). I'm wondering: is ZK a suitable replacement for static config management tools like Chef/Puppet/Ansible? It feels like this capability bypasses the need for such tools. – DirtyMikeAndTheBoys Jun 17 '15 at 18:01
  • To be honest, I am not familiar with these tools – Uri Shalit Jun 17 '15 at 19:50