1

I have a somewhat unusual configuration management problem. I don't think ZooKeeper is built to solve it, but I could be wrong.

The system will provision configuration to multiple devices in a network. The configuration itself consists of tens of millions of configuration objects.

If a device hasn't been provisioned yet, it needs to read the current version of the entire config (tens of millions of objects).

Once a device has been provisioned, it needs to receive versioned changes to the config. Changes occur on the order of hundreds per second, or low thousands per second.

With ZooKeeper's document-based model and its 1 MB response limit, it doesn't seem like the right fit for this. Am I wrong?

strtok

1 Answer

2

ZooKeeper still might not be a good fit for your application, but the 1 MB response limit is per operation. So, if each of your configuration objects is under 1 MB in size, you will not have problems reading or writing them.

For this scenario there are a couple of things to keep in mind. First, all data is stored in memory. It is made durable by logging to disk and reliable through replication, but if you run out of memory you are done. Second, there is a per-znode memory overhead (on the order of 100 bytes). If you have 10,000,000 objects and each object stores only 100 bytes, your memory footprint is on the order of 2 GB (10,000,000 objects at roughly 200 bytes each). Assuming you have a relatively well-provisioned server, this shouldn't be too much of a problem. Also keep in mind that it will take longer to recover from a crash, since the server has to read on the order of 2 GB of data during startup.
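The arithmetic behind that estimate can be sketched as follows. This is a back-of-the-envelope calculation, not a measurement: the function name is made up for illustration, and the 100-byte overhead is just the order-of-magnitude figure quoted above.

```python
def estimate_znode_memory(num_objects, payload_bytes, overhead_bytes=100):
    """Rough in-memory footprint in bytes.

    overhead_bytes is the assumed per-znode overhead (order of 100 bytes);
    the real figure depends on path length, ACLs, and the JVM.
    """
    return num_objects * (payload_bytes + overhead_bytes)


# 10,000,000 objects, 100 bytes of data each
total = estimate_znode_memory(10_000_000, 100)
print(f"{total / 1e9:.1f} GB")  # prints "2.0 GB"
```

Plugging in your real object count and average object size should tell you quickly whether the data set fits comfortably in the heap of the servers you have.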

You also have a related problem on the client side: if the clients really are going to read all the configuration objects, they are pulling a lot of data over the network! Assuming each object is under 1 MB you will not run into any limits, but the full read will take a while.
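To get a feel for how long a full read might take, here is a rough estimate. Everything in it is an assumption for illustration: the function name, the 1 Gbit/s link, the 1 ms round trip, and above all the model of one request per object issued strictly one after another. A client that keeps many requests in flight would be limited far less by latency.

```python
def full_read_seconds(num_objects, payload_bytes, link_bits_per_sec,
                      round_trip_sec=0.0):
    """Estimate the time for one client to read every object sequentially.

    Assumes one request per object with no pipelining, so each read pays
    a full round trip on top of the raw transfer time.
    """
    transfer = num_objects * payload_bytes * 8 / link_bits_per_sec
    latency = num_objects * round_trip_sec
    return transfer + latency


# Hypothetical numbers: 10,000,000 objects of 100 bytes over a 1 Gbit/s link
bandwidth_only = full_read_seconds(10_000_000, 100, 1e9)
with_latency = full_read_seconds(10_000_000, 100, 1e9, round_trip_sec=0.001)
print(f"bandwidth only: {bandwidth_only:.0f} s")
print(f"with 1 ms round trips: {with_latency / 3600:.1f} h")
```

Under these assumptions the raw payload is small compared with the cost of ten million sequential round trips, so the number of requests matters more than the number of bytes.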

Benjamin Reed