I am trying to understand how globally unique schema IDs are generated in Schema Registry, but I do not understand the following text from this page.

Schema ID allocation always happens in the master node, which ensures that the Schema IDs are monotonically increasing.

If you are using Kafka master election, the Schema ID is always based off the last ID that was written to Kafka store. During a master re-election, batch allocation happens only after the new master has caught up with all the records in the store.
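
To check my reading of the Kafka case, here is a minimal sketch of how I understand it. It is not Schema Registry code: `next_schema_id` and the record layout are hypothetical, and I am assuming IDs start at 1. The new master replays every record in the store and bases the next ID on the highest one it saw.

```python
def next_schema_id(log_records):
    """Replay all records in the store and return the ID that follows the highest one seen.

    `log_records` is a hypothetical stand-in for the Kafka store: a list of dicts with an "id" key.
    """
    last_id = 0  # assumption: no schema has been registered yet, so the first ID is 1
    for record in log_records:
        last_id = max(last_id, record["id"])
    return last_id + 1


# A new master that has caught up with three registered schemas
store = [{"id": 1}, {"id": 2}, {"id": 3}]
print(next_schema_id(store))  # 4
```

If this is right, nothing outside the store is needed, because the store itself records the last ID that was handed out.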

If you are using ZooKeeper master election, {schema.registry.zk.namespace}/schema_id_counter path stores the upper bound on the current ID batch, and new batch allocation is triggered by both master election and exhaustion of the current batch. This batch allocation helps guard against potential zombie-master scenarios, (for example, if the previous master had a GC pause that lasted longer than the ZooKeeper timeout, triggering master reelection).

Question:

  • With ZooKeeper master election, why does the upper bound of the current ID batch need to be stored in ZooKeeper, when Kafka master election needs no such counter?
  • Can someone explain in detail how batch allocation works with ZooKeeper election? Specifically, I don't understand the following:

new batch allocation is triggered by both master election and exhaustion of the current batch. This batch allocation helps guard against potential zombie-master scenarios, (for example, if the previous master had a GC pause that lasted longer than the ZooKeeper timeout, triggering master reelection).
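
My current guess at what this means, as a sketch only. `FakeZkCounter` is an in-memory stand-in for the `schema_id_counter` path rather than a real ZooKeeper client, and `Master`, `BATCH_SIZE = 20`, and the method names are assumptions I made up for illustration.

```python
class FakeZkCounter:
    """In-memory stand-in for the schema_id_counter znode (hypothetical)."""

    def __init__(self):
        self.value = 0    # upper bound of the most recently allocated batch
        self.version = 0  # bumped on every successful write

    def read(self):
        return self.value, self.version

    def compare_and_set(self, new_value, expected_version):
        # Succeeds only if nobody else has written since we read.
        if expected_version != self.version:
            return False
        self.value = new_value
        self.version += 1
        return True


class Master:
    BATCH_SIZE = 20  # assumed batch size, for illustration only

    def __init__(self, counter):
        self.counter = counter
        self.next_id = None
        self.upper_bound = None
        self.allocate_batch()  # trigger 1: this node has just been elected

    def allocate_batch(self):
        # Reserve the range (upper, upper + BATCH_SIZE] by advancing the shared counter.
        while True:
            upper, version = self.counter.read()
            new_upper = upper + self.BATCH_SIZE
            if self.counter.compare_and_set(new_upper, version):
                self.next_id = upper + 1
                self.upper_bound = new_upper
                return

    def new_schema_id(self):
        if self.next_id > self.upper_bound:
            self.allocate_batch()  # trigger 2: current batch is exhausted
        schema_id = self.next_id
        self.next_id += 1
        return schema_id


counter = FakeZkCounter()
old_master = Master(counter)                               # reserves 1..20
ids_old = [old_master.new_schema_id() for _ in range(3)]   # [1, 2, 3]

# old_master stalls (e.g. a long GC pause) and a new master is elected
new_master = Master(counter)                               # reserves 21..40
ids_new = [new_master.new_schema_id() for _ in range(3)]   # [21, 22, 23]

# the zombie wakes up and still believes it is master
zombie_id = old_master.new_schema_id()                     # 4, inside its own old batch
```

In this sketch the zombie can only hand out IDs from the batch it reserved before the pause, so its IDs never collide with the new master's batch. Is that the guarantee the documentation is describing?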

OneCricketeer
Viraj
  • ZooKeeper just acts like a KV store here and holds a monotonically increasing number. Where else would you store it so that multiple machines can access it? – OneCricketeer Sep 28 '18 at 00:01
  • With Kafka-based master election, ZooKeeper is not used, so storing the counter there doesn't look like a hard requirement. If I understand it right, with Kafka-based master election the new master replays the write-ahead log and gets the most recent schema ID. This works because Schema Registry uses a single-master architecture. – Viraj Sep 28 '18 at 04:54
  • But you're not asking about Kafka election, which was only added in the 4.x release – OneCricketeer Sep 28 '18 at 14:43
  • Okay. Makes sense then. – Viraj Sep 28 '18 at 16:30