
If we are using the Schema Registry in Kafka, is it required for every producer to send the current version of Kafka every time it sends a record to the broker?

If yes, what is the point of this extra overhead, given that we were already sending the schema in every Avro file?

And if no, please forgive the silliness of my question and help me understand the Schema Registry better.


1 Answer


is it required for every producer to send the current version of Kafka every time it sends a record to the broker

Assuming you mean the version of the Avro schema, then no, the serializer and the registry handle that behind the scenes. The schema itself is converted to JSON and posted to the registry, where it is hashed and stored, and an incremental ID is returned.

After the serializer gets this ID, it prepends it to the Avro-encoded bytes of the message, and that byte array is what gets sent to Kafka.
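
For context, a minimal producer sketch using Confluent's Avro serializer might look like this; the broker and registry addresses, topic name, and the User schema are placeholders, not anything from your setup:

```java
import java.util.Properties;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class AvroProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");           // placeholder broker
        props.put("key.serializer", StringSerializer.class.getName());
        // KafkaAvroSerializer registers the schema (if unseen), caches the returned ID,
        // and prepends that ID to every serialized value
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");  // placeholder registry

        // Hypothetical value schema and record, just to have something to send
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
          + "[{\"name\":\"name\",\"type\":\"string\"}]}");
        GenericRecord value = new GenericData.Record(schema);
        value.put("name", "alice");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "some-key", value));
            producer.flush();
        }
    }
}
```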

The consumer deserializer must read this ID, look it up in the registry, then read the Avro bytes using the schema the registry returns. You can override this behavior by storing a schema along with the consumer (similar to what you would need to do with Protobuf or JSON).
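
A matching consumer sketch, under the same placeholder assumptions (addresses, group ID, topic name):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class AvroConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");            // placeholder broker
        props.put("group.id", "example-group");                       // placeholder group
        props.put("key.deserializer", StringDeserializer.class.getName());
        // KafkaAvroDeserializer reads the schema ID from each record, fetches (and caches)
        // the schema from the registry, then decodes the Avro bytes with it
        props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("schema.registry.url", "http://localhost:8081");   // placeholder registry

        try (KafkaConsumer<String, GenericRecord> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            ConsumerRecords<String, GenericRecord> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, GenericRecord> record : records) {
                System.out.println(record.value());
            }
        }
    }
}
```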

what is the point of this extra overhead, given that we were already sending the schema in every Avro file?

The Confluent serializers do not include the schema within the Kafka message, only a 4-byte integer ID; for any given ID, the corresponding schema can be looked up in the registry at GET /schemas/ids/:id.
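
To make that overhead concrete, here is a sketch of how the wire format can be inspected; `serializedValue` is a stand-in for the raw byte[] value of a record that was written with the Confluent Avro serializer (e.g. read back with a ByteArrayDeserializer):

```java
import java.nio.ByteBuffer;

public class WireFormatSketch {
    // serializedValue: raw value bytes of a record produced by KafkaAvroSerializer
    static void inspect(byte[] serializedValue) {
        ByteBuffer buffer = ByteBuffer.wrap(serializedValue);
        byte magicByte = buffer.get();          // 0x0 in the Confluent wire format
        int schemaId = buffer.getInt();         // the 4-byte schema ID assigned by the registry
        int payloadBytes = buffer.remaining();  // Avro binary data, with no schema embedded
        System.out.printf("magic=%d schemaId=%d avroPayload=%d bytes%n",
                magicByte, schemaId, payloadBytes);
    }
}
```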

does using the registry make a significant difference

That's a loaded question ;) Compared to sending arbitrary strings of values to your topic, I think so. By default, the registry uses compatibility checks to enforce that every schema registered for a topic can still be read by newer consumers.

If you use JSON or plain Strings, then someone could send {"hello" : "world"} followed by the number 2, and your consumer would immediately break if it expected a JSON object.
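
To illustrate what those compatibility checks catch, the sketch below runs Avro's own SchemaCompatibility utility on two hypothetical versions of a User schema; the registry performs an analogous check server-side when a new version is registered for a subject:

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaCompatibility;

public class CompatibilitySketch {
    public static void main(String[] args) {
        Schema v1 = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
          + "[{\"name\":\"name\",\"type\":\"string\"}]}");

        // v2 adds an optional field with a default, so data written with v1 can still be read
        Schema v2 = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
          + "[{\"name\":\"name\",\"type\":\"string\"},"
          + "{\"name\":\"email\",\"type\":[\"null\",\"string\"],\"default\":null}]}");

        SchemaCompatibility.SchemaPairCompatibility result =
            SchemaCompatibility.checkReaderWriterCompatibility(v2, v1); // new reader, old writer
        // Prints COMPATIBLE; adding a required field without a default, or changing a
        // field's type, would be flagged as incompatible instead
        System.out.println(result.getType());
    }
}
```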
