0

Apart from using Confluent Schema Registry, is there a way (via the built-in CLI tools) to view the serialization format of a topic's key and value?

Alexander Popov
  • No, it's pretty much a contract between producer and consumer. – daniu Mar 02 '20 at 20:07
  • @daniu Could you elaborate a bit? If I try to produce JSON data to an Avro topic, I get an error. That means the serialization is set somewhere. Is this correct? – Alexander Popov Mar 02 '20 at 20:21
  • To Kafka, there is no "json topic" or "avro topic". The error you get is (presumably) on the consumer side. If you do have a consumer, you can look at its configuration to see what deserializer it uses - that's what it expects to receive. – daniu Mar 02 '20 at 20:26
  • @daniu Last question - assuming a consumer uses the correct serde, is it possible to publish two messages with different encodings to the same topic? I'm not saying that's a smart thing to do, I'm just asking whether it's technically possible. – Alexander Popov Mar 02 '20 at 20:42
  • Also, @daniu, why don't you publish this as an answer? – Alexander Popov Mar 02 '20 at 20:42
  • Yes, posting two different message formats on the same topic is possible. I originally just wanted to write the one sentence, which I didn't think was enough for an answer. Also, I'm on mobile, so that's always a hassle. I'll post one tomorrow when I'm back on a PC. – daniu Mar 02 '20 at 21:06
  • Even using Schema Registry doesn't allow you to "view" such a thing – OneCricketeer Mar 03 '20 at 07:46

1 Answer

4

Kafka (by which I mean the broker) has absolutely no idea what the "format" of anything is. To the broker everything is bytes.

The Kafka wire format also has no dedicated place to specify the encoding scheme - of either the keys or values, or the keying scheme, or the partitioning scheme, or anything really. Kafka records have headers, but just like payloads, the broker doesn't look at them.
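Because the record itself carries no format marker, the most a tool can do is inspect the raw bytes and guess. The following is a minimal sketch of such a guess, not a built-in Kafka feature: `guess_format` is a hypothetical helper, and the only convention it relies on is the framing used by Confluent's Schema Registry serializers (a zero "magic byte" followed by a 4-byte big-endian schema ID). Any payload that happens to start with `0x00` would be misclassified, so treat the result as a hint at best.

```python
import json
import struct


def guess_format(payload: bytes) -> str:
    """Best-effort guess at how a raw record key or value was encoded.

    Hypothetical helper for illustration; Kafka stores nothing that
    could confirm or refute the guess.
    """
    # Confluent's serializers prepend a magic byte 0x00 and a
    # 4-byte big-endian schema ID before the serialized data.
    if len(payload) >= 5 and payload[0] == 0:
        schema_id = struct.unpack(">I", payload[1:5])[0]
        return f"confluent-framed (schema id {schema_id})"

    # Otherwise, see whether the bytes parse as UTF-8 encoded JSON.
    try:
        json.loads(payload.decode("utf-8"))
        return "json"
    except ValueError:  # covers UnicodeDecodeError and JSONDecodeError
        pass

    # Fall back to plain text versus unknown binary.
    try:
        payload.decode("utf-8")
        return "utf-8 text"
    except UnicodeDecodeError:
        return "opaque bytes"
```

Fed the raw bytes of a consumed record, this only narrows down the possibilities; the authoritative answer is still the serializer configured on the producer.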

The payload format (for both keys and values) is just an agreed-upon convention between producers and consumers. Some producers and consumers can be configured to work against an Avro schema registry (like the Confluent one), but there's absolutely nothing stopping someone from spinning up a <byte[], byte[]> producer and sending a cat photo to such a topic.
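To make the "convention only" point concrete, here is a small simulation in plain Python, with no Kafka client involved. The `topic` list, `produce`, and the serializer functions are all stand-ins invented for this sketch. The pretend broker accepts any bytes, and the mismatch only surfaces when a consumer applies its deserializer.

```python
import json

# Stand-in for a topic partition: opaque (key, value) byte pairs.
topic: list[tuple[bytes, bytes]] = []


def produce(key: bytes, value: bytes) -> None:
    """Append raw bytes; like the broker, this never inspects them."""
    topic.append((key, value))


def json_serializer(obj) -> bytes:
    return json.dumps(obj).encode("utf-8")


def json_deserializer(data: bytes):
    return json.loads(data.decode("utf-8"))


# One producer follows the JSON convention for this topic.
produce(b"user-1", json_serializer({"name": "alice"}))
# Another ignores it and sends JPEG-like bytes; nothing rejects them.
produce(b"cat", b"\xff\xd8\xff\xe0 not json at all")


def consume_all():
    """Deserialize every record, recording failures instead of raising."""
    results = []
    for key, value in topic:
        try:
            results.append((key, json_deserializer(value)))
        except (UnicodeDecodeError, json.JSONDecodeError) as exc:
            results.append((key, f"undecodable: {type(exc).__name__}"))
    return results


results = consume_all()
```

Both records are stored without complaint; only the second one fails, and it fails in the consumer's deserializer rather than at write time.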

There's also nothing that says the payload needs to be a single (polymorphic) type - you can do whatever you want as long as all your producers and consumers "agree" on how to read/write data.

radai
  • I agree. Brokers deal in terms of bytes and byte transfers. The encoding format is important for producers and consumers. As for the OP's question, even through the CLI with '--from-beginning' we can see the messages directly, but we don't see the key associated with each one. – c0der512 Mar 03 '20 at 22:36