I have two applications: a Node app using the @kafkajs/confluent-schema-registry library, and a Java app using the standard KafkaProtobufSerializer.
The topic is schema-bound with Protobuf.
When I inspect the topic contents in KafkaUI after the two apps serialize the same object, the values are not the same (and consequently KafkaUI cannot read the value from the Node app with the SchemaRegistry value Serde).
Confluent tells us that:

- the 1st byte is the magic byte: 0
- bytes 2-5 are occupied by the registry ID (in this case 5)
- byte 6 onwards is the encoded object
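For illustration, this is a minimal sketch of how I understand that layout is read back, assuming the record value is already available as a byte[]:

```java
import java.nio.ByteBuffer;

public class WireFormatHeader {

    // Split a raw record value into the pieces Confluent describes:
    // 1 magic byte, a 4-byte schema registry ID, then the encoded object.
    public static void printHeader(byte[] value) {
        ByteBuffer buf = ByteBuffer.wrap(value);
        byte magic = buf.get();       // expected to be 0
        int schemaId = buf.getInt();  // 4-byte registry ID (5 in my first example)
        byte[] payload = new byte[buf.remaining()];
        buf.get(payload);             // byte 6 onwards: supposedly the encoded object
        System.out.printf("magic=%d schemaId=%d payload bytes=%d%n",
                magic, schemaId, payload.length);
    }
}
```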
However, the Protobuf-encoded object on its own has the following bytes: 08 4d 12 04 70 65 74 65

Java value using KafkaProtobufSerializer: 00 00 00 00 05 00 08 4d 12 04 70 65 74 65
Node value using confluent-schema-registry: 00 00 00 00 05 08 4d 12 04 70 65 74 65
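(This is how the raw values can be dumped for a byte-for-byte comparison, consuming with ByteArrayDeserializer so no Serde touches the value; a sketch, with the bootstrap server and topic name as placeholders:)

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class RawDump {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "raw-dump");
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-topic")); // placeholder topic
            ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<byte[], byte[]> record : records) {
                // Print each record value as space-separated hex for comparison.
                StringBuilder hex = new StringBuilder();
                for (byte b : record.value()) {
                    hex.append(String.format("%02x ", b));
                }
                System.out.println(hex.toString().trim());
            }
        }
    }
}
```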
Where did this additional byte in the Java value come from, and, more pertinently, how is it derived? I expect both applications to produce identical byte arrays for an object of the same type (generated from the same .proto file) with the same values for its properties. The difference is a problem because it means my Java consumer cannot read the data produced by the Node app.
Example of a different type (but with registry ID 4):
Java: 00 00 00 00 04 02 04 0a 26... (rest of object)
Node: 00 00 00 00 04 0a 26... (rest of object)
Where did these two additional bytes come from? According to Confluent they shouldn't be there, but by all accounts, if we trust the Java libraries, the Java output is the correct version!
Furthermore, I downloaded the protoscope tool and passed the assumed payload into it. It does not work if I pass the Java value from byte 6 onwards; it only works from byte 8. That demonstrates the extra bytes are not coming from the Protobuf serialization itself, and again begs the question: what are these bytes for, and why do they seem to matter to Java?
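The same check can be reproduced in code by attempting to parse the value from different offsets with the generated message class (a sketch; Person is a placeholder for whatever class the shared .proto generates, and javaValue is assumed to hold the full record value from the second Java example):

```java
import java.util.Arrays;

import com.google.protobuf.InvalidProtocolBufferException;

public class SkipBytesCheck {

    // Try to parse the Protobuf payload starting at a given offset into the raw value.
    // Person is a placeholder for the class generated from the shared .proto file.
    static void tryParseFrom(byte[] javaValue, int offset) {
        try {
            Person parsed = Person.parseFrom(Arrays.copyOfRange(javaValue, offset, javaValue.length));
            System.out.println("parsed OK from offset " + offset + ": " + parsed);
        } catch (InvalidProtocolBufferException e) {
            System.out.println("failed from offset " + offset + ": " + e.getMessage());
        }
    }

    // Mirrors the protoscope result on the second example:
    // offset 5 (byte 6 onwards) fails, offset 7 (byte 8 onwards) parses.
    static void compareOffsets(byte[] javaValue) {
        tryParseFrom(javaValue, 5);
        tryParseFrom(javaValue, 7);
    }
}
```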