We are using Confluent Kafka with Schema Registry and have more than 40 topics. Our application writes Avro messages to these topics using the schemas from the registry.
From what I understand, when the registry is used, the message does not contain the actual schema but only a reference to it (the schema ID) in the registry. I am working on a utility in Java that accepts a topic name or a list of topic names and retrieves a limited number (maybe 50) of messages from each of these topics. The utility should then convert each Avro message to JSON.
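If my understanding of the wire format is correct, the schema ID should be readable directly from the raw message bytes. A minimal sketch, assuming Confluent's standard wire format of a magic byte followed by a 4-byte big-endian schema ID and then the Avro binary payload:

import java.nio.ByteBuffer;

// Sketch only, assuming the Confluent wire format:
// [magic byte 0x00][4-byte schema id, big-endian][Avro binary payload]
static int readSchemaId(byte[] messageBytes) {
    ByteBuffer buffer = ByteBuffer.wrap(messageBytes);
    if (buffer.get() != 0x0) {
        throw new IllegalArgumentException("Not a schema-registry encoded value");
    }
    return buffer.getInt();   // everything after these first 5 bytes is the Avro payload
}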
All the examples I have seen still require supplying the schema explicitly in order to convert the byte[] to JSON.
I was hoping it is possible to fetch the schema from the registry dynamically, using the information embedded in the Avro message from the topic (the schema ID), and use that to convert the message to JSON.
Is it possible to do so? Can someone give me an example of how to achieve that?
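To make the question concrete, this is the rough shape of what I am hoping is possible. It is only a sketch: the registry URL is a placeholder, and I am guessing at the CachedSchemaRegistryClient method name (newer clients seem to expose getSchemaById instead of getById).

import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonEncoder;
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Sketch: look up the writer schema by the id embedded in the message,
// decode the Avro payload with it, and re-encode the record as JSON.
// In real code the registry client would be created once and reused.
static String avroBytesToJson(byte[] messageBytes) throws Exception {
    SchemaRegistryClient registry =
            new CachedSchemaRegistryClient("http://schemaregistryurl:8081", 100);

    ByteBuffer buffer = ByteBuffer.wrap(messageBytes);
    buffer.get();                                    // skip the magic byte
    int schemaId = buffer.getInt();                  // schema id from the wire format
    byte[] avroPayload = new byte[buffer.remaining()];
    buffer.get(avroPayload);

    Schema schema = registry.getById(schemaId);      // my guess at the method name

    GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
    GenericRecord record = reader.read(null, DecoderFactory.get().binaryDecoder(avroPayload, null));

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    JsonEncoder jsonEncoder = EncoderFactory.get().jsonEncoder(schema, out);
    new GenericDatumWriter<GenericRecord>(schema).write(record, jsonEncoder);
    jsonEncoder.flush();
    return out.toString("UTF-8");
}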
Thank you
Updates
@eik
Trial 1
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
        "io.confluent.kafka.streams.serdes.avro.GenericAvroDeserializer");

final Consumer<String, GenericRecord> genericConsumer = new KafkaConsumer<>(props);
genericConsumer.subscribe(Collections.singletonList("TOPICNAME"));

while (true) {
    final ConsumerRecords<String, GenericRecord> genericConsumerRecords =
            genericConsumer.poll(Duration.ofMillis(1000));
    System.out.println("genericConsumerRecords.count() : " + genericConsumerRecords.count()
            + " genericConsumerRecords.isEmpty() : " + genericConsumerRecords.isEmpty());

    genericConsumerRecords.forEach(genericRecord1 -> {
        try {
            System.out.println("convert(genericRecord1.value()) -> " + convert(genericRecord1.value()));
        } catch (IOException e) {
            e.printStackTrace();
        }
    });
}
This is the output:
genericConsumerRecords.count() : 0 genericConsumerRecords.isEmpty() : true
Note: The convert method is the one from the answer below.
Trial 2
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.ByteArrayDeserializer");

final Consumer<String, byte[]> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("TOPICNAME"));

while (true) {
    final ConsumerRecords<String, byte[]> consumerRecords = consumer.poll(Duration.ofMillis(1000));
    System.out.println("consumerRecords.count() : " + consumerRecords.count()
            + " consumerRecords.isEmpty() : " + consumerRecords.isEmpty());

    consumerRecords.forEach(record1 -> {
        String string = new String(record1.value(), StandardCharsets.UTF_8);
        System.out.printf("offset = %d, key = %s, value = %s \n", record1.offset(), record1.key(), string);
    });
}
This is the output:
consumerRecords.count() : 60 consumerRecords.isEmpty() : false
offset = 0, key = e3bff195-08a7-4c58-99de-98ffe2d460e6, value = He52d6fa6-841f-430c-8bf7-bd4c7b684129 http://schemaregistryurl:8081/subjects/TOPICNAME-value/versions/1/schema Canon Message to represent CustomerPrefAVRFAST 162019-08-07T08:35:35.9950728 QA1-Test-0421-16$CustomerPrefData 1He52d6fa6-841f-430c-8bf7-bd4c7b684129 RawH862437d0-e260-45f9-ab5e-345b536d685a02020-04-21T17:48:52.601Z$CustomerPref POL_MAST02020-04-21T11:17:28.241ZHe3bff195-08a7-4c58-99de-98ffe2d460e69
False&1900-01-01T00:00:00He3bff195-08a7-4c58-99de-98ffe2d460e6He3bff195-08a7-4c58-99de-98ffe2d460e6
Note: I had to remove some non-ASCII characters from the output.
The second approach does produce output, but the value is a raw byte[]; I still need to get the JSON output. I have tried different ways without success.
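For clarity, the kind of thing I would expect to work looks roughly like the following, applied to record1.value() inside the Trial 2 loop. This is only a sketch of one possible direction, not something I have gotten to work; the configuration key and the cast are my assumptions.

import io.confluent.kafka.serializers.KafkaAvroDeserializer;
import org.apache.avro.generic.GenericRecord;
import java.util.Collections;

// Sketch: let the Confluent deserializer resolve the schema id against the registry,
// then rely on GenericRecord#toString(), which prints a JSON-like representation.
KafkaAvroDeserializer avroDeserializer = new KafkaAvroDeserializer();
avroDeserializer.configure(
        Collections.singletonMap("schema.registry.url", "http://schemaregistryurl:8081"),
        false);                                      // false = value deserializer, not key

GenericRecord value = (GenericRecord) avroDeserializer.deserialize("TOPICNAME", record1.value());
System.out.println(value);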
How do I fix it?
Thanks