I'm consuming Avro-serialized messages from Kafka using the "automatic" deserializer, like this:
props.put(
    ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
    "io.confluent.kafka.serializers.KafkaAvroDeserializer"
);
props.put("schema.registry.url", "https://example.com");
This works brilliantly, and is right out of the docs at https://docs.confluent.io/current/schema-registry/serializer-formatter.html#serializer.
The problem I'm facing is that I actually just want to forward these messages, but to do the routing I need some metadata from inside them. Some technical constraints mean that I can't feasibly compile in generated class files to use KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG => true, so I am using a regular decoder that is not tied into Kafka: I read the bytes as an Array[Byte] and pass them to a manually constructed deserializer:
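import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient
import io.confluent.kafka.serializers.{KafkaAvroDeserializer, KafkaAvroDeserializerConfig}
import scala.collection.JavaConverters._ // scala.jdk.CollectionConverters._ on Scala 2.13+

val maxSchemasToCache = 1000
val schemaRegistryURL = "https://example.com/"

// Specific reader disabled, so values are decoded as GenericRecord
val specificDeserializerProps = Map(
  "schema.registry.url" -> schemaRegistryURL,
  KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG -> "false"
)

// Schema registry client that caches up to maxSchemasToCache schemas
val client = new CachedSchemaRegistryClient(schemaRegistryURL, maxSchemasToCache)

// Deserializer built by hand rather than configured through the consumer
val deserializer = new KafkaAvroDeserializer(client, specificDeserializerProps.asJava)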
The messages are a "container" type, with the really interesting part being one of about 25 types in a union { A, B, C } msg record field:
record Event {
  timestamp_ms created_at;
  union {
    Online,
    Offline,
    Available,
    Unavailable,
    ...
    ...Failed,
    ...Updated
  } msg;
}
So I'm successfully reading an Array[Byte] into record and feeding it into the deserializer like this:
var genericRecord = deserializer.deserialize(topic, consumerRecord.value())
.asInstanceOf[GenericRecord];
var schema = genericRecord.getSchema();
var msgSchema = schema.getField("msg").schema();
The problem, however, is that I can find no way to discern, discriminate, or "resolve" the "type" of the msg field through the union:
System.out.printf(
"msg.schema = %s msg.schema.getType = %s\n",
msgSchema.getFullName(),
msgSchema.getType().name());
=> msg.schema = union msg.schema.getType = union
How can I discriminate types in this scenario? The Confluent registry knows: these things have names, they have "types", even if I'm treating them as GenericRecords.
My goal here is to know that record.msg is of "type" Online | Offline | Available rather than just knowing it's a union.
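One avenue that may be worth checking, sketched here without Kafka or the registry: the schema of the msg field is always the union itself, but the value stored in the field still carries the branch that was written. Avro's GenericData.resolveUnion returns the index of that branch, which can be looked up in the union's getTypes list. The two-branch schema below is a hypothetical stand-in for the real Event schema, and the names UnionBranchSketch and selectedBranch are made up for illustration.

```scala
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}

object UnionBranchSketch {
  // Hypothetical two-branch stand-in for the real Event schema (assumption, not the real schema)
  val eventSchemaJson: String =
    """
      |{"type": "record", "name": "Event", "fields": [
      |  {"name": "created_at", "type": "long"},
      |  {"name": "msg", "type": [
      |    {"type": "record", "name": "Online",  "fields": []},
      |    {"type": "record", "name": "Offline", "fields": []}
      |  ]}
      |]}
    """.stripMargin

  val eventSchema: Schema = new Schema.Parser().parse(eventSchemaJson)

  // Name of the union branch that the value of `field` actually holds
  def selectedBranch(record: GenericRecord, field: String): String = {
    val unionSchema = record.getSchema.getField(field).schema()
    val value = record.get(field)
    // resolveUnion gives the position of the matching branch within the union
    val index = GenericData.get().resolveUnion(unionSchema, value)
    unionSchema.getTypes.get(index).getName
  }

  def main(args: Array[String]): Unit = {
    val msgSchema = eventSchema.getField("msg").schema()
    val offlineSchema = msgSchema.getTypes.get(1) // second branch: Offline

    // Build an Event by hand whose msg field holds an Offline record
    val event = new GenericData.Record(eventSchema)
    event.put("created_at", Long.box(0L))
    event.put("msg", new GenericData.Record(offlineSchema))

    println(selectedBranch(event, "msg")) // prints "Offline"
  }
}
```

When every branch is a record, as in the Event schema above, casting the field value to GenericRecord and calling getSchema.getName on it should give the same answer; resolveUnion has the advantage of also handling non-record branches such as null. With namespaced schemas, getFullName may be the better key for routing.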