In Avro IDL I have a Message record defined as follows:
    record Message {
        MessageId id;
        array<string> dataField;
    }
I use this record inside another record, as an array wrapped in a union with null:
    record Invoice {
        ...
        union { null, array<Message> } message;
    }
We have a Java Kafka consumer (we're using Confluent Platform) that uses the avro-maven-plugin,
version 1.10.2, configured with <stringType>String</stringType>.
When we make a call such as this:
    List<String> msgList = message.getDataField();
    for (String msg : msgList) {...}
we receive the following error on the second line: class org.apache.avro.util.Utf8 cannot be cast to class java.lang.String
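To illustrate what we think is happening, here is a minimal sketch that does not depend on Avro at all. It assumes the decoder is filling the list with a non-String CharSequence (Utf8 implements CharSequence); StringBuilder is only a stand-in for Utf8, and the class and method names are hypothetical. Because of type erasure, the cast to String only happens when the for loop assigns each element, which would explain why the error shows up on the second line and not on the getDataField() call. The toStrings helper is a defensive workaround we are considering, not a fix for the schema itself.

```java
import java.util.ArrayList;
import java.util.List;

class Utf8CastSketch {
    // Returns a list typed List<String> at compile time that actually holds
    // StringBuilder elements (a stand-in for org.apache.avro.util.Utf8).
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<String> decodedList() {
        List raw = new ArrayList();
        raw.add(new StringBuilder("first"));
        raw.add(new StringBuilder("second"));
        return (List<String>) raw; // unchecked: no element is inspected here
    }

    // Reproduces the failure: the implicit cast to String happens per element
    // inside the enhanced for loop, not when the list is obtained.
    static boolean loopThrows() {
        try {
            List<String> msgList = decodedList(); // succeeds
            for (String msg : msgList) {          // ClassCastException here
                msg.length();
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Defensive read: treat elements as CharSequence and convert explicitly.
    static List<String> toStrings(List<? extends CharSequence> values) {
        List<String> out = new ArrayList<>();
        for (CharSequence value : values) {
            out.add(value.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("loop throws: " + loopThrows());
        System.out.println(toStrings(decodedList()));
    }
}
```

In our consumer this would amount to calling toStrings(message.getDataField()) before iterating, but we would prefer that the generated classes return real String values in the first place.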
Previously, the Invoice record was defined as:
    record Invoice {
        ...
        array<Message> message;
    }
and we did not receive this error. We have found that, in our schema file, changing the dataField definition from
    "name" : "dataField",
    "type" : {
        "type" : "array",
        "items" : "string"
    }
to
    "name" : "dataField",
    "type" : {
        "type" : "array",
        "items" : {
            "type" : "string",
            "avro.java.string" : "String"
        }
    }
corrects the problem.
I'm unclear as to why adding the union caused this change in behavior. Should I declare every string in the schema with the avro.java.string property,
and if so, how do I do that in Avro IDL?