
We are sending table data from Db2 through IIDR CDC to Kafka. We have trouble with the format of the data in the Kafka topic when viewing the messages with kafka-avro-console-consumer.

For Db2 columns defined as DEFAULT NULL, when the value is null it looks fine in the Kafka topic (as key:value). BUT when the value is not null, it is wrapped in a dictionary.

Example output when the column value is not null:

"Random_key": {
    "int": 9088245671
  }

Here, the key of that entry is the datatype of the column and the value is the column value. This output format is undesirable for our application.

If the value is actually null and the column is defined as DEFAULT NULL, it looks fine, just as expected:

 "Random_key": null 

How can we make changes on either the IIDR CDC or the Kafka side so the message is always displayed in plain key:value format, like this (even when a DEFAULT NULL column contains a value)?

"Random_key": 9088245671

Thanks!


1 Answer


It's normal: it means that the field Random_key is an Avro union type. With a union, the default value must match the first type in the union, and in your case the CDC interpreted the nullable column constraint as the union { null, int }.
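
As a rough illustration (not the exact schema IIDR CDC generates; the record name here is made up), the nullable column ends up in the value schema as something like:

    {
      "type": "record",
      "name": "RandomTable",
      "fields": [
        { "name": "Random_key", "type": ["null", "int"], "default": null }
      ]
    }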

When the field is not null, it is an integer, and in Avro's JSON representation a union value has to be tagged with its actual type. Imagine a union { string, int, double }: the field is valid when it holds a string, an integer, or a double, but the reader still needs to know, for each value, which type it actually is.
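
For example, with a hypothetical field price declared as union { string, int, double }, the Avro JSON encoding tags every non-null value with its branch:

    {"price": {"string": "N/A"}}
    {"price": {"int": 20}}
    {"price": {"double": 19.99}}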

Unfortunately this is the correct behavior, but normally you don't need to care about it: kafka-avro-console-consumer uses a JSON serializer to print the data so you can read it. In your code the field will be deserialized to the type you expect, without the wrapper.
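
A minimal sketch with the Kafka clients API and the Confluent Avro deserializer (the broker address, Schema Registry URL, group id, and topic name are placeholders): once the record is decoded into a GenericRecord, the union wrapper is gone and get() returns the plain value or null.

    import io.confluent.kafka.serializers.KafkaAvroDeserializer;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    public class AvroConsumerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");          // placeholder: your brokers
            props.put("group.id", "cdc-test");                         // placeholder group id
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", KafkaAvroDeserializer.class.getName());
            props.put("schema.registry.url", "http://localhost:8081"); // placeholder registry URL

            try (KafkaConsumer<String, GenericRecord> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("my-cdc-topic")); // placeholder topic
                ConsumerRecords<String, GenericRecord> records = consumer.poll(Duration.ofSeconds(5));
                for (ConsumerRecord<String, GenericRecord> record : records) {
                    // The union is resolved during decoding: get() returns the plain
                    // value (an Integer here) or null -- no {"int": ...} wrapper.
                    Object randomKey = record.value().get("Random_key");
                    System.out.println("Random_key = " + randomKey);
                }
            }
        }
    }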

EDIT: If your business absolutely needs the record in JSON format, there is a developer who wanted a more readable JSON representation and wrote a set of encoders/decoders to use instead of the default:

https://github.com/zolyfarkas/avro/commit/8926d6e9384eb3e7d95f05a9d1653ba9348f1966

Saïd Bouras
  • Let me ask: what is your application? Kafka Streams? Kafka clients API (producer/consumer)? – Saïd Bouras Dec 12 '18 at 21:03
  • Kafka Clients API. As of now we are testing the data in the topic, and found the weird format only in columns defined as "DEFAULT NULL". – Tony Dec 12 '18 at 21:35
  • Okay, normally if you work with the clients API or any other API in Java/Scala, the type of your data will be interpreted correctly (any non-primitive type in Java can be null), so you will not have issues due to this. – Saïd Bouras Dec 12 '18 at 21:38
  • Note that it's just a representation of the Avro record in JSON! – Saïd Bouras Dec 12 '18 at 21:39
  • I just edited my answer in case you need to send Avro records in JSON format, but I insist: if you don't need to send messages in JSON, there is no problem; this is just a display format in the console. – Saïd Bouras Dec 12 '18 at 21:56