2

I have a Kafka-Connect job configured to query a MySQL table periodically and place messages on a queue. The structure of these messages are defined using an Avro schema. I am having an issue with the mapping for one of my columns.

The column is defined as a tinyint(1) in my MySQL schema, and I am trying to map this to a boolean field in my avro object.

enter image description here

{ "name": "is_active", "type": "boolean" }

The kafka-connect jobs runs, and messages are placed on the queue, but when my application who reads from the queue attempts to deserialize the messages I get the following error:

org.apache.avro.AvroTypeException: Found int, expecting boolean

I was hoping that a 1 or 0 value could be automatically mapped to a boolean, but that does not seem to be the case.

I have also tried to configure my job to use a 'Cast' transform, but that just seems to caused issues with the other fields in the message.

"transforms": "Cast", "transforms.Cast.type": "org.apache.kafka.connect.transforms.Cast$Value", "transforms.Cast.spec": "is_active:boolean"

Is what I am attempting possible, or will I have to change my application to work with the int value?

Here is my full configuration ( I have stripped out some other irrelevant fields )

Kafka Connect job config

{ "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector", "mode": "bulk", "topic.prefix": "my_topic-name", "transforms.SetSchemaMetadata.type": "org.apache.kafka.connect.transforms.SetSchemaMetadata$Value", "query": "select is_active from my_table", "poll.interval.ms": "30000", "transforms": "SetSchemaMetadata", "name": "job_name", "connection.url": "connectiondetailshere", "transforms.SetSchemaMetadata.schema.name": "com.my.model.name" }

AVRO Schema

{ "type": "record", "name": "name", "namespace": "com.my.model", "fields": [
{ "name": "is_active", "type": "long" } ], "connect.name": "com.my.model.name" }

Robin Moffatt
  • 30,382
  • 3
  • 65
  • 92
Paddyd
  • 1,870
  • 16
  • 26
  • what's the error with casting? – Holm Jul 02 '19 at 11:59
  • I haven't got the exact error right now, but it was causing issues with the types of other fields in my schema which seemed to be getting cast even though I didn't specify them inside "transforms.Cast.spec" – Paddyd Jul 02 '19 at 12:22
  • sounds for me reserved word issues. do you have `query` as field name? then put `` to cover like \`query\`. also if you have a big ksql, it would be simple to remove some parts of ksql and test with small parts to know which part works, like database developer debugs the query – Holm Jul 02 '19 at 12:55
  • No I don't have query as a field name. The is_active field is a new which I am adding, so I know everything else works. The issue is specific to this new field – Paddyd Jul 02 '19 at 13:26

1 Answers1

1

You can do this either with a custom Transform (this is a perfect use case for it), or write a simple streaming application to do it, for example in KSQL:

CREATE STREAM my_topic AS 
  SELECT COL1, COL2, …
         CASE WHEN is_active=1 THEN TRUE ELSE FALSE END AS is_active_bln
  FROM my_source_connect_topic;
ksql> describe my_topic;

Name                 : my_topic
 Field         | Type
-----------------------------------------
 ROWTIME       | BIGINT           (system)
 ROWKEY        | VARCHAR(STRING)  (system)
 COL1          | INTEGER
 COL1          | VARCHAR
 IS_ACTIVE_BLN | BOOLEAN
----------------------------------------
Robin Moffatt
  • 30,382
  • 3
  • 65
  • 92
  • from the document I understand the same that the transformation could be done via Cast, or what would make it does not work? – Holm Jul 03 '19 at 08:28