
I currently have the following data in my Kafka topic:

{"tran_slip":"00002060","tran_amount":"111.22"}
{"tran_slip":"00000005","tran_amount":"123"}
{"tran_slip":"00000006","tran_amount":"123"}
{"tran_slip":"00000007","tran_amount":"123"}

Since the data in my Kafka topic does not have a schema, I figured I could enforce one using AWS Glue Schema Registry.

So I created an Avro schema as follows:

{
  "type": "record",
  "namespace": "int_trans",
  "name": "transaction",
  "fields": [
    {
      "name": "tran_slip",
      "type": "string"
    },
    {
      "name": "tran_amount",
      "type": "string"
    }
  ]
}
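
For reference, registering this schema programmatically would look roughly like the sketch below (AWS SDK for Java v2; the region and compatibility mode are placeholders I picked for illustration, and the registry name matches the registry.name in the connector config further down):

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.glue.GlueClient;
import software.amazon.awssdk.services.glue.model.Compatibility;
import software.amazon.awssdk.services.glue.model.CreateSchemaRequest;
import software.amazon.awssdk.services.glue.model.DataFormat;
import software.amazon.awssdk.services.glue.model.RegistryId;

public class RegisterTransactionSchema {
    public static void main(String[] args) {
        // Region and compatibility are placeholders; use whatever the registry was created with.
        try (GlueClient glue = GlueClient.builder().region(Region.EU_WEST_1).build()) {
            glue.createSchema(CreateSchemaRequest.builder()
                    .registryId(RegistryId.builder().registryName("registry_transactions").build())
                    .schemaName("transaction") // the Avro record name above
                    .dataFormat(DataFormat.AVRO)
                    .compatibility(Compatibility.BACKWARD)
                    .schemaDefinition(
                        "{\"type\":\"record\",\"namespace\":\"int_trans\",\"name\":\"transaction\","
                        + "\"fields\":[{\"name\":\"tran_slip\",\"type\":\"string\"},"
                        + "{\"name\":\"tran_amount\",\"type\":\"string\"}]}")
                    .build());
        }
    }
}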

Then I created a Confluent JDBC sink connector on MSK Connect to sink the data from the Kafka topic back to an Oracle DB, with the properties below:

connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=aws-db.abc.transactions

connection.url=*********
connection.user=*******
connection.password=******

table.name.format=abc.transactions_sink
insert.mode=upsert
pk.mode=record_value
pk.fields=tran_slip
batch.size=1
auto.create=false
auto.evolve=false
delete.enabled=false

key.converter=org.apache.kafka.connect.storage.StringConverter
key.converter.region=*******
key.converter.registry.name=registry_transactions
key.converter.schemaName=KeySchema
key.converter.avroRecordType=GENERIC_RECORD
key.converter.schemaAutoRegistrationEnabled=true
key.converter.schemas.enable=false

value.converter=com.amazonaws.services.schemaregistry.kafkaconnect.AWSKafkaAvroConverter
value.converter.region=*****
value.converter.registry.name=registry_transactions
value.converter.schemaName=ValueSchema
value.converter.avroRecordType=GENERIC_RECORD
value.converter.schemaAutoRegistrationEnabled=true
value.converter.schemas.enable=true

internal.key.converter=com.amazonaws.services.schemaregistry.kafkaconnect.AWSKafkaAvroConverter
internal.key.converter.schemas.enable=false
internal.value.converter=com.amazonaws.services.schemaregistry.kafkaconnect.AWSKafkaAvroConverter
internal.value.converter.schemas.enable=false

transforms=RenameField
transforms.RenameField.type=org.apache.kafka.connect.transforms.ReplaceField$Value
transforms.RenameField.renames=tran_slip:TRAN_SLIP, tran_amount:TRAN_AMOUNT
transforms.extractKeyFromStruct.type=org.apache.kafka.connect.transforms.ExtractField$Key
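
For completeness, my understanding is that AWSKafkaAvroConverter expects records that were written with the Glue Schema Registry Avro serializer (which prefixes the Avro payload with a header/version byte), rather than the plain JSON strings currently on my topic. A minimal producer sketch, assuming the aws-glue-schema-registry client library and placeholder broker/region values:

import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import com.amazonaws.services.schemaregistry.utils.AWSSchemaRegistryConstants;

public class GlueAvroProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092"); // placeholder
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // The Glue serializer writes a header byte and schema version id before the
        // Avro payload, which is what AWSKafkaAvroConverter looks for on the sink side.
        props.put("value.serializer",
                "com.amazonaws.services.schemaregistry.serializers.avro.AWSKafkaAvroSerializer");
        props.put(AWSSchemaRegistryConstants.AWS_REGION, "eu-west-1"); // placeholder
        props.put(AWSSchemaRegistryConstants.REGISTRY_NAME, "registry_transactions");
        props.put(AWSSchemaRegistryConstants.SCHEMA_NAME, "transaction");

        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"namespace\":\"int_trans\",\"name\":\"transaction\","
                + "\"fields\":[{\"name\":\"tran_slip\",\"type\":\"string\"},"
                + "{\"name\":\"tran_amount\",\"type\":\"string\"}]}");

        GenericRecord record = new GenericData.Record(schema);
        record.put("tran_slip", "00002060");
        record.put("tran_amount", "111.22");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("aws-db.abc.transactions", "00002060", record));
        }
    }
}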

With these settings I keep getting the following error:

org.apache.kafka.connect.errors.DataException: Converting byte[] to Kafka Connect data failed due to serialization error:
    at com.amazonaws.services.schemaregistry.kafkaconnect.AWSKafkaAvroConverter.toConnectData(AWSKafkaAvroConverter.java:118)
    at org.apache.kafka.connect.storage.Converter.toConnectData(Converter.java:87)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.convertValue(WorkerSinkTask.java:545)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$1(WorkerSinkTask.java:501)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:156)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:190)
    at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:132)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:501)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:478)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:328)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:232)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:201)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:189)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:238)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.amazonaws.services.schemaregistry.exception.AWSSchemaRegistryException: Didn't find secondary deserializer.
    at com.amazonaws.services.schemaregistry.deserializers.SecondaryDeserializer.deserialize(SecondaryDeserializer.java:65)
    at com.amazonaws.services.schemaregistry.deserializers.avro.AWSKafkaAvroDeserializer.deserializeByHeaderVersionByte(AWSKafkaAvroDeserializer.java:150)
    at com.amazonaws.services.schemaregistry.deserializers.avro.AWSKafkaAvroDeserializer.deserialize(AWSKafkaAvroDeserializer.java:114)
    at com.amazonaws.services.schemaregistry.kafkaconnect.AWSKafkaAvroConverter.toConnectData(AWSKafkaAvroConverter.java:116)

Can someone please guide me on what I am doing wrong in these configurations? I'm rather new to this topic.

  • Any other details in the log? – w08r Sep 06 '22 at 10:55
  • I updated the log trace with more details – Stephanie Meilak Sep 06 '22 at 14:23
  • Error suggests you're missing some configuration property on the converters for a "secondary deserializer". You should also remove "internal" converter properties since those are deprecated, are only used by the Connect worker, and should not be changed from JSON – OneCricketeer Sep 06 '22 at 14:52
  • Thanks for your suggestion, I will remove the 'internal' converter properties. I wasn't giving much importance to the secondary deserializer error, because this thread suggests it is not the main issue: https://github.com/awslabs/aws-glue-schema-registry/issues/136. I am not finding information about how to set up a secondary deserializer. – Stephanie Meilak Sep 06 '22 at 15:42
