MSK S3 sink not working with Field Partitioner

Question

I am using AWS MSK with MSK Connect. The S3 sink connector stops working once I add io.confluent.connect.storage.partitioner.FieldPartitioner. Note: without FieldPartitioner, the S3 sink worked. Other than this Stack Overflow question (Question Link), I was not able to find any resources.

Error

ERROR [FieldPart-sink|task-0] Value is not Struct type. (io.confluent.connect.storage.partitioner.FieldPartitioner:81)

Caused by: io.confluent.connect.storage.errors.PartitionException: Error encoding partition.

ERROR [Sink-FieldPartition|task-0] WorkerSinkTask{id=Sink-FieldPartition-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted. Error: Error encoding partition. (org.apache.kafka.connect.runtime.WorkerSinkTask:612)

MSK Connect Config

connector.class=io.confluent.connect.s3.S3SinkConnector
format.class=io.confluent.connect.s3.format.avro.AvroFormat
flush.size=1
schema.compatibility=BACKWARD
tasks.max=2
topics=MSKTutorialTopic
storage.class=io.confluent.connect.s3.storage.S3Storage
topics.dir=mskTrials
s3.bucket.name=clickstream
s3.region=us-east-1

partitioner.class=io.confluent.connect.storage.partitioner.FieldPartitioner
partition.field.name=name

value.converter.schemaAutoRegistrationEnabled=true
value.converter.registry.name=datalake-schema-registry
value.convertor.schemaName=MSKTutorialTopic-value
value.converter.avroRecordType=GENERIC_RECORD
value.converter.region=us-east-1
value.converter.schemas.enable=true
value.converter=org.apache.kafka.connect.storage.StringConverter

key.converter=org.apache.kafka.connect.storage.StringConverter

Data schema, as stored in Glue Schema Registry

{
  "namespace": "example.avro",
  "type": "record",
  "name": "UserData",
  "fields": [
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "favorite_number",
      "type": [
        "int",
        "null"
      ]
    },
    {
      "name": "favourite_color",
      "type": [
        "string",
        "null"
      ]
    }
  ]
}

score 0 · Answer 1 · answered Oct 18 '22 at 03:10

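To partition by a field, the record values need to actually contain fields.

StringConverter does not parse the data it consumes; it hands each value to the connector as a plain string, so FieldPartitioner never receives a Struct with a name field. That is what the "Value is not Struct type" error reports. Use an AvroConverter if the topic contains Avro data. Also, Avro always has a schema, so remove the schemas.enable configuration.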
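
As a rough sketch of that change, the converter lines of the connector configuration might look like the following. The converter class is an assumption: since the schema lives in Glue Schema Registry, this uses AWSKafkaAvroConverter from the AWS Glue Schema Registry library, which would have to be packaged into the MSK Connect custom plugin. The answer itself only says "AvroConverter". The other value.converter.* keys and values are carried over from the question unchanged.

```properties
# Assumption: Avro converter from the AWS Glue Schema Registry library,
# replacing org.apache.kafka.connect.storage.StringConverter
value.converter=com.amazonaws.services.schemaregistry.kafkaconnect.AWSKafkaAvroConverter
value.converter.region=us-east-1
value.converter.registry.name=datalake-schema-registry
value.converter.schemaName=MSKTutorialTopic-value
value.converter.avroRecordType=GENERIC_RECORD
value.converter.schemaAutoRegistrationEnabled=true
# value.converter.schemas.enable is dropped: Avro always carries a schema

# Unchanged: FieldPartitioner reads the "name" field from the Struct value
partitioner.class=io.confluent.connect.storage.partitioner.FieldPartitioner
partition.field.name=name
```

Note that the question spells one key value.convertor.schemaName. Kafka Connect only forwards properties prefixed with value.converter. to the converter, so that line is probably ignored as written.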
