I'm trying to create a Kafka Connect connector to sink an Avro topic to a file, and then restore that file to another topic using Kafka Connect.
The sink is working fine: I can see the sink file growing and read the data. But when I try to restore it to a new topic, the new topic stays empty.
I get no errors. I have already reset the offsets, created a new Kafka Connect worker, and even built a completely new Kafka cluster, but the result is always the same: no error on the source connector, yet the topic is empty.
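For reference, the config and status dumps below come from the Connect REST API; assuming the REST port matches the `worker_id` shown in the status output, they can be fetched like this:

```shell
# Dump a connector's full definition (config + task assignments).
curl -s http://kafka-connect:8883/connectors/restored-exchange-rate-log

# Dump the runtime status of the connector and its tasks.
curl -s http://kafka-connect:8883/connectors/restored-exchange-rate-log/status
```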
Here is the output of the source connector config:
{
  "name": "restored-exchange-rate-log",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "value.converter.schema.registry.url": "http://kafka-schema:8881",
    "file": "/tmp/exchange-rate-log.sink.txt",
    "format.include.keys": "true",
    "source.auto.offset.reset": "earliest",
    "tasks.max": "1",
    "value.converter.schemas.enable": "true",
    "name": "restored-exchange-rate-log",
    "topic": "restored-exchange-rate-log",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter"
  },
  "tasks": [
    {
      "connector": "restored-exchange-rate-log",
      "task": 0
    }
  ],
  "type": "source"
}
And here is the output of the source connector status:
{
  "name": "restored-exchange-rate-log",
  "connector": {
    "state": "RUNNING",
    "worker_id": "kafka-connect:8883"
  },
  "tasks": [
    {
      "state": "RUNNING",
      "id": 0,
      "worker_id": "kafka-connect:8883"
    }
  ],
  "type": "source"
}
Here is the output of the sink connector config:
{
  "name": "bkp-exchange-rate-log",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "source.auto.offset.reset": "earliest",
    "tasks.max": "1",
    "topics": "exchange-rate-log",
    "value.converter.value.subject.name.strategy": "io.confluent.kafka.serializers.subject.RecordNameStrategy",
    "value.converter.schema.registry.url": "http://kafka-schema:8881",
    "file": "/tmp/exchange-rate-log.sink.txt",
    "format.include.keys": "true",
    "value.converter.schemas.enable": "true",
    "name": "bkp-exchange-rate-log",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter"
  },
  "tasks": [
    {
      "connector": "bkp-exchange-rate-log",
      "task": 0
    }
  ],
  "type": "sink"
}
Here is the output of the sink connector status:
{
  "name": "bkp-exchange-rate-log",
  "connector": {
    "state": "RUNNING",
    "worker_id": "kafka-connect:8883"
  },
  "tasks": [
    {
      "state": "RUNNING",
      "id": 0,
      "worker_id": "kafka-connect:8883"
    }
  ],
  "type": "sink"
}
The sink file is working and keeps growing, but the topic restored-exchange-rate-log is completely empty.
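For what it's worth, this is how I'm checking that the restored topic is empty (broker address assumed):

```shell
# Consume everything from the restored topic from the beginning;
# exits after 10 seconds having printed nothing.
kafka-console-consumer --bootstrap-server kafka:9092 \
  --topic restored-exchange-rate-log \
  --from-beginning --timeout-ms 10000
```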
Update, adding more details:
I have now tried the "Zalando" approach, but since we don't use S3, we are using the FileStream connectors instead.
Here is the sink config:
{
  "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
  "file": "/tmp/exchange-rate-log.bin",
  "format.include.keys": "true",
  "tasks.max": "1",
  "topics": "exchange-rate-log",
  "format": "binary",
  "value.converter": "com.spredfast.kafka.connect.s3.AlreadyBytesConverter",
  "key.converter": "com.spredfast.kafka.connect.s3.AlreadyBytesConverter",
  "name": "bkp-exchange-rate-log"
}
Here is the source config:
{
  "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
  "file": "/tmp/exchange-rate-log.bin",
  "format.include.keys": "true",
  "tasks.max": "1",
  "format": "binary",
  "topic": "bin-test-exchange-rate-log",
  "value.converter": "com.spredfast.kafka.connect.s3.AlreadyBytesConverter",
  "key.converter": "com.spredfast.kafka.connect.s3.AlreadyBytesConverter",
  "name": "restore-exchange-rate-log"
}
The sink connector looks fine: it generated the file /tmp/exchange-rate-log.bin, which keeps growing. But the source (restore) connector fails with this error:
Caused by: org.apache.kafka.connect.errors.DataException: bin-test-exchange-rate-log error: Not a byte array! [B@761db301
at com.spredfast.kafka.connect.s3.AlreadyBytesConverter.fromConnectData(AlreadyBytesConverter.java:22)
at org.apache.kafka.connect.runtime.WorkerSourceTask.lambda$convertTransformedRecord$2(WorkerSourceTask.java:269)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
... 11 more
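I wonder if part of the problem is that the stock FileStream connectors are line-oriented: the source task reads the file back as newline-delimited strings, so raw binary payloads that happen to contain a newline byte cannot round-trip intact. A minimal Python sketch (with made-up bytes, not real Avro) of that hazard:

```python
# Two "binary" records, standing in for serialized Avro payloads.
# The first record contains a 0x0a (newline) byte in its data.
records = [b"\x00\x01\x0a\x02", b"\x03\x04"]

# A line-oriented sink/source pair: records written newline-delimited,
# then read back by splitting on newlines.
blob = b"\n".join(records)
recovered = blob.split(b"\n")

# The 0x0a inside the first record is indistinguishable from the
# delimiter, so the read side sees three "records" instead of two.
print(len(records), len(recovered))  # 2 3
```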