I am trying to use the Lenses S3 source connector with AWS MSK (Kafka). I downloaded kafka-connect-aws-s3-kafka-3-1-4.0.0.zip, uploaded it to S3, and registered it as a plugin. I then created a connector with that plugin using the following configuration.
<Connector Configuration>
connector.class=io.lenses.streamreactor.connect.aws.s3.source.S3SourceConnector
key.converter.schemas.enable=false
connect.s3.kcql=insert into my_topic select * from my_bucket:dev/domain_name/year=2022/month=11/ STOREAS 'JSON'
tasks.max=2
connect.s3.aws.auth.mode=Default
value.converter.schemas.enable=false
connect.s3.aws.region=ap-northeast-2
value.converter=org.apache.kafka.connect.storage.StringConverter
key.converter=org.apache.kafka.connect.storage.StringConverter
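
For reference only: if you want to reproduce the same setup against a self-managed Kafka Connect worker that exposes the standard Connect REST API (MSK Connect manages the worker for you), the same configuration could be submitted as JSON. This is just a sketch; the worker URL and connector name below are placeholders, not part of my MSK setup.

<Submit the configuration to a self-managed Connect worker (Python)>
# Sketch: POST the connector configuration to a self-managed Kafka Connect worker.
# http://localhost:8083 and "s3-source-test" are placeholders.
import json
import requests

config = {
    "connector.class": "io.lenses.streamreactor.connect.aws.s3.source.S3SourceConnector",
    "tasks.max": "2",
    "connect.s3.aws.auth.mode": "Default",
    "connect.s3.aws.region": "ap-northeast-2",
    "connect.s3.kcql": (
        "insert into my_topic select * from "
        "my_bucket:dev/domain_name/year=2022/month=11/ STOREAS 'JSON'"
    ),
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "key.converter.schemas.enable": "false",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter.schemas.enable": "false",
}

resp = requests.post(
    "http://localhost:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps({"name": "s3-source-test", "config": config}),
)
print(resp.status_code, resp.json())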
The connector is created successfully and data is read from S3 into the specified topic, but there are two problems:
- As specified in "connect.s3.kcql", data should only be imported from /year=2022/month=11/, but data from the other year/month partitions is imported as well. It seems the "/year=" and "/month=" path segments under /dev/domain_name (the prefix) are not recognized, so everything under the prefix is imported. Is there a way to restrict the import to the specified partition path? (A sketch for listing what actually sits under that prefix follows this list.)
(For reference, my full S3 path is: my_bucket/dev/domain_name/year=2022/month=11/hour=1/*.json)
- More JSON files exist under the specified S3 path, but at some point they stop being imported into the topic. No errors occur and the connector status looks normal; the connector log just keeps printing the "flushing 0 outstanding messages for offset commit" message. (A sketch for comparing the topic's record count with the files in S3 is also shown below.)
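
To check what the connector can actually see under the KCQL prefix, here is a minimal sketch that lists the keys matching the exact prefix string from "connect.s3.kcql". It assumes boto3 and default AWS credentials; the bucket and prefix are copied from the configuration above.

<List objects under the KCQL prefix (Python)>
# Sketch: list the S3 keys under the exact prefix used in connect.s3.kcql,
# so the result can be compared with what the connector imports into the topic.
import boto3

bucket = "my_bucket"
prefix = "dev/domain_name/year=2022/month=11/"  # prefix from connect.s3.kcql

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")
count = 0
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        count += 1
        print(obj["Key"])
print(f"{count} objects under s3://{bucket}/{prefix}")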
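
To see whether the topic really stops growing while files remain in S3, the record count on the topic can be compared with the number of JSON files under the prefix. A minimal sketch using kafka-python follows; the bootstrap address is a placeholder, and security settings (TLS, SASL, etc.) depend on how your MSK cluster is reached and are omitted here.

<Count records currently on the target topic (Python)>
# Sketch: compute the total number of records on my_topic from the difference
# between end offsets and beginning offsets of every partition.
from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(
    bootstrap_servers="b-1.example-msk.amazonaws.com:9092",  # placeholder broker address
    enable_auto_commit=False,
)
partitions = [TopicPartition("my_topic", p) for p in consumer.partitions_for_topic("my_topic")]
earliest = consumer.beginning_offsets(partitions)
latest = consumer.end_offsets(partitions)
total = sum(latest[tp] - earliest[tp] for tp in partitions)
print(f"records currently on my_topic: {total}")
consumer.close()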