
Kerberized Hadoop cluster, with the Kafka Connect instance started on the edge node. In standalone mode, everything works as expected. In distributed mode, the connector appears to be added according to the logs, but when checked with a REST call, no connectors are returned at all, and no data is written from the Kafka topic to HDFS.
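For reference, this is a sketch of the REST calls used for the check (assuming the worker's default REST port 8083 on the edge node):

```
# List all registered connectors; in this case it returns an empty array
curl -s http://localhost:8083/connectors

# If the connector were listed, its state could be inspected with:
curl -s http://localhost:8083/connectors/xxx_connector_test_topic_2/status
```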

```
2019-05-16T12:22:53.657 TRACE xxx connector dev Submitting connector config write request xxx_connector_test_topic_2 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:529)
2019-05-16T12:22:53.661 TRACE xxx connector dev Retrieving loaded class 'io.confluent.connect.hdfs.HdfsSinkConnector' from 'PluginClassLoader{pluginLocation=file:/data/home/u_rw_xxx/kafka-connect/confluent-4.1.1/share/java/kafka-connect-hdfs/}' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:325)
2019-05-16T12:22:53.661 DEBUG xxx connector dev Getting plugin class loader for connector: 'io.confluent.connect.hdfs.HdfsSinkConnector' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:107)
2019-05-16T12:22:53.665 TRACE xxx connector dev Class 'org.apache.kafka.connect.storage.StringConverter' not found. Delegating to parent (org.apache.kafka.connect.runtime.isolation.PluginClassLoader:100)
2019-05-16T12:22:53.666 TRACE xxx connector dev Retrieving loaded class 'org.apache.kafka.connect.storage.StringConverter' from 'sun.misc.Launcher$AppClassLoader@764c12b6' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:325)
2019-05-16T12:22:53.696 TRACE xxx connector dev Handling connector config request xxx_connector_test_topic_2 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:538)
2019-05-16T12:22:53.697 TRACE xxx connector dev Submitting connector config xxx_connector_test_topic_2 false [] (org.apache.kafka.connect.runtime.distributed.DistributedHerder:550)
2019-05-16T12:22:53.697 DEBUG xxx connector dev Writing connector configuration {connector.class=io.confluent.connect.hdfs.HdfsSinkConnector, tasks.max=1, topics=xxx_test_topic_2, hadoop.conf.dir=/usr/hdp/current/hadoop-client/conf/, hdfs.url=/dev/src/xxx/kk/land/test/, hdfs.authentication.kerberos=true, connect.hdfs.principal=u_rw_xxx@XXXHDP1.YYYY.ZZ, connect.hdfs.keytab=/data/home/u_rw_xxx/u_rw_xxx.keytab, hdfs.namenode.principal=nn/_HOST@XXXHDP1.YYYY.ZZ, hive.integration=false, hive.database=dev_src_xxx_data, partitioner.class=io.confluent.connect.hdfs.partitioner.TimeBasedPartitioner, format.class=io.confluent.connect.hdfs.json.JsonFormat, key.converter=org.apache.kafka.connect.storage.StringConverter, key.converter.schemas.enable=false, value.converter=org.apache.kafka.connect.storage.StringConverter, value.converter.schemas.enable=false, flush.size=100, rotate.interval.ms=60000, partition.duration.ms=300000, path.format='day'=YYYYMMdd, locale=DE, timezone=UTC, name=xxx_connector_test_topic_2} for connector xxx_connector_test_topic_2 configuration (org.apache.kafka.connect.storage.KafkaConfigBackingStore:294)
2019-05-16T12:22:53.993 INFO xxx connector dev 127.0.0.1 - - [16/May/2019:10:22:53 +0000] "POST /connectors/ HTTP/1.1" 201 1092  469 (org.apache.kafka.connect.runtime.rest.RestServer:60)
```
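Since the POST returned 201, one troubleshooting step is to confirm the configuration record actually landed in the Connect config topic. A sketch, assuming the worker's `config.storage.topic` is the default `connect-configs` and that `client-kerberos.properties` is a hypothetical client config carrying the SASL/Kerberos settings:

```
# Dump the Connect config topic from the beginning; a JSON record for
# xxx_connector_test_topic_2 should appear if the write succeeded.
kafka-console-consumer --bootstrap-server <broker-host>:9092 \
  --topic connect-configs --from-beginning \
  --consumer.config client-kerberos.properties
```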

As the REST call to list connectors does not return any connectors, I guess the creation has failed. In that case, I would expect either an error message or at least a warning.

However, adding the connector through the REST API fails silently, very similarly to this Confluent CLI issue.

Any ideas on further troubleshooting are welcome. Thanks in advance!

P.S.:

As shown in the log, I am using the Confluent 4.1.1 connector with the JsonFormat class; the records are expected to be written out to HDFS serialized with the StringConverter.
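For readability, here is the configuration from the log line above reconstructed as the JSON payload of the POST request (all values copied verbatim from the log; the file name `connector.json` is just a placeholder):

```
{
  "name": "xxx_connector_test_topic_2",
  "config": {
    "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
    "tasks.max": "1",
    "topics": "xxx_test_topic_2",
    "hadoop.conf.dir": "/usr/hdp/current/hadoop-client/conf/",
    "hdfs.url": "/dev/src/xxx/kk/land/test/",
    "hdfs.authentication.kerberos": "true",
    "connect.hdfs.principal": "u_rw_xxx@XXXHDP1.YYYY.ZZ",
    "connect.hdfs.keytab": "/data/home/u_rw_xxx/u_rw_xxx.keytab",
    "hdfs.namenode.principal": "nn/_HOST@XXXHDP1.YYYY.ZZ",
    "hive.integration": "false",
    "hive.database": "dev_src_xxx_data",
    "partitioner.class": "io.confluent.connect.hdfs.partitioner.TimeBasedPartitioner",
    "format.class": "io.confluent.connect.hdfs.json.JsonFormat",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "key.converter.schemas.enable": "false",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter.schemas.enable": "false",
    "flush.size": "100",
    "rotate.interval.ms": "60000",
    "partition.duration.ms": "300000",
    "path.format": "'day'=YYYYMMdd",
    "locale": "DE",
    "timezone": "UTC"
  }
}
```

It could then be submitted with `curl -X POST -H "Content-Type: application/json" --data @connector.json http://localhost:8083/connectors`.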

  • Solution was the ACLs of the Kafka topics. In particular, the ACLs for the three "connect" topics used for starting the Connect instance in [distributed mode](https://docs.confluent.io/current/connect/userguide.html#distributed-mode). If you are using older versions of the Confluent Platform, please make sure your user has the right to write to these topics, even when the Connect instance is started with a technical user that should only consume the data (see the ACL sketch after these comments). – LHA Jun 13 '19 at 14:01
  • This is needed because the [TopicAdmin](https://github.com/apache/kafka/blob/1.1/connect/runtime/src/main/java/org/apache/kafka/connect/util/TopicAdmin.java#L208) class tries to recreate the connect topics on start-up. Recreating the topics and handling the "topic already exists" error is what allows this operation to be atomic in the cluster. – LHA Jun 13 '19 at 14:06
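A minimal sketch of granting those permissions with the kafka-acls CLI. The topic names below are the default internal Connect topics and are an assumption; check `config.storage.topic`, `offset.storage.topic`, and `status.storage.topic` in the worker properties for the actual names:

```
# Sketch only: substitute your ZooKeeper host and your actual user principal.
for topic in connect-configs connect-offsets connect-status; do
  kafka-acls --authorizer-properties zookeeper.connect=<zookeeper-host>:2181 \
    --add --allow-principal User:u_rw_xxx \
    --operation Read --operation Write --operation Describe \
    --topic "$topic"
done
```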

0 Answers