
I am using Confluent 3.2 in a set of Docker containers, one of which is running a kafka-connect worker.

For reasons yet unclear to me, two of my four connectors - to be specific, hpgrahsl's MongoDB sink connector - stopped working. I was able to identify the main problem: the connectors did not have any tasks assigned, as could be seen by calling GET /connectors/{my_connector}/status. The other two connectors (of the same type) were not affected and were happily producing output.

I tried three different methods to get my connectors running again via the REST API:

  • Pausing and resuming the connectors
  • Restarting the connectors
  • Deleting and then re-creating the connector under the same name, using the same config

None of the methods worked. I finally got my connectors working again by:

  • Deleting and creating the connector under a different name, say my_connector_v2 instead of my_connector
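For reference, the steps above map onto the Kafka Connect REST API like this (a sketch; the worker address and connector name are placeholders for my setup, and my_connector_config.json stands in for the actual config file):

```shell
# Hypothetical worker address; adjust to your container's host/port.
CONNECT_URL="http://localhost:8083"
NAME="my_connector"

# Inspect connector state and its (in my case, empty) task list
curl -s "$CONNECT_URL/connectors/$NAME/status"

# 1. Pause, then resume
curl -s -X PUT "$CONNECT_URL/connectors/$NAME/pause"
curl -s -X PUT "$CONNECT_URL/connectors/$NAME/resume"

# 2. Restart the connector instance
curl -s -X POST "$CONNECT_URL/connectors/$NAME/restart"

# 3. Delete, then re-create under the same name from a config file
curl -s -X DELETE "$CONNECT_URL/connectors/$NAME"
curl -s -X POST -H "Content-Type: application/json" \
     --data @my_connector_config.json "$CONNECT_URL/connectors"
```

The workaround that finally helped was the same delete/create sequence, but with NAME changed to my_connector_v2.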

What is going on here? Why am I not able to restart my existing connector and get it to start an actual task? Is there any stale data on the kafka-connect worker or in some kafka-connect-related topic on the Kafka brokers that needs to be cleaned?

I have filed an issue on the specific connector's GitHub repo, but I feel like this might actually be a general bug related to the internals of kafka-connect. Any ideas?

    This would be specific to the connector plugin. Task assignment is the responsibility of the connector implementation so likely the conditions for a task being started weren't met. – dawsaw Apr 08 '17 at 23:52
  • Where exactly do you take this information from? I don't see any task assignment logic in the connector implementation I am using: https://github.com/hpgrahsl/kafka-connect-mongodb/blob/master/src/main/java/at/grahsl/kafka/connect/mongodb/MongoDbSinkConnector.java The connector simply creates #tasks copies of the config without instantiating any tasks. So there has to be some task starting logic inside the Kafka Connect runtime classes, am I right? – jurgispods Apr 13 '17 at 13:45

1 Answer


I have faced this issue. It can happen when there are not enough resources for a SinkTask or SourceTask to start.

The memory allocated to the worker may be too low. By default, a worker is allocated 256MB of heap. Please increase this. Below is an example allocating 2GB of heap to a worker running in distributed mode:

KAFKA_HEAP_OPTS="-Xmx2G" sh $KAFKA_SERVICE_HOME/connect-distributed $KAFKA_CONFIG_HOME/connect-avro-distributed.properties
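Since your worker runs in a Docker container, the heap can also be raised via the container environment; a sketch assuming the confluentinc/cp-kafka-connect image from Confluent 3.2, which passes KAFKA_HEAP_OPTS through to the worker JVM (service name and other settings are placeholders):

```yaml
# docker-compose fragment (illustrative; merge into your existing service definition)
services:
  kafka-connect:
    image: confluentinc/cp-kafka-connect:3.2.0
    environment:
      # Raise the worker heap from the default
      KAFKA_HEAP_OPTS: "-Xms512M -Xmx2G"
```

You can then confirm the effective setting inside the container by inspecting the worker's java process and looking for the -Xmx flag.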

  • Thanks, I will check if my issue can be tracked back to a resource problem! – jurgispods Apr 13 '17 at 08:14
  • I just checked, my Kafka Connect Docker container has 1GB of memory assigned, I don't think the resources are the cause of my task failures (5 connectors with a maximum total number of 8 tasks). But have an upvote! – jurgispods Apr 13 '17 at 13:40
  • What is the memory allocated to the worker? Use ps -ef | grep java and check the heap assigned to the worker process. The whole Docker container has 1GB, but what is allocated to the worker itself is what matters. – Renukaradhya Apr 13 '17 at 14:39
  • I was not quite clear. The worker process itself has 1GB of memory within my Docker container. I think that should be enough? – jurgispods Apr 14 '17 at 09:46
  • 1
    That may be enough. Even I am curious to know under what conditions the unassigned state can occur, because I don't want to face this in a production environment. I have posted a query in Google Groups (https://groups.google.com/forum/#!topic/confluent-platform/vHzV411ClXQ). Let us wait and get details from the Confluent guys. – Renukaradhya Apr 14 '17 at 09:51