I have a consumer group and several distinct nodes, each acting as a consumer. Each node performs a computation-intensive task. I want a consumer to join the consumer group only when it has CPU resources available. Once it has joined, it will consume a message from the topic describing the computation it needs to perform, and then start that computation. While the consumer is busy with the computation, I want it to leave the consumer group, since it has no capacity for new work. Is this possible in Kafka? Or is there a better way to achieve this? I am using the kafka-python library.
-
Maybe it is possible with Kubernetes. A colleague once told me that Kubernetes can monitor the resources of your services: for example, if most of a service's CPU is in use, or the service is down for some reason, it can start a new instance. Also, if Kubernetes spots that many of the services have unused memory, it can shut some of them down. Basically, you should provide two endpoints on those services: one that checks the condition for starting and one for shutting down. You can research this topic a little more and ask a senior Kubernetes guy for help. :) – Spasoje Petronijević Sep 05 '19 at 21:32
-
@SpasojePetronijević Actually, my major concern was how to make a consumer leave and join a consumer group. But, as you suggested, I will ask the senior Kubernetes guys for help with monitoring the CPU resources. Thanks :) – Prashantha Sep 06 '19 at 05:20
1 Answer
In general, regardless of the Kafka client, this is possible with any Kafka consumer. The way to do it is to subscribe to the topic, consume the message you want to process, acknowledge (that is, commit the offset of) only that specific message, and close the consumer. Closing the consumer makes it leave the group.
Specifically, in the kafka-python client, the method you want is KafkaConsumer.close. Make sure to set auto-commit to false (enable_auto_commit=False in kafka-python), because your poll might have fetched more messages than the one you want to compute, and you only want to acknowledge the one you're actually going to work on.
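The manual-commit approach could look roughly like the sketch below. It is only an illustration under some assumptions: the topic, group, and broker address are placeholders, make_consumer and take_one are hypothetical helper names, and the poll is capped with max_records=1 so that a plain commit() covers exactly the one record that was handed out.

```python
def make_consumer(topic, group_id, servers="localhost:9092"):
    """Create a consumer with auto-commit off (all arguments are placeholders)."""
    # Imported lazily so take_one() below can be tried without kafka-python or a broker.
    from kafka import KafkaConsumer
    return KafkaConsumer(
        topic,
        group_id=group_id,
        bootstrap_servers=servers,
        enable_auto_commit=False,
    )


def take_one(consumer, timeout_ms=5000):
    """Fetch at most one record, commit only that record, then leave the group."""
    try:
        # max_records=1 caps this single poll, so the consumer's position
        # advances past one record at most.
        batches = consumer.poll(timeout_ms=timeout_ms, max_records=1)
        records = [r for batch in batches.values() for r in batch]
        if not records:
            return None  # nothing arrived within the timeout
        # With no arguments, commit() commits the current position,
        # i.e. the offset right after the record we just took.
        consumer.commit()
        return records[0]
    finally:
        # Closing makes this consumer leave the group; autocommit=False
        # because the commit (if any) was already done explicitly above.
        consumer.close(autocommit=False)
```

A node would call something like take_one(make_consumer("tasks", "workers")) when it is idle and then run the computation on the returned record. Note that the offset is committed before the computation starts, so a crash mid-computation would not cause the message to be redelivered.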
Alternatively, you can set your consumer properties (specifically max.poll.records, which is max_poll_records in kafka-python) to fetch only 1 message per poll, and then you can use the .close method with auto-commit set to true.
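Putting this variant together with the join-only-when-idle idea from the question, one possible shape is sketched below. Everything named here is an assumption for illustration: has_spare_cpu is a hypothetical load-average check (Unix only), the threshold is arbitrary, and the topic, group, and broker address are placeholders.

```python
import os


def has_spare_cpu(max_load_per_core=0.5):
    """Hypothetical capacity check: 1-minute load average per core (Unix only)."""
    return os.getloadavg()[0] / (os.cpu_count() or 1) < max_load_per_core


def make_consumer(topic, group_id, servers="localhost:9092"):
    """Create a consumer that fetches one record per poll and auto-commits."""
    # Imported lazily so run_once() can be tried without kafka-python or a broker.
    from kafka import KafkaConsumer
    return KafkaConsumer(
        topic,
        group_id=group_id,
        bootstrap_servers=servers,
        enable_auto_commit=True,
        max_poll_records=1,
    )


def run_once(consumer_factory, compute, can_take_work=has_spare_cpu, timeout_ms=5000):
    """Join the group, take at most one task, leave, then compute outside the group."""
    if not can_take_work():
        return False  # busy: do not join the group at all
    consumer = consumer_factory()
    try:
        batches = consumer.poll(timeout_ms=timeout_ms)
        records = [r for batch in batches.values() for r in batch]
    finally:
        # With auto-commit enabled, close() commits the consumed position
        # and the consumer leaves the group.
        consumer.close()
    if not records:
        return False
    compute(records[0].value)  # the consumer is already out of the group here
    return True
```

A worker could call run_once(lambda: make_consumer("tasks", "workers"), compute) in a loop with a short sleep between attempts. Keep in mind that every join and leave triggers a group rebalance, so this pattern is likely a better fit for long-running tasks than for very short ones.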
More info on all the KafkaConsumer configuration options here: https://kafka.apache.org/documentation/#consumerconfigs
And here's a link to the official kafka-python client docs for KafkaConsumer: https://kafka-python.readthedocs.io/en/master/apidoc/KafkaConsumer.html#kafka.KafkaConsumer.close

-
Oh, so if I close the consumer, it automatically leaves the group, and when needed I can start the consumer again and have it rejoin the same group. Makes sense. Will try it out. Also, with auto-commit set to false, is there a way to acknowledge just the one message I am actually going to work on? – Prashantha Sep 06 '19 at 03:48
-
Happy to help. If this answer or any other one solved your issue, please mark it as accepted. – mjuarez Sep 06 '19 at 03:51
-