I was running Kafka Connect on two EC2 machines, so irrespective of the number of tasks, those two machines always stayed up running them, leaving the machines under-used. Recently I migrated Kafka Connect to Kubernetes and achieved good CPU/memory efficiency.
But a problem arises when Kubernetes scales down: pods are not terminated gracefully.
E.g. suppose there are 2 pods, p1 and p2. p1 is running 3 tasks (t1, t2, t3) and p2 is running 2 tasks (t4, t5), where t5 is a task of a source connector that brings data from Postgres into Kafka.
When a pod vanishes during downscaling, the tasks running on it are rebalanced onto the other pods. Suppose pod p2 vanishes. After task rebalancing, the new state of the cluster is: p1 running 5 tasks (t1, t2, t3, t4_new, t5_new).
But the logs of my source connector say that some other task (presumably t5, the task that ran on the old pod) is still running and accessing Postgres data.
How can I make sure that whenever a pod scales down, it happens gracefully, in the sense that all tasks running on the pod are stopped first?
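For reference, this is roughly what I assume the relevant part of the Deployment should look like. It is only a sketch, not my actual manifest; the container name and image are hypothetical. The idea (as I understand it) is that Kubernetes sends SIGTERM on scale-down and only escalates to SIGKILL after `terminationGracePeriodSeconds` (30s by default), so the Connect worker's JVM needs to actually receive the SIGTERM and be given enough time to run its shutdown hook and stop its tasks:

```yaml
# Sketch only: a Deployment fragment for a Kafka Connect worker.
# Container name and image are hypothetical placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-connect
spec:
  template:
    spec:
      # Allow more time than the 30s default before Kubernetes
      # follows the SIGTERM with a SIGKILL, so the worker can stop
      # its tasks and leave the group cleanly.
      terminationGracePeriodSeconds: 120
      containers:
        - name: kafka-connect          # hypothetical name
          image: my-kafka-connect:tag  # hypothetical image
          # The JVM must run as PID 1 (or behind a signal-forwarding
          # entrypoint such as exec/tini), otherwise the SIGTERM is
          # swallowed by a wrapper shell and the worker is never
          # asked to shut down gracefully.
```

Is something along these lines the right approach, or is there a Connect-side setting (e.g. a graceful task shutdown timeout in the worker config) that I should be tuning as well?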