How to slow down a Kafka consumer when there is load on the DB, and vice versa

Question

If a lot of messages are produced to Kafka and I try to consume (using the Python confluent_kafka library) and process them, the database (MySQL) quickly gets loaded with a lot of queries. I want to slow down the consumption rate based on the load on the DB. I was thinking of using time.sleep() in the consumer loop, so that I can sleep longer when the DB is under load. I would fetch the number of seconds to sleep from a Redis key, and change the value of that key when the DB load reaches a certain level, for example setting it to 30 seconds if the DB load is above 80%.

I am stuck on how I can calculate the load on the DB.

Also, if there is another way of controlling the consumption speed, please let me know.

Is there a specific reason to use Python rather than Kafka Connect JDBC sink? — OneCricketeer, Jun 03 '22 at 16:58
Instead, turn on the slowlog with a small `long_query_time`. That will help locate the "worst" queries. Then let's work on speeding them up! — Rick James, Jun 03 '22 at 20:17

score 0 · Answer 1 · answered Jun 03 '22 at 08:17

As long as you are careful to not stop consuming for too long (by default, the maximum time between poll() invocations is 5 minutes) you can slow down your consumption in any way you'd like.
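As a rough sketch of the sleep-based approach, the loop below clamps whatever delay is requested so that it stays under the poll-interval limit. It assumes the default 5-minute limit; `capped_delay`, `consume_throttled`, `get_delay` and the 30-second safety margin are hypothetical names and values chosen for illustration, and `consumer` is expected to behave like a subscribed confluent_kafka `Consumer`.

```python
import time

# Assumption: the default max.poll.interval.ms of 300000 ms (5 minutes) is in effect.
MAX_POLL_INTERVAL_S = 300.0
# Hypothetical margin that leaves room for message processing time.
SAFETY_MARGIN_S = 30.0


def capped_delay(requested_s, limit_s=MAX_POLL_INTERVAL_S, margin_s=SAFETY_MARGIN_S):
    """Clamp a requested sleep to the range [0, limit_s - margin_s]."""
    return max(0.0, min(float(requested_s), limit_s - margin_s))


def consume_throttled(consumer, handle, get_delay, max_polls=None, sleep=time.sleep):
    """Poll, process, then sleep for a clamped, externally supplied delay.

    consumer  -- object with a poll(timeout) method (e.g. confluent_kafka.Consumer)
    handle    -- callable that processes one message (e.g. writes to MySQL)
    get_delay -- callable returning the requested sleep in seconds
    max_polls -- optional bound on loop iterations (None means run forever)
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        msg = consumer.poll(1.0)
        if msg is None:
            continue  # nothing available within the timeout
        if msg.error():
            continue  # a real consumer should log or handle the error here
        handle(msg)
        delay = capped_delay(get_delay())
        if delay > 0:
            sleep(delay)
```

With the setup from the question, `get_delay` could read the Redis key, for example `lambda: float(r.get("consumer_sleep_seconds") or 0)`, where `r` is a redis-py client and the key name is made up for this example.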

You can also pause and resume the consumer as needed, instead of slowing down consumption by introducing a sleep.
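A minimal sketch of the pause/resume idea follows. It relies on the `pause()`, `resume()`, `assignment()` and `poll()` methods that confluent_kafka's `Consumer` provides; `consume_with_backpressure` and `db_overloaded` are hypothetical names, and how "overloaded" is decided is left to the caller. The loop keeps calling `poll()` while paused, so the poll-interval limit is not exceeded.

```python
def consume_with_backpressure(consumer, handle, db_overloaded,
                              max_polls=None, poll_timeout=1.0):
    """Pause fetching while the DB is overloaded, resume once it recovers.

    consumer      -- object with poll/pause/resume/assignment methods
                     (e.g. a subscribed confluent_kafka.Consumer)
    handle        -- callable that processes one message
    db_overloaded -- callable returning True while the DB should be spared
    max_polls     -- optional bound on loop iterations (None means run forever)
    """
    paused = False
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        overloaded = db_overloaded()
        if overloaded and not paused:
            # Pause every partition currently assigned to this consumer.
            # Caveat: partitions assigned by a later rebalance are not covered.
            consumer.pause(consumer.assignment())
            paused = True
        elif not overloaded and paused:
            consumer.resume(consumer.assignment())
            paused = False

        # Keep polling even while paused so the consumer stays in the group.
        msg = consumer.poll(poll_timeout)
        if msg is None:
            continue
        if msg.error():
            continue  # a real consumer should log or handle the error here
        handle(msg)
```

Compared with sleeping, this keeps the loop responsive: the check runs on every iteration, so consumption picks up again within roughly one `poll_timeout` of the DB recovering.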

A different approach to organically limit the load on your database is to limit the number of concurrent connections that your DB client can make, and adjust that limit to keep a constant, controlled load.
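One way to get that effect on the client side, sketched here with the standard library only, is to cap the number of in-flight DB writes with a semaphore. `BoundedDbWriter`, `write_fn` and `max_concurrent` are hypothetical names; `write_fn` stands in for whatever function runs the MySQL queries for one record. A connection pool with a fixed size would serve the same purpose.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedDbWriter:
    """Run write_fn on a thread pool with at most max_concurrent writes in flight."""

    def __init__(self, write_fn, max_concurrent=4):
        self._write_fn = write_fn
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._pool = ThreadPoolExecutor(max_workers=max_concurrent)

    def submit(self, record):
        # Blocks the caller (the consumer loop) while all slots are busy,
        # which slows consumption down to what the DB can absorb.
        self._slots.acquire()
        try:
            future = self._pool.submit(self._write_fn, record)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot once the write has finished, whether it succeeded or failed.
        future.add_done_callback(lambda _f: self._slots.release())
        return future

    def close(self):
        # Wait for outstanding writes before shutting down.
        self._pool.shutdown(wait=True)
```

In a consumer loop, each polled message would be passed to `writer.submit(...)`. Because `submit` can block, the time spent waiting still counts toward the interval between `poll()` calls mentioned above.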
