Alpakka/Kafka - Partitions consumed faster than others

Question

I’ve been using Alpakka Kafka to stream data from Kafka topics. I’m using:

Consumer
      .committableSource(consumerSettings, Subscriptions.topics(topic))

Recently I tried to spawn more consumers, 3 of them, on a topic that has 15 partitions. When I add more consumers with the same group id, the partitions are split evenly, 5 per consumer, but the consumers do not seem to consume all partitions at the same time. It looks as if they read partitions one by one, or read a specific partition much faster than the others.

|Partition|LogSize  |Consumer Offset|Lag      |
|0        |8,429,145|      6,087,144|2,342,001|
|1        |8,424,948|      6,223,257|2,201,691|
|2        |8,428,121|      7,764,854|  663,267|
|3        |8,421,528|      6,071,425|2,350,103|
|4        |8,434,659|      7,351,552|1,083,107|
|5        |8,428,323|      5,935,336|2,492,987|
|6        |8,424,974|      6,455,301|1,969,673|
|7        |8,431,820|      7,763,984|  667,836|
|8        |8,425,999|      6,370,962|2,055,037|
|9        |8,416,354|      6,681,093|1,735,261|
|10       |8,416,217|      6,814,949|1,601,268|
|11       |8,428,026|      5,878,703|2,549,323|
|12       |8,424,604|      8,424,589|       15|
|13       |8,431,019|      8,431,019|        0|
|14       |8,423,218|      8,423,218|        0|

Here is a real example of a production application I’m running. So I have some questions:

Is it ok to read some partitions much faster than others?

Please, note that this behavior only happens when I start more than one consumer.

Should I change the way I’m consuming? Should I use source per partition, or is there a different option?
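
For reference, the "source per partition" option mentioned above could look roughly like the sketch below. It assumes Alpakka Kafka 1.0 or later (for `Committer.sink`) on Akka 2.6+, and String keys and values; `process`, `PerPartitionConsumer` and `maxPartitions` are hypothetical names used only for illustration. Whether this evens out the lag depends on what is actually causing it.

```scala
import akka.actor.ActorSystem
import akka.kafka.{CommitterSettings, ConsumerSettings, Subscriptions}
import akka.kafka.scaladsl.{Committer, Consumer}
import akka.stream.scaladsl.{Keep, Sink}
import org.apache.kafka.clients.consumer.ConsumerRecord
import scala.concurrent.Future

object PerPartitionConsumer {
  // Hypothetical placeholder for the application's own processing step.
  def process(record: ConsumerRecord[String, String]): Future[Unit] =
    Future.successful(())

  def run(
      consumerSettings: ConsumerSettings[String, String],
      topic: String,
      maxPartitions: Int // assumption: at least the number of partitions assigned to this consumer
  )(implicit system: ActorSystem): Consumer.Control = {
    import system.dispatcher
    val committerSettings = CommitterSettings(system)

    Consumer
      // Emits one (TopicPartition, Source) pair per assigned partition.
      .committablePartitionedSource(consumerSettings, Subscriptions.topics(topic))
      .mapAsyncUnordered(maxPartitions) { case (_, partitionSource) =>
        // Each partition runs as its own stream, so a slow partition
        // only backpressures itself.
        partitionSource
          .mapAsync(1)(msg => process(msg.record).map(_ => msg.committableOffset))
          .runWith(Committer.sink(committerSettings))
      }
      .toMat(Sink.ignore)(Keep.left)
      .run()
  }
}
```

If `maxPartitions` is smaller than the number of partitions assigned to the consumer, the extra partition sources are not started until a running one completes, so it should be set at least as high as the assignment (for example 15 for a single consumer on this topic).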

Update

I suspected that this only happened when running more than one consumer (that is, more than one application instance), but it happened today with only one consumer. You can see this by looking at the consumer group, which is the same.

At the time it happened, I had 20 million messages still waiting to be processed (lag). The table above is a snapshot taken from the Kafka Manager we use at the company.

How long does it take for the other partition to catch up? Some inconsistency is expected as the rate of consumption depends on thread scheduling. — dvim, Sep 25 '18 at 23:54
I would say 30m, 1h, it could vary. Could this problem be related to a bad producer? — Thiago Pereira, Sep 26 '18 at 03:09
This is a snapshot topic (compaction enabled). The producer writes to a normal (delete) topic and a kafka-streams application replicates its data to the compacted topic. I wonder if this replication is causing this problem on the consumer side. — Thiago Pereira, Sep 26 '18 at 17:23

score 0 · Accepted Answer · answered Jul 17 '19 at 09:30

We solved this problem by removing one of our components, which replicated messages from one topic to another.

Essentially, producers were writing to a topic and this component replicated those messages to another topic with compaction enabled, keeping the last state for a given id. It turned out that this component wasn't working properly, and consumers attached to the compacted topic were having issues.

So, in the end, we let the producers that needed a compacted topic write to it directly instead.
