I’ve been using Alpakka Kafka to stream data from Kafka topics. I’m using:

Consumer
      .committableSource(consumerSettings, Subscriptions.topics(topic))

Recently I tried to spawn more consumers, for example 3, on a topic that has 15 partitions. When I add more consumers with the same group id, the partitions are split evenly, 5 per consumer, but they do not seem to be consumed at the same time. It looks like the partitions are read one by one, or that a specific partition is read much faster than the others.

|Partition|LogSize  |Consumer Offset|Lag      |
|0        |8,429,145|      6,087,144|2,342,001|
|1        |8,424,948|      6,223,257|2,201,691|
|2        |8,428,121|      7,764,854|  663,267|
|3        |8,421,528|      6,071,425|2,350,103|
|4        |8,434,659|      7,351,552|1,083,107|
|5        |8,428,323|      5,935,336|2,492,987|
|6        |8,424,974|      6,455,301|1,969,673|
|7        |8,431,820|      7,763,984|  667,836|
|8        |8,425,999|      6,370,962|2,055,037|
|9        |8,416,354|      6,681,093|1,735,261|
|10       |8,416,217|      6,814,949|1,601,268|
|11       |8,428,026|      5,878,703|2,549,323|
|12       |8,424,604|      8,424,589|       15|
|13       |8,431,019|      8,431,019|        0|
|14       |8,423,218|      8,423,218|        0|
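
For readers checking the numbers: the Lag column is LogSize minus Consumer Offset. A minimal sketch of that arithmetic, using three rows copied from the table above (the `PartitionStat` and `LagExample` names are made up for illustration):

```scala
// Lag per partition = log size (log end offset) minus the committed consumer offset.
final case class PartitionStat(partition: Int, logSize: Long, consumerOffset: Long) {
  def lag: Long = logSize - consumerOffset
}

object LagExample {
  // Rows for partitions 0, 2 and 12, copied from the table.
  val stats: Seq[PartitionStat] = Seq(
    PartitionStat(0, 8429145L, 6087144L),
    PartitionStat(2, 8428121L, 7764854L),
    PartitionStat(12, 8424604L, 8424589L)
  )

  def main(args: Array[String]): Unit =
    stats.foreach(s => println(s"partition ${s.partition}: lag ${s.lag}"))
}
```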

The table above is a real example from a production application I’m running. So I have some questions:

Is it ok to read some partitions much faster than others?

Please note that this behavior only happens when I start more than one consumer.

Should I change the way I’m consuming? Should I use source per partition, or is there a different option?
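
For reference, the source-per-partition option I mean would look roughly like the sketch below. It assumes Alpakka Kafka 2.x on Akka 2.6 (where `Committer.sink` is available and the implicit `ActorSystem` provides the materializer); the broker address, group id, topic name and the `process` function are placeholders, not my real setup.

```scala
import akka.Done
import akka.actor.ActorSystem
import akka.kafka.{CommitterSettings, ConsumerSettings, Subscriptions}
import akka.kafka.scaladsl.{Committer, Consumer}
import akka.stream.scaladsl.Sink
import org.apache.kafka.common.serialization.StringDeserializer

import scala.concurrent.Future

object PerPartitionConsumer extends App {
  implicit val system: ActorSystem = ActorSystem("per-partition-consumer")
  import system.dispatcher

  // Placeholder connection settings (assumptions for this sketch).
  val consumerSettings =
    ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
      .withBootstrapServers("localhost:9092")
      .withGroupId("example-group")

  val committerSettings = CommitterSettings(system)

  // Upper bound on the partitions this instance may be assigned.
  val maxPartitions = 15

  // Hypothetical business logic; replace with the real processing.
  def process(value: String): Future[Done] = Future.successful(Done)

  // Each assigned partition is emitted as its own sub-source and run as an
  // independent stream, so a slow partition does not hold back the others.
  val control: Consumer.Control =
    Consumer
      .committablePartitionedSource(consumerSettings, Subscriptions.topics("example-topic"))
      .mapAsyncUnordered(maxPartitions) { case (_, partitionSource) =>
        partitionSource
          .mapAsync(1)(msg => process(msg.record.value).map(_ => msg.committableOffset))
          .runWith(Committer.sink(committerSettings))
      }
      .to(Sink.ignore)
      .run()
}
```

With `mapAsync(1)` ordering inside a partition is kept, while up to `maxPartitions` partitions are processed concurrently.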

Update

I suspected this only happened when running more than one consumer (that is, more than one application instance), but it happened today with a single consumer. You can see this by looking at the consumer group, which is the same.

[Image: screenshot of the consumer group in Kafka Manager]

At the time it happened, I had 20 million messages still waiting to be processed (lag). The picture above was taken from the Kafka manager we have at the company.

Thiago Pereira
  • How long does it take for the other partition to catch up? Some inconsistency is expected as the rate of consumption depends on thread scheduling. – dvim Sep 25 '18 at 23:54
  • I would say 30m, 1h, it could vary. Could this problem be related to a bad producer? – Thiago Pereira Sep 26 '18 at 03:09
  • How many producers are there? – dvim Sep 26 '18 at 13:40
  • This is a snapshot topic (compaction enabled). The producer writes to a normal (delete) topic and a kafka-streams application replicates its data to the compacted topic. I wonder if this replication is causing the problem on the consumer side. – Thiago Pereira Sep 26 '18 at 17:23

1 Answer

We solved this problem by removing one of our components, which replicated messages from one topic to another.

Essentially, producers were writing to a topic and this component replicated those messages to another topic with compaction enabled, keeping the last state for a given id. It turned out that this component wasn't working properly, and consumers attached to the compacted topic were having issues.

So, in the end, whoever needs a compacted topic now has the producers write to it directly.

Thiago Pereira