
We increased the number of partitions on a topic so we could process messages in parallel, since the message throughput was high. As soon as we increased the partition count, all the stream threads subscribed to that topic died. After we changed the consumer group id and restarted the application, it worked fine.

I know that the number of partitions of the application's changelog topic should be the same as that of the source topic. I would like to know the reason behind this.

I saw this link - https://issues.apache.org/jira/browse/KAFKA-6063?jql=project%20%3D%20KAFKA%20AND%20component%20%3D%20streams%20AND%20text%20~%20%22partition%22

but couldn't find the reason there.

https://github.com/apache/kafka/blob/fdc742b1ade420682911b3e336ae04827639cc04/streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java#L122

Basically, I want to know the reason behind this if condition.

kartik7153
  • Have you checked whether new threads were started? IIUC, repartitioning should trigger a [partition grouper](https://kafka.apache.org/10/documentation/streams/developer-guide/config-streams.html#streams-developer-guide-partition-grouper) which reassigns stream tasks to topics. In this case, you might be able to reuse your existing threads by providing a custom partition grouper. – daniu Feb 12 '19 at 12:52
  • @daniu No, it is not creating new threads. It raises an exception and the thread dies – kartik7153 Feb 12 '19 at 12:59

2 Answers


Input topic partitions define the level of parallelism, and if you have stateful operations like aggregation or join, the state of those operations is sharded. If you have X input topic partitions, you get X tasks, each with one state shard. Furthermore, state is backed by a changelog topic in Kafka with X partitions, and each shard uses exactly one of those partitions.

If you change the number of input topic partitions to X+1, Kafka Streams tries to create X+1 tasks, each needing its own store shard; however, the existing changelog topic has only X partitions. Thus, the whole partitioning of your application breaks, and Kafka Streams cannot guarantee correct processing, so it shuts down with an error.
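The check the question links to in `InternalTopicManager` can be sketched in simplified form; the names and message here are illustrative, not the actual Kafka internals:

```java
// Simplified sketch of the partition-count check in InternalTopicManager
// (illustrative only; the real implementation differs).
public class PartitionCountCheck {

    // True when the existing internal topic's partition count matches
    // the count derived from the input topic.
    static boolean matches(int expectedPartitions, int existingPartitions) {
        return expectedPartitions == existingPartitions;
    }

    static void validate(int expectedPartitions, int existingPartitions) {
        if (!matches(expectedPartitions, existingPartitions)) {
            throw new IllegalStateException(
                "Existing internal topic has " + existingPartitions
                + " partitions but " + expectedPartitions + " are expected");
        }
    }

    public static void main(String[] args) {
        validate(3, 3); // input topic and changelog both have 3 partitions: OK
        try {
            // input topic grew to 4 partitions, changelog still has 3: fails
            validate(4, 3);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
            // prints: Existing internal topic has 3 partitions but 4 are expected
        }
    }
}
```

The changelog topic cannot simply be grown to match, because the existing state records were already written to shards under the old partitioning.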

Also note that Kafka Streams assumes that input data is partitioned by key. If you change the number of input topic partitions, the hash-based assignment of keys to partitions changes, which may result in incorrect output, too.
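The key-distribution issue can be illustrated with a minimal sketch, using `String.hashCode()` as a stand-in for Kafka's murmur2-based default partitioner; the principle (hash of the key modulo the partition count) is the same:

```java
// Illustrates why changing the partition count breaks key-based partitioning.
// String.hashCode() is a simplified stand-in for Kafka's default partitioner,
// which applies murmur2 to the serialized key bytes.
public class KeyPartitioning {

    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        String[] keys = {"user-1", "user-2", "user-3"};
        for (String key : keys) {
            System.out.println(key
                + " -> partition " + partitionFor(key, 3) + " of 3"
                + ", partition " + partitionFor(key, 4) + " of 4");
        }
        // Records for the same key may land on a different partition after
        // the count changes, so per-key ordering and the key-to-shard
        // mapping of existing state no longer line up.
    }
}
```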

In general, it's recommended to over-partition topics in the beginning to avoid this issue. If you really need to scale out, it is best to create a new topic with the new number of partitions and start a copy of the application (with a new application ID) in parallel. Afterwards, you update your upstream producer applications to write into the new topic, and finally shut down the old application.
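A hypothetical sketch of that migration using the standard `kafka-topics.sh` CLI; topic name, partition count, and broker address are placeholders:

```shell
# 1. Create a new topic with the desired partition count (names are placeholders).
bin/kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic input-v2 --partitions 8

# 2. Start a copy of the application configured with a new application.id
#    (application.id determines the consumer group and internal topic names,
#    so a fresh id gives the copy its own changelog topics with 8 partitions).

# 3. Switch upstream producers over to input-v2, then shut down the old app.
```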

Matthias J. Sax
  • So there is no other way apart from resetting consumer group ids with application reset tool? – kartik7153 Feb 13 '19 at 05:24
  • Exactly. It's a very hard problem, because state is partitioned, and repartitioning state is not trivial at all. – Matthias J. Sax Feb 13 '19 at 17:45
  • We have around 100 consumers consuming from the same topic. Doing this for every consumer is a very tedious task. – kartik7153 Feb 14 '19 at 04:57
  • Yes. It's recommended to over-provision topics to avoid this situation... – Matthias J. Sax Feb 14 '19 at 07:43
  • https://stackoverflow.com/questions/60456751/kafka-producer-how-to-change-a-topic-without-down-time-and-preserving-message Please, how can one create a new topic and make the producer write messages to it without downtime and with correct message ordering? Of course, if you find time and have some ideas. – Yan Khonski Feb 28 '20 at 18:30

If stateful clients fail due to this, then delete the changelog topic and the local state store.
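This can be done with the application reset tool mentioned in the comments above; a hedged sketch (application id, topic, and state directory are placeholders, and flag names may vary by Kafka version):

```shell
# The reset tool deletes the application's internal (changelog/repartition)
# topics and resets the input-topic offsets for its consumer group.
bin/kafka-streams-application-reset.sh \
  --bootstrap-servers localhost:9092 \
  --application-id my-streams-app \
  --input-topics my-input-topic

# Remove the local state stores as well (state.dir defaults to a directory
# under /tmp/kafka-streams), or call KafkaStreams#cleanUp() before starting.
rm -rf /tmp/kafka-streams/my-streams-app
```

Note that this discards the accumulated state, so the application rebuilds it from the input topic on restart.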

Valath