120

I have a Kafka cluster running with 2 partitions, and I was looking for a way to increase the partition count to 3 without losing the existing messages on the topic. I tried stopping Kafka, modifying the server.properties file to increase the number of partitions to 3, and restarting Kafka. However, that does not seem to change anything: using the Kafka ConsumerOffsetChecker, I still see it using only 2 partitions. The Kafka version I am using is 0.8.2.2. In version 0.8.1, there used to be a script called kafka-add-partitions.sh, which I guess might do the trick, but I don't see any such script in 0.8.2.

  • Is there any way of accomplishing this?

I did experiment with creating a whole new topic, and that one does use 3 partitions, as per the change in the server.properties file. For existing topics, however, the change doesn't seem to apply.

OneCricketeer
Asif Iqbal
  • The latest release of Apache Kafka is 0.8.2.2; I doubt you're using "2.10". You may want to check versions again. – C4stor Nov 12 '15 at 20:17
  • @C4stor Actually I meant Kafka that is based on Scala 2.10, which is exactly the version you mentioned, 0.8.2.2. Sorry for the confusion. I will edit my question. – Asif Iqbal Nov 12 '15 at 20:31

7 Answers

173

Looks like you can use this script instead:

bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic my_topic_name \
    --partitions 40

In the code it looks like they do the same thing:

 AdminUtils.createOrUpdateTopicPartitionAssignmentPathInZK(topic, partitionReplicaList, zkClient, true)

kafka-topics.sh executes this piece of code, as does the AddPartitionsCommand used by the kafka-add-partitions script.

However, you have to be aware of re-partitioning when using keys:

Be aware that one use case for partitions is to semantically partition data, and adding partitions doesn't change the partitioning of existing data so this may disturb consumers if they rely on that partition. That is if data is partitioned by hash(key) % number_of_partitions then this partitioning will potentially be shuffled by adding partitions but Kafka will not attempt to automatically redistribute data in any way.
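
To make that caveat concrete, here is a small sketch of the modulo step. It is not Kafka's actual DefaultPartitioner (which murmur2-hashes the serialized key bytes); it only illustrates how the key-to-partition mapping shifts once the partition count changes, which is why keyed consumers can suddenly see "their" keys on a different partition:

import java.util.List;

public class PartitionShiftDemo {
    // Toy stand-in for Kafka's partitioner: hash the key, mask the sign bit so
    // the result is non-negative, then take it modulo the partition count.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        for (String key : List.of("order-1", "order-2", "order-3")) {
            System.out.printf("%s: 2 partitions -> p%d, 3 partitions -> p%d%n",
                    key, partitionFor(key, 2), partitionFor(key, 3));
        }
    }
}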

blockR
  • If the data has to be repartitioned, is there a way to only move over messages that haven't been read and disregard messages that have been read? – Glide Sep 21 '16 at 13:33
  • In line with the 'append-only' philosophy, I'd imagine you'd have to go to great lengths to achieve this. I'd say the simplest is to halt consumption on that topic, create a new topic with the number of partitions you want, republish the unread messages onto the new topic and then continue consumption off the new topic. – CmdrDats Feb 21 '18 at 10:22
  • @CmdrDats, would you mind letting me know if there are any improvements in this area, or do we still address this using the "republish" method you suggested above? – Nag Jun 03 '20 at 16:01
24

For anyone who wants a solution for newer Kafka versions, please follow this method.

Kafka's data retention and transfer behaviour depends on partitions, so be careful about the effects of increasing the partition count (newer Kafka versions display a warning regarding this). Try to avoid a configuration in which one broker has too many leader partitions.

There is a simple 3-step approach to this.

Step 1: Increase the partitions in topics

./bin/kafka-topics.sh --bootstrap-server localhost:9092 --alter --topic testKafka_5 --partitions 6

Step 2: Create a partition reassignment JSON file for the given topic

{ "version":1, "partitions":[ {"topic":"testKafka_5","partition":0,"replicas":[0,1,2]}, {"topic":"testKafka_5","partition":1,"replicas":[2,1,0]}, {"topic":"testKafka_5","partition":2,"replicas":[1,2,0]}, {"topic":"testKafka_5","partition":3,"replicas":[0,1,2]}, {"topic":"testKafka_5","partition":4,"replicas":[2,1,0]}, {"topic":"testKafka_5","partition":5,"replicas":[1,2,0]} ]}

Create the file with the new partitions and replicas. It's better to spread replicas across different brokers, but they should be present within the same cluster. Take latency into consideration for distant replicas. Transfer the file to your Kafka machine.

Step 3: Reassign partitions and verify

./bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --reassignment-json-file bin/increase-replication-factor.json  --execute

./bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --reassignment-json-file bin/increase-replication-factor.json --verify

You can check the effects of your change using the --describe command.
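
If you would rather check programmatically than via the kafka-topics.sh --describe command, a minimal AdminClient sketch could look like this (the topic name and bootstrap server are placeholders taken from the example above):

import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class CheckPartitionCount {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // Fetch the topic metadata and print the current partition count
            TopicDescription description = admin.describeTopics(List.of("testKafka_5"))
                    .all().get().get("testKafka_5");
            System.out.println("Partitions: " + description.partitions().size());
        }
    }
}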

andreagalle
c0der512
11

If you are using Kafka on Windows, try this command to alter or add partitions to a topic:

.\bin\windows\kafka-topics.bat --alter --zookeeper localhost:2181 --topic TopicName --partitions 20

or

.\bin\windows\kafka-topics.bat --alter --zookeeper localhost:2181 --topic TopicName --replica-assignment 0:1:2,0:1:2,0:1:2,2:1:0 --partitions 10

Paulo Merson
Aarya
10

I think this question is a bit old, but I will still answer it.

If you have a Kafka topic but want to change the number of partitions or replicas, you can use a streaming transformation to automatically stream all the messages from the original topic into a new Kafka topic which has the desired number of partitions or replicas.
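
As a rough sketch of that idea (the topic names, application id and bootstrap server are placeholders, and it assumes the new topic has already been created with the desired partition count and replication factor), a minimal Kafka Streams application that copies every record from the old topic into the new one could look like this:

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class CopyTopic {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "copy-old-topic-to-new-topic");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.ByteArraySerde.class);
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.ByteArraySerde.class);
        // Start from the beginning of the old topic so existing messages are copied too
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        StreamsBuilder builder = new StreamsBuilder();
        // Read every record (key included) from the old topic and write it to the
        // new topic; the producer re-hashes keys across the new partition count.
        builder.stream("old_topic").to("new_topic");

        new KafkaStreams(builder.build(), props).start();
    }
}

Once the copy has caught up, consumers can be switched over to the new topic.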

BERGUIGA Mohamed Amine
7

In my case, the value zk_host:port/chroot for the --zookeeper parameter threw the following exception:

ERROR java.lang.IllegalArgumentException: Topic my_topic_name does not exist on ZK path zk_host:port/chroot.

So, I tried the following and it worked:

 bin/kafka-topics.sh --alter --zookeeper zk_host:port --topic my_topic_name --partitions 10
DaveyDaveDave
Chandan Kumar
  • `chroot` is an *optional* configuration setting, not meant to be taken literally – OneCricketeer Nov 16 '17 at 01:28
  • Just make sure that all replicas/brokers are up. Otherwise, it will throw this error: ERROR org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 2 larger than available brokers: 1. – codebased Oct 22 '19 at 07:29
7

In kafka_2.13-3.2.0, this worked for me:

/bin/kafka-topics.sh --bootstrap-server localhost:9092 --alter --topic apache_event_log_topic --partitions 4
Ali Ait-Bachir
-3

Code to increase the Kafka partition count in Spring Boot using AdminClient:

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewPartitions;

public void updatePartitionCount(Topic topic, AdminClient adminClient) throws Exception {
    Map<String, NewPartitions> newPartitions = new HashMap<>();
    newPartitions.put(topic.getName(), NewPartitions.increaseTo(5));
    // Block until the broker has applied (or rejected) the new partition count
    adminClient.createPartitions(newPartitions).all().get();
    System.out.println("Partition count updated for " + topic.getName());
}
  • This won't work with Kafka 0.8 asked in the question – OneCricketeer Apr 26 '22 at 13:33
  • @OneCricketeer I was searching for the code to update the count; I found the command everywhere, but this code is working for me, so I posted it to help others. – Chitra Singh Apr 26 '22 at 16:49
  • That's fine, but did you actually verify this works against a Kafka 0.8 environment? Because the AdminClient API didn't exist in that version of Kafka. In that version you [had to use `AdminUtils` (based on Zookeeper)](https://stackoverflow.com/a/33679300/2308683) – OneCricketeer Apr 26 '22 at 17:44
  • You are correct, thanks for the update. – Chitra Singh Apr 27 '22 at 07:28
  • @OneCricketeer How can we be sure whether the Kafka partition count was updated or not? If it wasn't updated, we need to reverse the changes in the DB as well. – Chitra Singh May 02 '22 at 14:58
  • Please create a new post rather than adding comments unrelated to the answer here. But databases aren't related to Kafka changes, nor will they know about them until a client metadata refresh, assuming you're referring to Kafka Connect. – OneCricketeer May 02 '22 at 15:34