I have a Kafka topic (test-topic) with 3 partitions and a set of messages whose key can take only 3 distinct values. I want each message to go to a separate partition based on the value of its key.

import json

from kafka import KafkaProducer
from kafka.partitioner import DefaultPartitioner

messages = [{"partition_key": "k1", "x": 1},
            {"partition_key": "k2", "x": 2},
            {"partition_key": "k3", "x": 3},
            {"partition_key": "k1", "x": 4},
            {"partition_key": "k2", "x": 5}]

partitioner = DefaultPartitioner()
all_partitions = list(range(100))
available = all_partitions
dataPartitioner = partitioner(b'partition_key', all_partitions, available)

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
    partitioner=dataPartitioner)

for m in messages:
    producer.send('test-topic', m)
producer.flush()

In the above code, I want messages whose partition_key value is the same to go to the same partition.

Devaraj Phukan

1 Answer

You need to write your own implementation of the Partitioner interface and pass that class to the KafkaProducer when you initialise it.

For example, with the Java client:

import java.util.Properties;

private static Properties createProducerConfig(String brokers) {
    Properties props = new Properties();
    props.put("bootstrap.servers", brokers);
    // more properties
    // fully qualified name of the class that implements Partitioner
    props.put("partitioner.class", "com.app.KafkaUserCustomPartitioner");
    return props;
}
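
Since the question uses kafka-python rather than the Java client, here is a rough sketch of the same idea in Python. In kafka-python the partitioner option takes a callable, which the producer calls with the serialized key, the list of all partition ids, and the list of available ones. The names KEY_TO_PARTITION, key_partitioner and send_all are hypothetical, and the key-to-partition numbers are an assumption made for illustration. The sketch also assumes the key is passed explicitly to send(), which the code in the question does not do.

```python
import json

# Hypothetical fixed mapping: the keys come from the question,
# the partition indices are an assumption.
KEY_TO_PARTITION = {b"k1": 0, b"k2": 1, b"k3": 2}


def key_partitioner(key_bytes, all_partitions, available_partitions):
    """Pick a partition for a serialized key.

    Raises KeyError for a key that is not in KEY_TO_PARTITION.
    """
    index = KEY_TO_PARTITION[key_bytes]
    # Wrap around in case the topic has fewer partitions than keys.
    return all_partitions[index % len(all_partitions)]


def send_all(messages, topic="test-topic", servers="localhost:9092"):
    # Imported here so key_partitioner can be tried out without a broker.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=servers,
        key_serializer=lambda k: k.encode("utf-8"),
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
        partitioner=key_partitioner,  # the callable itself, not its result
    )
    for m in messages:
        # The partitioner only sees the key, so it must be passed to send().
        producer.send(topic, key=m["partition_key"], value=m)
    producer.flush()
```

Calling send_all(messages) with the list from the question should then route k1, k2 and k3 to partitions 0, 1 and 2 of a 3-partition topic. If only "same key, same partition" is needed, passing key= to send() with the default partitioner is enough, because it hashes the key; the explicit mapping matters only when each key must get its own partition.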
hoodakaushal