I have a Kafka topic (test-topic) with 3 partitions and a set of messages, each containing a key that can take only 3 distinct values. I want these messages to go to separate partitions based on that value.
import json

from kafka import KafkaProducer
from kafka.partitioner import DefaultPartitioner
messages = [
    {"partition_key": "k1", "x": 1},
    {"partition_key": "k2", "x": 2},
    {"partition_key": "k3", "x": 3},
    {"partition_key": "k1", "x": 4},
    {"partition_key": "k2", "x": 5},
]
partitioner = DefaultPartitioner()
all_partitions = list(range(100))
available = all_partitions
dataPartitioner = partitioner(b'partition_key', all_partitions, available)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
    partitioner=dataPartitioner,
)
for m in messages:
    producer.send('test-topic', m)
producer.flush()
In the code above, I want all messages with the same partition_key value to go to the same partition.
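
For reference, here is a minimal sketch of one way this could be approached, not a definitive implementation. It assumes kafka-python calls the `partitioner` option as `partitioner(key_bytes, all_partitions, available_partitions)` for each message, that each message is sent with `key=` so the partitioner actually receives the key, and that `all_partitions` holds the topic's partition ids in ascending order. The names `KEY_TO_PARTITION`, `key_partitioner`, and `send_all` are hypothetical and not part of the original code. Simply passing `key=` with the default partitioner should also keep equal keys together, since it hashes the key, but hashing alone does not guarantee that three different keys land on three different partitions, which is why the sketch uses an explicit mapping.

```python
import json

# Hypothetical fixed mapping: each known key value gets its own partition index.
KEY_TO_PARTITION = {b"k1": 0, b"k2": 1, b"k3": 2}

messages = [
    {"partition_key": "k1", "x": 1},
    {"partition_key": "k2", "x": 2},
    {"partition_key": "k3", "x": 3},
    {"partition_key": "k1", "x": 4},
    {"partition_key": "k2", "x": 5},
]


def key_partitioner(key_bytes, all_partitions, available_partitions):
    """Pick a partition from the serialized key.

    Assumed to be called once per message with the serialized key.
    Unknown keys raise KeyError in this sketch.
    """
    index = KEY_TO_PARTITION[key_bytes]
    # Wrap around in case the topic has fewer partitions than mapped keys.
    return all_partitions[index % len(all_partitions)]


def send_all(msgs, bootstrap_servers="localhost:9092"):
    """Send msgs to test-topic; needs kafka-python and a running broker."""
    # Imported here so the mapping above can be tried without a broker.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        key_serializer=lambda k: k.encode("utf-8"),
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
        partitioner=key_partitioner,  # the callable itself, not its result
    )
    for m in msgs:
        # Passing key= is what makes the key available to the partitioner.
        producer.send("test-topic", key=m["partition_key"], value=m)
    producer.flush()
```

The mapping can be checked without a broker by calling `key_partitioner(b"k1", [0, 1, 2], [0, 1, 2])` directly; `send_all(messages)` would then do the actual publishing.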