
I have a Kafka Streams application that is configured with either

props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, MyPartitioner.class);

or

props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, RoundRobinPartitioner.class);

Both are partitioner classes meant to distribute messages across different partitions even when the messages share the same key. I am using Kafka 2.4.

RoundRobinPartitioner has this implementation:

public class RoundRobinPartitioner implements Partitioner {
    private final ConcurrentMap<String, AtomicInteger> topicCounterMap = new ConcurrentHashMap();

    public RoundRobinPartitioner() {
    }

    public void configure(Map<String, ?> configs) {
    }

    public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        int numPartitions = partitions.size();
        int nextValue = this.nextValue(topic);
        List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
        if (!availablePartitions.isEmpty()) {
            int part = Utils.toPositive(nextValue) % availablePartitions.size();
            return ((PartitionInfo)availablePartitions.get(part)).partition();
        } else {
            return Utils.toPositive(nextValue) % numPartitions;
        }
    }

    private int nextValue(String topic) {
        AtomicInteger counter = (AtomicInteger)this.topicCounterMap.computeIfAbsent(topic, (k) -> {
            return new AtomicInteger(0);
        });
        return counter.getAndIncrement();
    }

    public void close() {
    }
}

MyPartitioner consists of exactly the same code except for the partition method, which I implemented like this:

    public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);

        int numPartitions = partitions.size();

        int nextValue = nextValue(topic);

        return Utils.toPositive(nextValue) % numPartitions;

    }

With either configuration, messages are distributed to different partitions, but some partitions are never used.

My internal topic

I have 50 partitions, and partitions 14 and 34 never receive a message. These partitions are not unavailable; they are available. When I hard-code the partition method to return 14 or 34, all my messages go to that partition. What could be the problem? Neither implementation works as expected.

Edit 1: I have tried RoundRobinPartitioner with a plain producer. The result is the same: the producer does not distribute messages equally among partitions, and some partitions are never used. What could be the reason? It does not look like a missing configuration.

Edit 2: I debugged RoundRobinPartitioner with a breakpoint at the return statement. When I produce just 1 message, the producer invokes partition() twice. The first attempt is always unsuccessful, and the message does not go to any partition. When I hit continue in the debugger, the counter in the ConcurrentMap increases by 1. The second attempt is successful.

So partition() is invoked an extra time from somewhere that I have not been able to find yet.
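To see why an extra partition() call would leave partitions unused, here is a minimal, self-contained sketch. It does not use the Kafka client at all: the class RoundRobinSkipDemo and the method usedPartitions are hypothetical names, and the code only simulates the counter logic from nextValue() under the assumption (taken from the debugging session above) that partition() runs more than once per record and only the last result is used. In the real client the extra call may not happen for every record, so the exact set of skipped partitions can differ.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simulation, not Kafka code: mimics the per-topic counter
// used by the round-robin partition() method shown above.
public class RoundRobinSkipDemo {

    // Returns the set of partitions that actually receive a record when
    // the partitioning logic runs callsPerRecord times for each record and
    // only the result of the last call is used.
    public static Set<Integer> usedPartitions(int numPartitions, int callsPerRecord, int records) {
        AtomicInteger counter = new AtomicInteger(0);
        Set<Integer> used = new TreeSet<>();
        for (int r = 0; r < records; r++) {
            int chosen = -1;
            for (int c = 0; c < callsPerRecord; c++) {
                // Same arithmetic as Utils.toPositive(nextValue) % numPartitions
                chosen = (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
            }
            used.add(chosen);
        }
        return used;
    }

    public static void main(String[] args) {
        // One call per record: every partition is eventually used.
        System.out.println("1 call per record:  " + usedPartitions(50, 1, 1000));
        // Two calls per record: every other counter value is thrown away.
        System.out.println("2 calls per record: " + usedPartitions(50, 2, 1000));
    }
}
```

In this simulation a wasted call advances the counter without sending anything, so the partition that call selected is skipped for that round. If the wasted calls line up with the same partitions repeatedly, those partitions never receive a message.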

Edit 3: Could this be related to the onNewBatch method, which I did not override?

Edit 4: This implementation works with Kafka client 2.2 but not with 2.4. In 2.2 the Partitioner interface does not have an onNewBatch method. The DefaultPartitioner behavior for null keys also changed between 2.2 and 2.4. Could this be related to sticky partitioning?

Alpcan Yıldız

1 Answer


Use UniformStickyPartitioner.class with the Kafka 2.4 client. RoundRobinPartitioner.class works with Kafka 2.2 and lower versions. In version 2.4,

props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, UniformStickyPartitioner.class);

should be used. I think this is related to the new sticky partitioner.

Alpcan Yıldız