I'm using Kafka with a Spring Boot / Spring Cloud Stream Kafka listener in my application. The setup: the topic has 160 partitions, and the HPA is set to 6-40, so under peak load the pods scale out to 40. The problem is that a single message at a single offset is being consumed multiple times, sometimes as many as 15 times, which causes bottlenecks on the DB and other problems. Each consumer acknowledges manually and logs the commit:
log.info("Acknowledging message offset : {} for correlation ID {} time taken {} groupId {}", offset, message.getPayload().getCorrelationId(), (System.currentTimeMillis() - startTime), groupId);
The same correlation ID, and at times the same offset, is being committed by multiple pods. I also printed the group ID to make sure all the pods are in the same consumer group; a few of the logs are below:
03-10-2020 15:39:23.966 [KafkaConsumerDestination{consumerDestinationName='topic-input', partitions=0, dlqName='null'}.container-4-C-1] INFO c.c.g.m.m.c.MessageConsumerImpl.processCreateWorkflow - Acknowledging message offset : 118347 for correlation ID f5979128-f69c-4b32-b510-6972ef1cadff time taken 2994 groupId topic
==========
03-10-2020 15:39:23.947 [KafkaConsumerDestination{consumerDestinationName='topic-input', partitions=0, dlqName='null'}.container-5-C-1] INFO c.c.g.m.m.c.MessageConsumerImpl.processCreateWorkflow - Acknowledging message offset : 119696 for correlation ID f5979128-f69c-4b32-b510-6972ef1cadff time taken 1434 groupId topic
The pod IDs for those two logs are 5f812fe2-38d4-430e-a395-943d77747b3a and 194786f6-83df-4288-959c-7fcfb42de45a respectively (they are different).
Another strange thing: both of them are consuming from the same partition, partition 0. No logs are printed for any other partition, even though when we check we do see messages sitting in the other partitions. Below is the config for the Spring Cloud Stream Kafka listener:
stream:
  kafka:
    binder:
      brokers: brkr:9092
      zkNodes: zk1:2181,zk2:2181,zk3:2181
      autoCreateTopics: false
      # retry configs
      producerProperties:
        key.serializer: org.apache.kafka.common.serialization.StringSerializer
        value.serializer: org.apache.kafka.common.serialization.ByteArraySerializer
        retries: 3
        max.in.flight.requests.per.connection: 1
        retry.backoff.ms: 9000
        request.timeout.ms: 400000
        delivery.timeout.ms: 450000
      applicationId: ng-member-service
      consumerProperties:
        key.deserializer: org.apache.kafka.common.serialization.StringDeserializer
        value.deserializer: org.apache.kafka.common.serialization.ByteArrayDeserializer
        #max.poll.records: 100
        auto.offset.reset: earliest
        session.timeout.ms: 300000
        request.timeout.ms: 400000
        allow.auto.create.topics: false
        heartbeat.interval.ms: 80000
    default:
      consumer:
        autoCommitOffset: false
      producer:
        messageKeyExpression: payload.entityId
    bindings:
      topic-input:
        consumer:
          configuration:
            max.poll.records: 350
  default:
    consumer:
      partitioned: true
      concurrency: 4
    producer:
      partitionKeyExpression: payload.entityId
      #partitionCount: 4
  bindings:
    topic-input:
      destination: topic-input
      group: topic
      consumer:
        maxAttempts: 1
        partitioned: true
        concurrency: 12
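To confirm which partitions each pod actually owns, I'm thinking of logging assignments with a plain Kafka rebalance listener along these lines (a sketch only; the class name is mine, and how the listener gets registered with the binder depends on the Spring Cloud Stream version):

```java
import java.util.Collection;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Logs every assignment/revocation so each pod's owned partitions are visible.
public class AssignmentLoggingRebalanceListener implements ConsumerRebalanceListener {

    private static final Logger log =
            LoggerFactory.getLogger(AssignmentLoggingRebalanceListener.class);

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        log.info("Partitions assigned to this pod: {}", partitions);
    }

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        log.info("Partitions revoked from this pod: {}", partitions);
    }
}
```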
Any help with the above would be appreciated.