I have a single-node Kafka setup (one broker) on a VM with 56 GB of RAM and a 750 GB disk.
This is what the server.properties file looks like:
broker.id=1
listeners=PLAINTEXT://hostname:port
num.network.threads=15
num.io.threads=30
socket.send.buffer.bytes=1024000
socket.receive.buffer.bytes=1024000
socket.request.max.bytes=1048576000
log.dirs=/path/to/log/
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=2160
log.retention.bytes=500000000000
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=zkhostname:2181
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
message.max.bytes=10485760
I have several consumers and producers working on various topics, with a 1:1 correspondence between topic partitions and consumers (in most cases, one partition and one consumer per topic). The average size of my messages is about 500 KB.
Let's say the throughput I am getting for each consumer (which produces to another topic after some processing) is around 200 records per second.
For one particular topic, I set up 10 partitions and 10 consumers, hoping the processing would be 10 times faster (parallel consumption and producing).
Instead, the same throughput was split among the consumers, at roughly 20 records per second each. The only explanation I can think of is that Kafka has hit some resource limit.
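To put those numbers in perspective, here is a rough back-of-envelope sketch in Python of the data rates they imply. It assumes "500 KB" means 500,000 bytes and that each consumed record results in one produced record of about the same size (my processing may change the size, so treat this as an estimate). The function name is purely illustrative.

```python
# Back-of-envelope estimate using the approximate figures quoted above.
AVG_MESSAGE_BYTES = 500 * 1000  # assumption: "500 KB" = 500,000 bytes


def broker_traffic_mb_per_s(records_per_s, message_bytes=AVG_MESSAGE_BYTES):
    """Estimate broker traffic for a consume -> process -> produce pipeline.

    Assumes one output record of similar size per input record
    (an assumption; the real output size may differ).
    """
    read_mb = records_per_s * message_bytes / 1e6  # broker -> consumers
    write_mb = read_mb                             # producers -> broker
    return {
        "read_mb_s": read_mb,
        "write_mb_s": write_mb,
        "total_mb_s": read_mb + write_mb,
    }


observed = broker_traffic_mb_per_s(200)        # total rate I actually see
hoped_for = broker_traffic_mb_per_s(200 * 10)  # rate that 10x scaling would need

print("observed:", observed)
print("hoped for:", hoped_for)
```

Under these assumptions, the current rate already corresponds to about 100 MB/s in each direction through the single broker, and a true 10x speedup would need about 1 GB/s each way, which makes me wonder whether disk or network bandwidth, rather than memory, could be the limit.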
On the VM, if I run free -m, the result is something like this:
              total        used        free      shared  buff/cache   available
Mem:          56339       12055       35087          24        9196       43428
Swap:             0           0           0
I read that Kafka relies heavily on the page cache under the hood, so I am not sure whether this memory usage is the expected behavior.
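For a quick sense of proportion, this small Python sketch expresses the free -m figures above as percentages of total RAM. The values are copied by hand from that output (in MiB), and the helper name is just illustrative.

```python
# Values copied from the free -m output above (MiB).
mem = {
    "total": 56339,
    "used": 12055,
    "free": 35087,
    "buff_cache": 9196,
    "available": 43428,
}


def share(part_mib, total_mib):
    """Return part as a percentage of total, rounded to one decimal place."""
    return round(100.0 * part_mib / total_mib, 1)


# Percentage of total RAM for every column except the total itself.
summary = {
    name: share(value, mem["total"])
    for name, value in mem.items()
    if name != "total"
}

for name, pct in summary.items():
    print(f"{name:>10}: {pct:5.1f}% of RAM")
```

By this reckoning the page cache (buff/cache) holds only around 16% of RAM while well over half of it is completely free, so on the face of it the machine does not look short of memory.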
I tried setting the following in kafka-server-start.sh, but it doesn't seem to help:
export KAFKA_HEAP_OPTS="-Xmx16G -Xms16G"
If this is a memory issue or some other kind of resource exhaustion, how do I diagnose it in Kafka? Am I missing some broker-level configs? I need to understand why my Kafka server's throughput does not scale with the number of partitions and consumers.