
I have a single-node Kafka setup (one broker) on a VM with 56 GB RAM and a 750 GB disk.

This is what the server.properties file looks like:

broker.id=1
listeners=PLAINTEXT://hostname:port
num.network.threads=15
num.io.threads=30

socket.send.buffer.bytes=1024000
socket.receive.buffer.bytes=1024000
socket.request.max.bytes=1048576000

log.dirs=/path/to/log/

num.partitions=1
num.recovery.threads.per.data.dir=1

offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

log.retention.hours=2160
log.retention.bytes=500000000000
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000

zookeeper.connect=zkhostname:2181
zookeeper.connection.timeout.ms=6000

group.initial.rebalance.delay.ms=0
message.max.bytes=10485760

I have several consumers and producers working on various topics, with a 1:1 correspondence between topic partitions and consumers (mostly I have one partition and one consumer). The average size of each message is 500 KB.

Let's say the throughput I am getting for each consumer (which produces to another topic after some processing) is around 200 records per second.

For a particular topic, I created 10 partitions and 10 consumers, hoping the processing would be 10 times faster (parallel consumption and pushing).

Instead, the throughput was split between the consumers: roughly 20 records per second each. The only explanation I can think of is that Kafka has hit some resource limit?
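A quick back-of-the-envelope check with the numbers above (these are the question's figures, not measurements of your system) shows the broker is serving the same aggregate byte rate before and after adding partitions, which points at a broker-side ceiling rather than slow consumers:

```python
# Rough aggregate-throughput check using the numbers from the question
# (assumed values; substitute your own measurements).
record_size_kb = 500          # average message size
single_consumer_rps = 200     # records/s with 1 partition, 1 consumer
consumers = 10
per_consumer_rps = 20         # observed after scaling to 10 partitions

aggregate_before = single_consumer_rps * record_size_kb / 1024            # MB/s
aggregate_after = consumers * per_consumer_rps * record_size_kb / 1024    # MB/s

print(f"before: {aggregate_before:.0f} MB/s, after: {aggregate_after:.0f} MB/s")
# Both come out to ~98 MB/s: the broker is moving the same number of
# bytes per second, just split across more consumers.
```

At ~98 MB/s the broker is in the range where a single disk or a 1 Gb NIC (~125 MB/s) can become the ceiling, so checking those resources is a reasonable next step.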

On the VM, if I do a free -m, the result is something like this:

               total        used        free      shared  buff/cache   available
Mem:          56339       12055       35087          24        9196       43428
Swap:             0           0           0

I have read that Kafka relies heavily on the page cache under the hood, so I am confused about whether this is the expected behavior.

I tried setting this

export KAFKA_HEAP_OPTS="-Xmx16G -Xms16G"

in kafka-server-start.sh, but it doesn't seem to help.

If it is a memory issue or some other exhausted resource, how do I diagnose it with Kafka? Am I missing some broker-level configs? I need to understand where my Kafka server's performance is going.
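One way to narrow this down is to watch the OS-level resources while the consumers are running, and to benchmark the broker in isolation with the perf-test scripts that ship with Kafka. This is a sketch: `hostname:port` and `perf-test` are placeholders for your own listener address and a throwaway topic, and `iostat`/`sar` come from the sysstat package.

```shell
# OS-level: is disk, network, or CPU saturated while consumers run?
iostat -x 5          # watch %util and await on the disk holding log.dirs
sar -n DEV 5         # watch rxkB/s / txkB/s on the NIC
top                  # broker CPU across the network/I/O threads

# Kafka-level: the bundled perf-test scripts isolate the broker from
# your application code. Record size matches the ~500 KB messages.
bin/kafka-producer-perf-test.sh \
  --topic perf-test \
  --num-records 10000 \
  --record-size 512000 \
  --throughput -1 \
  --producer-props bootstrap.servers=hostname:port

bin/kafka-consumer-perf-test.sh \
  --broker-list hostname:port \
  --topic perf-test \
  --messages 10000
```

If the perf-test numbers top out near the same ~100 MB/s as your application, the limit is the broker or its hardware, not your consumer code.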

– void

1 Answer


On a single-node Kafka cluster, it's not unexpected that adding more consumers does not increase throughput.

If the broker is already sending at its maximum capacity to 1 client, then adding a second client means the broker has to share its resources between both.

The strength of Kafka is that you can have several brokers in your cluster, each acting as the leader for some partitions. Each consumer then connects to several different brokers, so the consumers collectively draw on every broker's resources.
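As a hypothetical illustration: with a 3-broker cluster, a replicated topic spreads partition leaders across the brokers. The topic name is a placeholder, and this uses the ZooKeeper-based tooling that matches the question's setup (older Kafka versions).

```shell
# Create a topic whose 10 partitions are replicated across 3 brokers;
# Kafka spreads the partition leaders over the brokers.
bin/kafka-topics.sh --create \
  --zookeeper zkhostname:2181 \
  --topic my-topic \
  --partitions 10 \
  --replication-factor 3

# --describe shows which broker leads each partition; consumers fetch
# from the leader, so the load is shared across all three brokers.
bin/kafka-topics.sh --describe \
  --zookeeper zkhostname:2181 \
  --topic my-topic
```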

I'm deliberately not trying to guess your specific bottleneck (CPU, network, disk, etc.); I'm just explaining why your base assumption that more consumers means more throughput is not always valid.

– Mickael Maison
  • My single node VM has a single SSD disk. Let's say if I add more brokers who use this same disk, can I expect a throughput increase? Or should all the different brokers be using different disks? – void Mar 22 '18 at 10:40
  • 2
    It depends and I'm afraid it's not something I can answer within a comment. You first have to identify what is your bottleneck and experiment with your config and your constraints – Mickael Maison Mar 22 '18 at 14:24