We ran a Kafka benchmark (BM) to determine the maximum throughput (TP) achievable with the given Kafka brokers and disks.

Kafka broker setup (machine spec & disks):

3 Kafka brokers, Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz, 8 cores.

Each broker has an sdb device, 14.6T in size, mounted at /var/kafka.

The sdb device is made up of 16 SAS disks (~1TB each) in RAID-10, which means 8 of the disks hold mirror copies.

Kafka producer configuration:

  • key=string, value=byteArray

  • enable.auto.commit=false

  • buffer.memory=500000000

  • batch.size=262144

  • retry.backoff.ms=5

  • linger.ms=20000

  • retries=0

  • compression.type=lz4

  • acks=1
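As a rough back-of-envelope check on these settings, the sketch below works out how many messages fit in one batch and how many full batches fit in the producer buffer. It assumes ~1 KB per message (the size used in the benchmark) and ignores compression and per-record overhead, so the real numbers will differ.

```python
# Back-of-envelope sizing for the producer settings listed above.
# Assumptions (not measured): ~1 KB per message, no compression,
# no per-record or per-batch overhead.
BATCH_SIZE = 262_144          # batch.size, in bytes
BUFFER_MEMORY = 500_000_000   # buffer.memory, in bytes
LINGER_MS = 20_000            # linger.ms
MESSAGE_BYTES = 1_024         # assumed message size

# Messages that fit into one per-partition batch.
messages_per_batch = BATCH_SIZE // MESSAGE_BYTES

# Full batches that fit into the producer's buffer.
batches_in_buffer = BUFFER_MEMORY // BATCH_SIZE

# Maximum time a partially filled batch may wait before being sent.
linger_seconds = LINGER_MS / 1000

print(messages_per_batch, batches_in_buffer, linger_seconds)
```

With a 20-second linger.ms, a batch on a busy partition is sent when it fills up rather than when the timer expires, so under these assumptions batch.size is the setting that governs request size.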

Kafka topic configuration:

100 partitions, balanced between all 3 brokers

replication factor = 3
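To make the consequence of this layout concrete, here is a small sketch of the replica arithmetic. It assumes replicas and leaders are spread as evenly as possible, which is what "balanced" suggests but is not confirmed in detail.

```python
# Replica placement arithmetic for the topic described above.
# Assumption: replicas and leaders are spread as evenly as possible.
PARTITIONS = 100
REPLICATION_FACTOR = 3
BROKERS = 3

# Every partition is stored REPLICATION_FACTOR times across the cluster.
total_replicas = PARTITIONS * REPLICATION_FACTOR
replicas_per_broker = total_replicas // BROKERS

# Leaders cannot split perfectly evenly: 100 is not divisible by 3.
base, extra = divmod(PARTITIONS, BROKERS)
leaders_per_broker = [base + 1 if i < extra else base for i in range(BROKERS)]

print(total_replicas, replicas_per_broker, leaders_per_broker)
```

Because the replication factor equals the number of brokers, every broker holds a copy of every partition, so each broker ends up writing every message that is produced.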

How the Kafka BM was performed

We injected messages using a proprietary KakkaInjector tool.

The messages were ~1K in size and were sent to all 100 partitions (equally) for 2.5 consecutive hours.

The goal of the BM was to find the maximum TP that can be achieved without exceeding ~80%-85% IO utilization.

Kafka BM results (throughput and IO utilization %)

So with ~85% IO utilization on all 3 brokers, the message rate was 550,000 msgs/sec being read and 550,000 msgs/sec being written.

If we look at the TP in kB, all 3 brokers reached a total of 380 rKB/s and 495 wKB/s.

My questions

These results were achieved with 3 Kafka brokers x 16 SAS disks x 1TB. We want to reach ~1.5M msgs/sec instead of the current rate of 550K msgs/sec.

So my questions are:

  • Will adding more disks to each broker linearly increase the number of msgs being read and written?

  • Will adding more brokers with the same disk setup linearly increase the number of msgs being read and written?

  • If we change the RAID from RAID-10 to RAID-0, will the TP increase by 2X?

  • If we change the disks from SAS to SSD, will the TP increase?

Elad Eldor

1 Answer

Will adding more disks to each broker linearly increase the number of msgs being read and written?

Yes, but not always. It depends on the disk type and the RAID scheme. Increasing the IOPS of your disk subsystem will help.

You currently have 16 disks in RAID-10, so even in the ideal case, adding 2 more disks will make it slightly faster but definitely will not bring you significantly closer to your goal.

Will adding more brokers with the same disk setup linearly increase the number of msgs being read and written?

Yes, but not always. You have replication factor = 3, which means that even if you add 1 or 2 more brokers, at least one of your brokers will handle more partitions than the others, so it will be overloaded and your application will wait on it before it completes a task. But if you add N*3 brokers, it will help.
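A rough way to size this is sketched below. It assumes the load spreads evenly across brokers, that disk IO stays the bottleneck, and that each broker can sustain the same replica-write rate it showed in the benchmark. The function names are made up for illustration, and linear scaling is an assumption rather than a measured result.

```python
import math

# Figures taken from the question.
CURRENT_MSGS_PER_SEC = 550_000
TARGET_MSGS_PER_SEC = 1_500_000
REPLICATION_FACTOR = 3
CURRENT_BROKERS = 3


def per_broker_write_rate(msgs_per_sec, replication_factor, brokers):
    """Replica writes per second each broker absorbs, assuming an even spread."""
    return msgs_per_sec * replication_factor / brokers


def brokers_needed(target_msgs_per_sec, replication_factor, per_broker_capacity):
    """Smallest broker count that keeps every broker at or below its capacity."""
    return math.ceil(target_msgs_per_sec * replication_factor / per_broker_capacity)


# What each broker handled at ~85% IO utilization in the benchmark.
capacity = per_broker_write_rate(
    CURRENT_MSGS_PER_SEC, REPLICATION_FACTOR, CURRENT_BROKERS
)

needed = brokers_needed(TARGET_MSGS_PER_SEC, REPLICATION_FACTOR, capacity)
print(capacity, needed)
```

Under these assumptions the estimate comes out at 9 brokers, which also happens to be a multiple of the replication factor.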

If we change the RAID from RAID-10 to RAID-0, will the TP increase by 2X?

Not 2X, but yes, it will be faster than now. At least, you will have more parallel threads.
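One way to see why 2X is a ceiling rather than an expectation is the idealized model below. It assumes every disk sustains the same bandwidth, that the controller can read from either mirror in RAID-10, and that CPU, controller and network are never the limit. The per-disk figure is hypothetical and used only for illustration.

```python
# Idealized RAID bandwidth model (a sketch, not a measurement).
PER_DISK_MB_S = 100  # hypothetical per-disk bandwidth, illustration only
DISKS = 16


def ideal_write_mb_s(disks, per_disk, level):
    """Upper bound on write bandwidth for the array."""
    if level == "raid0":
        return disks * per_disk          # every disk takes unique data
    if level == "raid10":
        return (disks // 2) * per_disk   # each block is written to 2 disks
    raise ValueError(f"unsupported RAID level: {level}")


def ideal_read_mb_s(disks, per_disk, level):
    """Upper bound on read bandwidth, assuming reads use either mirror."""
    if level in ("raid0", "raid10"):
        return disks * per_disk
    raise ValueError(f"unsupported RAID level: {level}")


write_gain = (
    ideal_write_mb_s(DISKS, PER_DISK_MB_S, "raid0")
    / ideal_write_mb_s(DISKS, PER_DISK_MB_S, "raid10")
)
read_gain = (
    ideal_read_mb_s(DISKS, PER_DISK_MB_S, "raid0")
    / ideal_read_mb_s(DISKS, PER_DISK_MB_S, "raid10")
)
print(write_gain, read_gain)
```

In this model only the write side can double while reads gain nothing, so a mixed read/write workload lands somewhere below 2X. RAID-0 also gives up all redundancy at the disk level.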

If we change the disks from SAS to SSD, will the TP increase?

Yes, definitely. What you need now is more parallel IOPS, and SSDs will give you that. You currently have 100 partitions; with SSDs, which are much faster at parallel operations, you may be able to use more.

Anton Kostenko