We're running Kafka brokers with:

  • 24 HT cores

  • 256GB RAM

  • 16 x 7.2K RPM NL-SAS disks configured in RAID-10, with replication factor 3

According to sar -q, runq-sz reached ~40 on average and the load average reached 10, while CPU usage was only 10% user and 5% system.

CPU usage didn't explain the high runq-sz, so we ran sar -d and saw the following:

  • tps of the device is ~650
  • %util is ~30%
  • rd_sec is ~60K
  • wr_sec is ~200K
  • there is a linear correlation between tps and %util
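
As a rough sanity check on those figures, here is a minimal sketch that derives the average request size, the total throughput and the write share. It assumes the rd_sec and wr_sec values are per-second rates in 512-byte sectors (the unit sar -d documents) and that "60K" and "200K" mean 60,000 and 200,000. The function name `summarize` is only illustrative.

```python
SECTOR_BYTES = 512  # assumption: sar -d counts rd_sec/wr_sec in 512-byte sectors


def summarize(tps, rd_sec, wr_sec):
    """Derive request size, throughput and write share from sar -d style rates."""
    total_sec = rd_sec + wr_sec
    # Average size of a single transfer, in KiB.
    avg_req_kib = total_sec / tps * SECTOR_BYTES / 1024
    # Combined read + write throughput, in MiB/s.
    throughput_mib = total_sec * SECTOR_BYTES / (1024 * 1024)
    # Fraction of transferred sectors that are writes.
    write_share = wr_sec / total_sec
    return avg_req_kib, throughput_mib, write_share


# Approximate values quoted above (assumed to be per-second rates).
avg_req_kib, throughput_mib, write_share = summarize(650, 60_000, 200_000)
print(f"avg request size: {avg_req_kib:.0f} KiB")
print(f"throughput:       {throughput_mib:.0f} MiB/s")
print(f"write share:      {write_share:.0%}")
```

Under those assumptions the transfers average about 200 KiB each and roughly three quarters of the traffic is writes.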

I've got two questions:

1) Is there a way of decreasing the tps?

2) If there is such a way, will it also decrease %util?

Elad Eldor
  • What problem are you trying to solve? – John Mahowald May 07 '19 at 21:15
  • I'm trying to reduce %util because the amount of data written to Kafka will increase and I don't want to add more disks. – Elad Eldor May 10 '19 at 09:05
  • That is not a problem description, that is a metric. Storage systems exist that run near 100% utilized with adequate response times. Why do you think you exceed the IOPS capacity of your 16 spindles? What is the read/write distribution and size of your I/Os? What I/O response time do you require? – John Mahowald May 10 '19 at 12:48
