I have piles of 1TB Samsung mSATA drives that I'm trying to repurpose for server storage. I purchased 24 StarTech 25SAT22MSAT adapters and fitted them with 2 mSATA drives each. I then enabled the RAID0 feature on each adapter/carrier.
Next I fitted all 24 adapters into a SuperMicro SC216BAC-R920LPB chassis and connected the 6 SAS channels from the SAS-216A backplane to a Broadcom MegaRAID SAS 9361-24i card.
The MegaRAID 9361-24i card has 24 independent SAS/SATA channels and a PCIe Gen3 x8 interface. I was expecting to achieve read performance of 4-5 GB/s with this setup; however, the fastest read speed I can get is ~2.1 GB/s.
I have been measuring performance using a simple (single-threaded) timed cat operation:
$ sudo bash -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
$ time cat data_20GB.bin > /dev/null
real 0m9.780s
user 0m0.165s
sys 0m5.969s
as well as using the (multi-threaded) application I intend to run on this machine. Both approaches show similar performance.
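In case anyone wants to reproduce the measurement a different way, an equivalent multi-threaded sequential read test with fio would look roughly like this (the file name, block size, queue depth and job count are just placeholders; with --direct=1 the page cache is bypassed, so the drop_caches step isn't needed):
$ fio --name=seqread --filename=data_20GB.bin --rw=read --bs=1M \
      --direct=1 --ioengine=libaio --iodepth=32 --numjobs=4 --group_reporting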
One of the first things I did was measure the performance of a single adapter/carrier:
$ sudo bash -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
$ time cat data_20GB.bin > /dev/null
real 0m38.639s
user 0m0.309s
sys 0m10.142s
My simple test shows single-threaded read performance of ~530 MB/s per adapter. Rounding down to 500 MB/s, that's 500 MB/s x 24 = ~12 GB/s in aggregate, so I believe this setup should easily be able to saturate the PCIe Gen 3 x8 link (8 GT/s per lane, roughly 7.9 GB/s usable).
The SuperMicro server is running Ubuntu 16.04 and its boot disks are separate from the RAID array. I have updated both the Linux driver and the firmware to the latest versions, and I have verified that the RAID card is connected at the expected Gen 3 x8 rate.
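For reference, that link check just reads the negotiated PCIe state out of lspci (1000: is the Broadcom/LSI vendor ID; the LnkSta line should report Speed 8GT/s, Width x8):
$ sudo lspci -vv -d 1000: | grep -E 'LnkCap|LnkSta'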
The data access pattern for the intended application is mostly large sequential reads.
Here is how I first configured the RAID array:
storcli64 /c0 add vd type=raid5 name=storage drives=252:0-23 pdperarray=24 pdcache=on cached wb ra strip=256
This configuration gives read performance:
- Simple (timed cat): 1.98 GB/s
- Application: 2.04 GB/s
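For completeness, the resulting virtual drive's cache and read-ahead policies can be confirmed with storcli (assuming it's the first virtual drive, /v0):
storcli64 /c0/v0 show all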
Here are some of the other things I've tried since then:
- Adjusting many if not all of the combinations of RAID configuration options (RAID0 vs RAID5, cached vs direct I/O, read-ahead vs no read-ahead, all of the supported stripe sizes); see the sketch after this list for the kind of variations I mean. None of the combinations achieves read speeds higher than 2 GB/s, and many are much slower.
- Adjusting configuration options for the RAID card itself. Changing these options had almost no effect on performance.
- Creating an array with only 6 drives. I still measure read performance at ~2 GB/s, the same as when using all 24 drives. This seems like a major red flag but I can't figure out the cause. I get the same read performance whether I create the array with drives 0-5 or with drives 0, 4, 8, 12, 16, 20.
- Creating an array with 12 drives. Same issue, I measure read performance at ~2 GB/s.
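For reference, the variations were all created with the same storcli syntax as the original command, along these lines (the VD names here are placeholders):
storcli64 /c0 add vd type=raid0 name=test_r0 drives=252:0-23 direct nora wt strip=256
storcli64 /c0 add vd type=raid5 name=test_6 drives=252:0-5 pdcache=on cached ra wb strip=256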
I am completely stumped. Does anyone know why the hardware RAID read performance tops out at ~2 GB/s? Any suggestions are welcome.
Update
I set the RAID card to JBOD mode so each of the 24 adapters/carriers appears in the OS as an individual block device. I then created a software (mdadm) RAID5 array and repeated my performance measurements:
- Simple (timed cat): 4.06 GB/s
- Real application: 5.92 GB/s
This is the performance level I was expecting from the hardware RAID. Maybe I should have saved $500 and just bought the equivalent SAS HBA card instead.
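For reference, the JBOD + mdadm setup was roughly as follows; the /dev/sd[b-y] device names are an assumption (they depend on how the JBOD drives enumerate), and whether the per-slot set jbod step is needed depends on the firmware:
storcli64 /c0 set jbod=on
storcli64 /c0/e252/sall set jbod
sudo mdadm --create /dev/md0 --level=5 --raid-devices=24 --chunk=256 /dev/sd[b-y]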