It seems a Linux mdadm array delivers lower write throughput and fewer IOPS as more disks are added. For example, I have tested the following configurations, all at the defaults aside from setting the I/O scheduler to deadline and the tuned-adm profile to throughput-performance:
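For reference, the tuning was applied roughly as follows; the member device names (sdb through sdk) are placeholders for whatever the array actually uses:

    # set the I/O scheduler to deadline on each member disk (sdb..sdk are examples)
    for disk in sdb sdc sdd sde sdf sdg sdh sdi sdj sdk; do
        echo deadline > /sys/block/$disk/queue/scheduler
    done

    # apply the throughput-performance tuned profile
    tuned-adm profile throughput-performance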
The motherboard has dual E5 processors, DDR4 RAM, and 10 x SATA3 ports. The SSDs are 10 x Samsung 850 Pro drives. The OS is CentOS 7 64-bit (CentOS 6.7 was really bad). The filesystem is XFS.
With roughly 4-6 drives, sequential writes that bypass the page cache run at roughly 800 MB/s to 1 GB/s; writes through the cache run at roughly 2-3 GB/s.
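Those numbers came from runs along these lines; the mount point /mnt/md0 and the sizes are just placeholders:

    # sequential write bypassing the page cache
    dd if=/dev/zero of=/mnt/md0/ddtest bs=1M count=10000 oflag=direct

    # same write going through the page cache
    dd if=/dev/zero of=/mnt/md0/ddtest bs=1M count=10000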
Running various fio tests, IOPS seem to top out at about 80,000 with the direct flag and, of course, 800,000+ without it.
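A typical fio invocation for the random-write IOPS figures looked something like this (file name, size, queue depth, and job count are illustrative; switching to --direct=0 gives the cached numbers):

    fio --name=randwrite --filename=/mnt/md0/fio.test --size=10G \
        --rw=randwrite --bs=4k --ioengine=libaio --iodepth=32 \
        --numjobs=4 --direct=1 --runtime=60 --time_based --group_reporting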
The chunk size is 512K, the default, and the partitions appear to be properly aligned.
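Chunk size and alignment were checked with the usual tools, roughly as below (assuming the array is /dev/md0):

    mdadm --detail /dev/md0                    # reports "Chunk Size : 512K"
    cat /proc/mdstat                           # also shows the chunk size
    lsblk -o NAME,ALIGNMENT,PHY-SEC,LOG-SEC    # ALIGNMENT 0 means the partitions are aligned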
When more disks are added to the array, IOPS stay the same across the board, at roughly 60,000-80,000, and do not scale up linearly with the additional drives.
Additionally, when more drives are added, sequential writes seem to nosedive, as if the array were just a single drive. Testing a single drive for both IOPS and sequential writes yields about 70,000 IOPS (depending on the read/write mix) and 400-500 MB/s. Sequential writes are actually slightly lower with all 10 drives in the array, between 300-500 MB/s.
The sequential writes are not a deal-breaker; however, I am wondering whether there is a bottleneck or limitation within mdadm that is being overlooked. With 4-6 drives it performs great. Beyond 6 drives, performance seems to stay flat or drop off, especially for sequential writes.
EDIT: After some additional testing, I'm able to get the sequential speeds up when doing very large writes, such as 20 GB, 40 GB, 80 GB, etc. A dd test writing 42 GB with fdatasync yielded 640 MB/s.
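That run looked roughly like this; the path is a placeholder and the count is sized to write about 42 GB, with conv=fdatasync forcing a flush before dd reports the rate:

    dd if=/dev/zero of=/mnt/md0/bigfile bs=1M count=40000 conv=fdatasync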
I also understand that dd is not ideal for benchmarking SSDs; that's not my question. I am trying to understand where the drop-off comes from when going beyond 4-6 disks.