
I'm trying to get maximum sequential disk throughput for an application on my Linux server. It has 7 SATA disks that I could combine into a single RAID0 or RAID5 array using a hardware RAID controller (HP P420i). Alternatively, I could use them separately: put a filesystem on each disk and mount them under /mnt/a, /mnt/b, ... /mnt/g. (The RAID/SATA controller can assign each disk to its own 1-disk volume.)

I have a biggish-data application where multiple (up to 10) processes may be doing sequential writes and reads of files/objects concurrently. With a single RAID volume, they would all be writing to the same filesystem, with possible contention on the same RAID array and filesystem. The poor RAID controller may get too busy and not be as fast as I want it to be. On the other hand, with /mnt/{a..g} I can introduce sharding logic in the application layer so that, based on the name of the 'object' being written, one of /mnt/{a..g} is selected for storing that object. That way the processes don't all end up writing to the same RAID array and filesystem, perhaps avoiding performance problems related to RAID or filesystem contention.
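Just to make the idea concrete, here is a minimal sketch of the kind of sharding logic I have in mind (the mount-point list matches the layout above; the hashing scheme is arbitrary - any stable hash of the object name would do):

```python
import hashlib
import os

# Sketch only: mount points as described above; the hash is arbitrary,
# any stable hash of the object name gives a consistent shard.
MOUNTS = [f"/mnt/{c}" for c in "abcdefg"]

def mount_for(object_name: str) -> str:
    """Deterministically map an object name to one of the mount points."""
    digest = hashlib.md5(object_name.encode()).digest()
    return MOUNTS[int.from_bytes(digest[:4], "big") % len(MOUNTS)]

def path_for(object_name: str) -> str:
    return os.path.join(mount_for(object_name), object_name)

# Every process computes the same shard for the same object, so the ten
# writers spread out across the seven filesystems instead of piling onto one.
print(path_for("dataset-0042.bin"))
```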

I was previously under the impression that the sequential throughput of a RAID5 array scales somewhat linearly with the number of disks up to some not-too-small number, but my recent experience tells me that reality is not even close. (On a 7-disk RAID5 with ext4, I'm getting only 160 MB/s write and 320 MB/s read.) Hence, I'm thinking of alternatives for maximizing total sequential disk throughput. Am I likely to get better total throughput for my 10 processes if I mount the 7 disks separately and use the 7 filesystems concurrently?

Syncopated
  • *On 7-disk RAID5 ext4, I'm getting only 160MB/s write 320MB/s read.* 6+1 RAID5 is not optimal - there's no way to match the IO block size to the RAID5 stripe size. And the IO block size is determined both by the block size used by the application and by the filesystem block size. See http://www.tomshardware.com/reviews/RAID-SCALING-CHARTS,1735-4.html You also have to worry about the *alignment* of IO operations on RAID5 - especially writes. Updating only part of a stripe forces the controller to recalculate the parity for the entire stripe, which causes one or more extra read operations. (See the stripe-size sketch after these comments.) – Andrew Henle Jun 16 '16 at 11:58
  • Do you realize that with RAID0 if one disk goes the entire set is useless? I would not recommend it even for temporary data. – John Mahowald Jun 16 '16 at 12:42
  • @JohnMahowald I know. The data is very transient and can be regenerated in a few hours. – Syncopated Aug 02 '16 at 04:42
  • I just switched from RAID5 to RAID0 and the write speed at least doubled. I don't understand why, as the writes are mostly sequential writes to large files of at least 100MB. – Syncopated Aug 02 '16 at 04:47
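To illustrate Andrew Henle's point about stripe matching on a 6+1 RAID5: the full stripe is 6 data chunks wide, so for any power-of-two chunk size the full-stripe size is 6 times a power of two and can never equal a power-of-two IO size; writes therefore touch partial stripes and trigger read-modify-write parity updates. A small worked example (the chunk sizes are illustrative values, not the P420i's actual configuration):

```python
# Full-stripe size of a 7-disk RAID5 (6 data disks + 1 parity) for a few
# common chunk sizes.  None of the results is a power of two, so typical
# power-of-two write sizes (e.g. 1 MiB) always cover partial stripes and
# force the controller into read-modify-write parity updates.
DATA_DISKS = 6
for chunk_kib in (64, 128, 256):   # example chunk sizes only
    stripe_kib = DATA_DISKS * chunk_kib
    is_pow2 = stripe_kib & (stripe_kib - 1) == 0
    print(f"chunk {chunk_kib:4d} KiB -> full stripe {stripe_kib:5d} KiB "
          f"(power of two: {is_pow2})")
```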

1 Answer


Providing a proper answer to your question goes way beyond the scope of a post here - it's far too broad. But the conclusion would be the same: assuming that redundancy is irrelevant, the difference will depend on the nature of the workload (although, as Andrew points out, it's a lot easier to mis-configure a RAID system than a single-disk filesystem).

The definitive answer, and one far better than you'll get here, comes from running lots of representative workloads through your system and measuring the results.
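As a rough illustration of what such a measurement might look like - the paths and sizes below are placeholders, and a dedicated tool such as fio will give far more trustworthy numbers - something along these lines runs one sequential writer per mount point and reports per-disk and aggregate throughput:

```python
#!/usr/bin/env python3
# Rough concurrent sequential-write test: one writer process per mount
# point, each writing a large file and fsync()ing at the end.  Paths and
# sizes are placeholders; this only shows the shape of such a test.
import os
import time
from multiprocessing import Process, Queue

MOUNTS = ["/mnt/a", "/mnt/b", "/mnt/c"]   # adjust to the actual layout
FILE_SIZE = 2 * 1024**3                   # 2 GiB per writer
BLOCK = 1024 * 1024                       # 1 MiB sequential writes

def writer(mount: str, results: Queue) -> None:
    buf = b"\0" * BLOCK
    target = os.path.join(mount, "seqbench.tmp")
    start = time.time()
    with open(target, "wb") as f:
        for _ in range(FILE_SIZE // BLOCK):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())              # include the flush to disk in the timing
    results.put((mount, FILE_SIZE / (time.time() - start)))
    os.remove(target)

if __name__ == "__main__":
    results = Queue()
    procs = [Process(target=writer, args=(m, results)) for m in MOUNTS]
    t0 = time.time()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    elapsed = time.time() - t0
    for _ in procs:
        mount, rate = results.get()
        print(f"{mount}: {rate / 1e6:.0f} MB/s")
    print(f"aggregate: {FILE_SIZE * len(MOUNTS) / elapsed / 1e6:.0f} MB/s")
```

Run it once against the RAID volume (a single entry in MOUNTS) and once against the separate filesystems, with file sizes well above the controller's cache, and compare.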

If you are particularly concerned about performance, then starting with all your hardware already in place seems an odd way to approach the problem; unless you specifically need the capacity, swapping one of the HDDs for an SSD configured as bcache/journalling might be a better solution.

symcbean