
My storage is currently 6TB, and since it will grow to 30TB in a few months I would like to hear some tips/recommendations on filesystem and stripe element size so I don't run into problems in the future. 90% of the files are 700MB-4GB (mainly large video files and archives).

Right now I am using ext4 with a 64KB stripe size. Should I increase the stripe size to 128KB/256KB? Would ZFS or XFS be better than ext4? Current usage is 85% read and 15% write. Once the enclosure is full, the workload will be 100% read, and I would like the best possible throughput.

Wiggler Jtag

1 Answer


Try this: do not use RAID 5 on anything with drives 2TB or larger ;) For 30TB I would even go with RAID 6 mirrored (i.e. 2 copies in software RAID) to make sure I keep the data in case of corruption.
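As a back-of-the-envelope sketch of what that costs in drives (the 4TB drive size below is just an example value, not a recommendation):

```python
import math

# Rough disk counts for the 30TB target. RAID 6 spends two drives on
# parity; "RAID 6 mirrored" here means a second, identical RAID 6 set.
# The 4TB drive size is only an example.

disk_tb = 4
target_tb = 30

raid6_disks = math.ceil(target_tb / disk_tb) + 2   # data drives + 2 parity
mirrored_disks = raid6_disks * 2                   # second copy of the whole set

print(f"plain RAID 6:    {raid6_disks} x {disk_tb}TB drives")
print(f"mirrored RAID 6: {mirrored_disks} x {disk_tb}TB drives")
```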

> Now I am using ext4 and 64KB stripe size. Should I increase stripe size to 128KB/256KB?

Hardware or software? Generally yes - it is a lot less work to read more data up front than to come back for it later. Not a Linux guy here - but SQL Server, for example, reads 64KB extents yet tries to keep table data in linear blocks so IO is reduced. A good large-file filesystem will try the same, which means an IO segment size larger than 64KB is good.
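As a rough illustration of that point (pure arithmetic, assuming perfectly sequential reads and no read-ahead coalescing):

```python
# How many chunk-sized requests a sequential read of one large file
# generates at different chunk sizes.

FILE_SIZE = 2 * 1024**3  # a typical 2GB video file from the question

for chunk_kb in (64, 128, 256, 512):
    chunk = chunk_kb * 1024
    print(f"{chunk_kb:>3}KB chunks -> {FILE_SIZE // chunk:>6} requests per 2GB file")
```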

I remember an analysis of enterprise-level RAID controllers that showed an increase in throughput at 512KB/256KB compared to smaller sizes - especially if you have enough caching to make it "stick" at the RAID controller level.

A lot also depends on how the data is read. Large archives and files are mostly linear, non-random access. That will fly. I have a smaller system, but we do redundant reading from nearly 200 processes on it across a larger number of machines - the machines on 1Gb links, the storage on 10Gb - so it is HEAVILY random IO coming in, and I now use a RAID 6 of 8 VelociRaptors. That delivers half a gigabyte per second. 256KB stripe, RAID 6, 1GB cache on an Adaptec 71605Q. SSD cache available but not active for that group ;)

A lot depends on read patterns.
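If you do move to a larger stripe, it also pays to tell ext4 about the geometry so allocation aligns with full stripes. A minimal sketch of the arithmetic - stride and stripe-width are real mkfs.ext4 -E options, but the 8-drive/256KB geometry is just an example taken from the setup above:

```python
# Deriving ext4 alignment parameters from the RAID geometry.

chunk_kb = 256                       # RAID chunk (stripe element) size
block_kb = 4                         # ext4 block size (default 4096 bytes)
data_disks = 8 - 2                   # 8 drives minus 2 parity in RAID 6

stride = chunk_kb // block_kb        # filesystem blocks per chunk
stripe_width = stride * data_disks   # blocks per full data stripe

print(f"mkfs.ext4 -E stride={stride},stripe-width={stripe_width} /dev/mdX")
# -> mkfs.ext4 -E stride=64,stripe-width=384 /dev/mdX
```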

But stay away from RAID 5 for those large drives. That is gambling with the data - unless you can live with losing the array (when a second failure blows the RAID during a full rebuild after the first drive failure) and have another backup source (like tapes). You can basically expect a problem with that many 4TB drives, mathematically.
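The maths behind that gamble, as a sketch - it assumes the commonly quoted consumer-drive spec of one unrecoverable read error (URE) per 1e14 bits read, which real drives may beat or miss:

```python
# Odds of hitting a URE while reading the whole array during a RAID 5
# rebuild. Treat the numbers as an illustration, not a prediction.

data_to_read_tb = 28                     # surviving data read during rebuild
ure_per_bit = 1e-14                      # consumer-drive spec-sheet figure

bits_read = data_to_read_tb * 1e12 * 8
p_at_least_one = 1 - (1 - ure_per_bit) ** bits_read

print(f"expected UREs during rebuild: {bits_read * ure_per_bit:.2f}")
print(f"P(at least one URE):          {p_at_least_one:.0%}")   # ~89% here
```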

TomTom
  • I've changed 4TB to 6TB; I mean 3x 2TB HDDs. Sorry for that, but thanks for your post - I will stick with RAID 6. – Wiggler Jtag Jan 28 '14 at 16:50
  • 1
  • How do you plan on handling backups? Note that backups also protect against user errors. A customer of ours managed to wipe a group workspace during end-of-year cleanup - and had refused backups. That was a TON of money lost in reconstructing data (still going on). – TomTom Jan 28 '14 at 16:53
  • The data is not that important, so I do not plan to create backups. I will happily go with RAID 6. Anyway, just a curious question: if one disk totally fails, how does RAID 6 go about restoring that disk? How much time is needed to restore a 2TB disk? Why is it better to have 2 parity disks instead of just one (as in RAID 5)? Are those 2 parity disks in RAID 6 exactly the same - just a copy in case one of them fails - or do they hold different information? – Wiggler Jtag Jan 28 '14 at 17:54
  • They are not the same - but basically RAID 6 can rebuild the data from the rest. How long it takes depends on the disk, the RAID subsystem and configuration, and load (if there is a lot of IO, the rebuild may take longer). I am not sure whether the parity disks are absolutely identical - I do not think so. After all, the 2 failing disks may be non-parity disks, so I think it is just different maths. For an admin that is an implementation detail, though - I never programmed RAID, so... no idea. – TomTom Jan 28 '14 at 18:06
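To make the "different maths" in that last comment concrete: in RAID 6 the P block is a plain XOR of the data blocks, while the Q block is a weighted sum over the Galois field GF(2^8) (Linux md uses the polynomial 0x11d), so the two parities hold genuinely different information rather than being copies. A minimal toy sketch, not how any real controller is implemented:

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result

def raid6_parity(data_blocks):
    """P = plain XOR; Q = XOR of blocks weighted by powers of 2 in GF(2^8)."""
    p, q, weight = 0, 0, 1
    for d in data_blocks:
        p ^= d
        q ^= gf_mul(weight, d)
        weight = gf_mul(weight, 2)   # next drive gets the next power of 2
    return p, q

# One byte from each of three data drives:
p, q = raid6_parity([0x11, 0x22, 0x33])
print(f"P = {p:#04x}, Q = {q:#04x}")   # P = 0x00, Q = 0x99 - not copies
```

Because P and Q are computed differently, losing any two disks leaves two independent equations in two unknowns, which is why dual parity survives a double failure where RAID 5 cannot.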