Scientific data storage: many small files, one volume or several?

Question

I have about 8 TB worth of 'sample' data with the following characteristics:

Each sample: 5-15 GB in one folder containing ~20k files and ~10k subfolders (2000 top-level, 5 sub-level, containing ~0.5-2 MB data files and small settings files).

I am setting up a Dell T710 server running Windows Server 2008 R2 with 19 TB effective space (RAID5) in order to consolidate the data. I have previously seen significant slow-downs when opening, browsing, or copying on a computer with about 1.5 TB of this type of data on a dedicated internal drive (NTFS).

Each sample will be copied to this server for storage, but analysis will occur elsewhere (data copied off of server). So no daily change in existing data, just new data.

What is the best drive configuration to handle this type of data? Drive is GPT and currently has EFI, MSR, 70 GB system partition, and empty 19 TB data partition.

one large 19 TB volume
several smaller volumes (less fragmentation?)

Would it be advisable to create a per-sample zip archive and store that instead? I hesitate to do this because users understand folders intuitively, and corruption has worse effects on archives: we could afford a few corrupted sub-folders (sample 'pixels', more or less) in the extreme case, but corrupting an entire sample archive would be bad.

With that much data, I'd personally reconsider using RAID 5... — Bart Silverstrim, Jan 05 '12 at 20:17
Is this in-house software? Is there a way to put the data into a database rather than relying on the filesystem? — Bart Silverstrim, Jan 05 '12 at 20:20
We are using RAID5 for the 19TB volume. The sample folders are created by a commercial instrument with a fixed directory structure expected by the analysis software. — Isaiah, Jan 05 '12 at 20:25
I misunderstood: why would you reconsider RAID5 - performance or something else? The data will be mirrored to a RAID6 GlusterFS off-site, run by a central group, so I'm not considering RAID5 as the backup! — Isaiah, Jan 05 '12 at 20:33
Isaiah - R5 is frowned upon by pro sysadmins, as large sets take a long time to recover from a single disk failure, and while they are doing this you're at risk of losing everything if another disk fails - use R6 please. — Chopper3, Jan 05 '12 at 20:51
Thanks, would one RAID6 volume be preferable to two RAID5 volumes with 6 disks each? — Isaiah, Jan 05 '12 at 20:54
Isaiah: Sure, RAID6 is better. The odds of an unrecoverable read error (URE) are 1 bit in 11 TB, so when you lose a drive and go to replace it, the rebuild will almost certainly cause another failure, this time probably irrevocably. RAID 6 adds another parity bit so that you can recover from a URE during rebuild. — Matt Simmons, Jan 05 '12 at 21:15
Just for comparison: The largest file systems on earth are hosted on 8+2 disk RAID6 LUNs on DataDirect Networks or LSI/NetApp hardware, with up to 300GB/s throughput and hosting several hundred million files. — pfo, Jan 05 '12 at 21:24
@MattSimmons The standard Reed-Solomon RAID6 and RAID DP implementations all operate on sectors or blocks; there is no "parity bit". — pfo, Jan 05 '12 at 21:30
@pfo: Right, RAID6 is erasure codes, but I really didn't feel like explaining all of that :) sorry for the intentional obfuscation :) — Matt Simmons, Jan 06 '12 at 14:19

Evan Anderson · Accepted Answer · 2012-01-05T20:47:51.783

19TB in a single RAID-5 volume is awfully big. You don't mention how many disks you have in that volume but, being in a Dell T710, I think it's highly likely that you've got more than 1TB per disk. I get antsy with RAID-5 members being that large. If that's a single RAID-5 span that's even more scary to me. (I don't like a span larger than 5 or 6 disks, especially with disks that large.)

Your choice of RAID-5 aside, in my experience that's a fairly large number of files to be asking NTFS to handle. Anything that you can do to reduce the number of files being stored is going to help performance. Compressing the "sample" as you describe would radically decrease the number of files you're asking NTFS to handle. Depending on how well your data compresses you could see significant performance increases in transferring the files over the network, as well.
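As a rough illustration of the per-sample archive idea, here is a minimal Python sketch that packs one sample folder into a single zip and then re-reads it to check the stored CRCs. The function names and the example paths are hypothetical, and the sketch assumes the archive is written outside the sample folder; it is not a tested workflow for the instrument's data.

```python
import zipfile
from pathlib import Path


def archive_sample(sample_dir, archive_path):
    """Pack every file under sample_dir into one zip, keeping relative paths.

    Returns the number of files written. Empty directories are not stored.
    """
    sample_dir = Path(sample_dir).resolve()
    count = 0
    # allowZip64 is needed because a sample (5-15 GB) exceeds the 4 GB zip limit.
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED,
                         allowZip64=True) as zf:
        for path in sorted(sample_dir.rglob("*")):
            if path.is_file():
                # Store names relative to the parent so the sample folder
                # name is kept as the top-level entry inside the archive.
                arcname = path.relative_to(sample_dir.parent).as_posix()
                zf.write(str(path), arcname)
                count += 1
    return count


def verify_archive(archive_path):
    """Return the name of the first member that fails its CRC check, or None."""
    with zipfile.ZipFile(archive_path) as zf:
        return zf.testzip()
```

A call might look like `archive_sample("D:/samples/sample01", "E:/archive/sample01.zip")` followed by `verify_archive("E:/archive/sample01.zip")` (both paths made up). Running the verification right after archiving, and again after copying to the server, at least gives a way to detect the kind of corruption the question worries about.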

In my opinion you shouldn't be worrying about "corruption" of the data. If you don't have enough faith that your backup system and primary storage will work w/o corrupting the files then you should concentrate on beefing those "foundation" components up. RAID-10 or RAID-50 would be a good first step toward beefing up the primary storage. Since you don't talk about how you're doing backup I can't really speak to that.
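To put numbers on the trade-off between these RAID levels, here is a small Python sketch of usable capacity using the usual textbook rules (one parity disk for RAID-5, two for RAID-6, half the disks for RAID-10, one parity disk per span for RAID-50). The 12 x 2 TB figures come from the asker's follow-up comment; the function name is made up, and real controllers reserve some extra space, so treat the results as approximate.

```python
def usable_disks(level, disks, spans=1):
    """Number of disks' worth of usable space for a simple RAID layout."""
    if level == "raid5":
        return disks - 1          # one disk of parity
    if level == "raid6":
        return disks - 2          # two disks of parity
    if level == "raid10":
        return disks // 2         # every disk is mirrored
    if level == "raid50":
        return disks - spans      # one disk of parity per RAID-5 span
    raise ValueError(f"unknown RAID level: {level}")


DISK_TB = 2  # decimal terabytes per disk, as stated by the asker
for level, spans in [("raid5", 1), ("raid6", 1), ("raid10", 1), ("raid50", 2)]:
    print(level, usable_disks(level, 12, spans) * DISK_TB, "TB")
```

Under these assumptions a single 12-disk RAID-6 and a RAID-50 made of two 6-disk spans both give 20 TB, only 2 TB less than one 12-disk RAID-5, while RAID-10 drops to 12 TB.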

Edit:

I'm wary of RAID-5 for availability. The seminal article about this is "Why RAID 5 Stops Working in 2009". The gist is that bit-error rates on larger disks make a successful rebuild of a large RAID-5 volume statistically unlikely.
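The argument can be sketched with a back-of-the-envelope calculation. The Python below assumes an unrecoverable read error rate of 1 in 10^14 bits (a figure commonly quoted on consumer SATA data sheets, and an assumption here, not a measurement of these disks) and treats every bit read as independent. Real drives often do better than the spec, so this is a pessimistic model rather than a prediction.

```python
import math


def p_ure_during_rebuild(surviving_disks, disk_bytes, ure_per_bit=1e-14):
    """Probability of at least one unrecoverable read error while reading
    every surviving disk in full, as a RAID-5 rebuild has to do."""
    bits_read = surviving_disks * disk_bytes * 8
    # log1p keeps precision for the tiny per-bit error rate.
    p_clean = math.exp(bits_read * math.log1p(-ure_per_bit))
    return 1.0 - p_clean


TB = 10**12
# 12-disk RAID-5 of 2 TB disks: 11 survivors must be read end to end.
print(round(p_ure_during_rebuild(11, 2 * TB), 2))
# 6-disk RAID-5 span: 5 survivors.
print(round(p_ure_during_rebuild(5, 2 * TB), 2))
```

With these assumptions the 12-disk case comes out at roughly 0.83 and the 6-disk case at roughly 0.55, which is why a second level of redundancy (RAID-6) or smaller spans are usually suggested for disks this large.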

If you have another copy of the data off-site then it's probably less of a concern. You should think about what the ramifications of a complete loss of the RAID-5 volume would be. Will you be able to spin up a new volume and continue working while you re-copy data from the off-site copy? Will you need to wait for some quantity of data to copy before work can begin again? If there is idle time, what will the cost be?

Thanks, the T710 has 12x2TB disks. I would not be opposed to splitting it into two RAID5 volumes; I'll rebuild and do that. I would prefer to use a modern file system, but the system has to be supportable long-term by an all-Windows in-house IT group. Backup will be to an off-site RAID6/GlusterFS system run by the central IT department (big conglomerate center..) — Isaiah, Jan 05 '12 at 20:51

Mirko Ebert · Answer 2 · 2012-01-06T16:26:48.557

You lose disk space if you have many small files. The reason is the block size of your file system: every file occupies a whole number of blocks, so the unused tail of its last block is wasted. My first suggestion is to use a Linux system for long-term support. My second suggestion is to store the files on the file system without zipping them, because being able to understand the system is much more important than losing some bytes. I had the same problem with genomic data (shotgun analyzer). My third suggestion is to use RAID10 or RAID50.
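How much space is lost to block rounding can be estimated with a short Python sketch. The 4096-byte cluster size is an assumption (check the actual value on the formatted volume), the function names are made up, and the model ignores file system details such as NTFS keeping very small files inside the MFT.

```python
def allocated_bytes(size, cluster=4096):
    """Bytes actually occupied on disk: size rounded up to whole clusters."""
    return -(-size // cluster) * cluster  # integer ceiling division


def slack_bytes(sizes, cluster=4096):
    """Total space wasted in the partly filled last cluster of each file."""
    return sum(allocated_bytes(s, cluster) - s for s in sizes)


# Three example files: 1 byte, exactly one cluster, one byte over a cluster.
print(slack_bytes([1, 4096, 4097]))
```

On average a file wastes about half a cluster, so ~20k files per sample at 4 KB clusters is on the order of 40 MB of slack in a 5-15 GB sample. Under these assumptions the loss is small compared with the data itself.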
