-1

4-Drive Software RAID1 vs RAID10: the heading tells you what I am contemplating.

Hardware: 2x 1TB Enterprise-class HDDs + 2x 1TB Consumer-class HDDs.

OS and Software: Linux Debian Jessie (stable) with mdadm.

Intended purpose: Extreme reliability storage. Cannot afford data loss. Such a thing would simply be unacceptable. That is why I am considering RAID1 instead of RAID10, because a 4-drive RAID1 should tolerate the failure of 3 drives.

I see one downside: usable capacity drops to 1/4 of the raw total. Crazy.

Apart from the RAID1 vs RAID10 decision, which I have probably already made in favor of RAID1 unless you advise me otherwise, I have a question regarding RAID1:

Supposing I am limited to 4 drives, my possibilities with RAID10 would be limited, as opposed to RAID1, where I could either define 3 drives as active and the 4th as a spare, or define all 4 HDDs as active.

Please tell me what you think.

  • 6
    Repeat after me: "RAID is not a backup. RAID is not a backup." RAID is not used to protect against data loss - it's used to protect against downtime. You have many more SPOFs to deal with other than your drives. Instead of worrying about "extreme reliability", perhaps worry about your backup system first. – EEAA Oct 21 '16 at 15:41
  • @EEAA yeah, you like this sentence, I get it. Although a backup solution is not part of this question at all. – Vlastimil Burián Oct 21 '16 at 15:54
  • 2
    Sometimes the answer to people's questions here is not what they want to hear. But that's the way it is: we're in the business of helping people use technology to achieve their goals and to build supportable, reliable systems. In this case you said "Cannot afford data loss". Understood. To address that, you put in place a solid backup system, not RAID. – EEAA Oct 21 '16 at 15:56
  • 2
    Not to be blunt, but is the data not worth the extra cost of enterprise-class over consumer HDDs? Reliability is one of the biggest reasons for the extra cost. So start off with 4 high-quality HDDs and then look at which RAID level to use. – Drifter104 Oct 21 '16 at 16:05
  • Why is everyone pointing me to backups? This is a question about RAID setup. I have decided to go for a big compromise, though I can't say I totally like it: RAID6 will be suitable for our purposes. Thanks everyone for comments and answers. – Vlastimil Burián Oct 22 '16 at 01:57
  • You said yourself that you "Cannot afford data loss. Such thing would simply be unacceptable." That is why everyone is telling you that you need proper backups. – Michael Hampton Oct 23 '16 at 17:45

4 Answers

8

In such a setup (4 disks and RAID1 only), it is better to directly use the 4th disk as an array member rather than as a spare.

Using it as a spare will not buy you anything on the redundancy side, while using the 4th disk as a full array member increases your redundancy from 3 to 4 copies, enabling you to survive the failure of any 3 disks.
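To make the comparison with RAID10 concrete, here is a minimal sketch that enumerates every combination of failed disks and checks whether each array would still be readable. The disk labels and the pairing used for RAID10 (two 2-way mirrors, striped) are assumptions for illustration, not something read from mdadm; rebuild time and spares are not modeled.

```python
from itertools import combinations

DISKS = ("a", "b", "c", "d")  # hypothetical disk labels


def raid1_survives(failed):
    # A 4-way mirror stays readable while at least one member is alive.
    return len(failed) < len(DISKS)


def raid10_survives(failed, pairs=(("a", "b"), ("c", "d"))):
    # Assumed layout: two mirror pairs, striped together.
    # The array survives only if every pair keeps at least one live disk.
    return all(any(d not in failed for d in pair) for pair in pairs)


def survival_counts(survives):
    # Map number of failed disks -> (surviving combinations, total combinations).
    counts = {}
    for k in range(1, len(DISKS) + 1):
        combos = list(combinations(DISKS, k))
        counts[k] = (sum(survives(set(c)) for c in combos), len(combos))
    return counts


if __name__ == "__main__":
    print("RAID1 :", survival_counts(raid1_survives))
    print("RAID10:", survival_counts(raid10_survives))
```

Under these assumptions the 4-way mirror survives every 1-, 2- and 3-disk failure, while the RAID10 layout survives every single failure but only those double failures that hit different mirror pairs.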

Anyway, if you are so concerned about data redundancy/availability that you can afford to lose 3/4 of your raw space, you are probably approaching the problem from the wrong side.

Remember: RAID is not a backup!!!

Rather than growing your RAID1 setup beyond a 3-way mirror, please be sure to have a strong backup/recovery plan.

shodanshok
4

If you are looking for a system with high availability and you are worried about crashing drives, RAID1 is surely the best solution. But if you want more space, RAID6 might be a compromise. You "lose" the space of two disks to parity, but you are also safe for up to two failed disks.
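As a rough worked example of that trade-off for the 4x 1TB drives in the question, the sketch below computes usable capacity and the number of disk failures each layout is guaranteed to survive. The function name and the layout labels are made up for illustration, and filesystem and metadata overhead are ignored.

```python
def layout_summary(n_disks=4, disk_tb=1.0):
    """Return {layout: (usable TB, guaranteed tolerated disk failures)}."""
    return {
        # Every disk holds a full copy, so capacity is that of one disk.
        "RAID1 (n-way mirror)": (disk_tb, n_disks - 1),
        # Striped 2-way mirrors: half the raw space; a second failure
        # in the same mirror pair is fatal, so only 1 is guaranteed.
        "RAID10 (striped 2-way mirrors)": (disk_tb * n_disks / 2, 1),
        # Two disks' worth of space goes to parity.
        "RAID6": (disk_tb * (n_disks - 2), 2),
    }


if __name__ == "__main__":
    for name, (usable, tolerated) in layout_summary().items():
        print(f"{name}: {usable:g} TB usable, survives any {tolerated} failed disk(s)")
```

With 4 disks, RAID6 and RAID10 give the same usable space, but RAID6 survives any two failures, whereas RAID10 only survives two if they land in different mirror pairs.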

If high availability is really your concern, you should maybe think about a synced second server as well. If data loss is your concern, then you should primarily make sure to have a good backup. RAID is never a substitute for backups, since it only protects against failed disks, not against accidental deletion of data, or malware or an attacker encrypting or deleting your data, and so on.

Jakob Lenfers
2

Agreed. If you have such strict requirements for storage, I would also recommend a multi-node approach. We currently run a 2-node backup repository with a RAID 10 array on each server. It looks stable and redundant.

batistuta09
0

If you're so incredibly concerned about data availability and integrity, and you're willing to do something like a four member RAID-1 to get it, then you should probably be looking at redundancy at the node level.

No matter how many disks you put behind one controller, there is still a single point of failure, and that is the machine itself. Rather than worrying about packing in redundancy at the RAID level, you could implement something like DRBD, GlusterFS, or Ceph.
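A back-of-the-envelope model shows why the node matters more than the fourth mirror. This is only a sketch: it assumes disk and node failures are independent (real failures often are not), and the probabilities below are invented placeholders, not measured failure rates.

```python
def p_data_unavailable(p_disk, n_mirrors, p_node, n_nodes=1):
    """Probability that no copy is reachable, assuming independent failures."""
    p_array = p_disk ** n_mirrors                  # every mirror member has failed
    p_one_node = 1 - (1 - p_node) * (1 - p_array)  # the node is down or its array is lost
    return p_one_node ** n_nodes                   # every replica node is unavailable


if __name__ == "__main__":
    P_DISK, P_NODE = 0.05, 0.02  # illustrative placeholders only
    one_node_4way = p_data_unavailable(P_DISK, n_mirrors=4, p_node=P_NODE)
    two_nodes_2way = p_data_unavailable(P_DISK, n_mirrors=2, p_node=P_NODE, n_nodes=2)
    print(f"1 node, 4-way mirror:       {one_node_4way:.6f}")
    print(f"2 nodes, 2-way mirror each: {two_nodes_2way:.6f}")
```

In this model the single node can never do better than its own failure probability, however many mirrors it holds, while the same four disks split across two replicated nodes do noticeably better.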

DRBD would act more like a network RAID-1, to describe it simplistically. Gluster and Ceph can behave this way as well, but can also scale massively by both replicating to nodes and distributing data across replica sets.

You can still implement RAID at the node level with these types of storage, but with inter-node replication, node-level RAID becomes much less of a concern, and in larger deployments it can reduce scalability. It's also easy to take an entire node out of the cluster, fix it, and then put it back in. In storage clouds, RAID is being used less and less often for these reasons.

Spooler
  • Totally agree with the node level redundancy. Once an exploding power supply toasted both of my RAID1 disks. – Michuelnik Oct 21 '16 at 18:18