
So I just acquired an HP ProLiant DL320 G5 server with two 160GB drives set up in a RAID 1 array. I'm planning on sticking this in a colo, installing Proxmox (as an OpenVZ server), and running two Debian-based webservers (Django + Apache + nginx) on it. Since these will both be semi-low-traffic production servers, uptime is my only real concern. I'm debating between keeping the RAID 1 setup, or breaking it and using the extra disk space for virtual machine backups. However, I'm a total noob when it comes to RAID and have no idea what the recovery process would look like. The HP RAID utilities and options in the firmware look fairly straightforward though, and I consider myself a fairly technical user; this is just my first dealing with RAID.

Should I keep the RAID 1 setup? If one of the drives fails, what will be required for a full recovery? Any advice or recommended reading?

Thanks in advance.

2 Answers


Not sure there's a right answer to this one, tbh.

RAID is not backup. RAID is redundancy, so it will help you if a drive fails, but if you delete something you didn't mean to, you still need a backup — so you need to consider how you would do backups when you don't have that second drive free in the server.
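If you keep the mirror and still want scheduled dumps, a minimal sketch of a nightly backup job could look like the following. It assumes Proxmox's vzdump tool is available; the container IDs and dump directory are made up, and the exact options vary by version, so check `man vzdump` on your box:

```python
#!/usr/bin/env python3
"""Nightly OpenVZ container backup -- a rough sketch, not a drop-in script."""
import subprocess

# Hypothetical container IDs and dump directory; substitute your own.
CONTAINER_IDS = [101, 102]
DUMP_DIR = "/var/backups/vz"

for vmid in CONTAINER_IDS:
    # vzdump ships with Proxmox; --dumpdir and --compress are documented
    # options, but double-check `man vzdump` for your version.
    subprocess.run(
        ["vzdump", str(vmid), "--dumpdir", DUMP_DIR, "--compress"],
        check=True,
    )
```

Whatever you do, copy those dumps off the box as well (rsync/scp them somewhere else); an on-box backup won't survive a controller, chassis, or filesystem failure.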

I've not used a DL320 G5 specifically, but assuming it's like other Smart Array controllers, you should find it's just a case of plugging in the new drive.

– flooble

If you keep the R1 config and one drive fails, the recovery procedure is to hot-swap the bad drive for a good one. The card will handle the rebuild for you; you don't need to do anything, and there will be no downtime.
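If you want to watch the array state from the OS rather than the firmware, here's a minimal sketch assuming HP's hpacucli CLI is installed (the exact output wording varies by controller and firmware):

```python
import subprocess

# Ask the Smart Array controller for its current configuration.
# Assumes HP's hpacucli utility is installed and on the PATH.
out = subprocess.run(
    ["hpacucli", "ctrl", "all", "show", "config"],
    capture_output=True, text=True, check=True,
).stdout

# On a healthy mirror the logicaldrive line reads something like
# "RAID 1, OK"; during a hot-swap rebuild the status changes while
# the array stays online.
print(out)
```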

If you don't keep the R1 config, you will be at risk of complete data loss when a drive goes bad. If your OS/boot drive goes out, having the 2nd one for backups isn't going to help a whole lot until you get an OS back on there. Recovery will take hours to days, depending on how fast you can get to the colo and do the recovery work.

If uptime is your prime concern, keep the R1 config.

– sysadmin1138
  • This is what I'm leaning toward. What's the most reliable way to test this in a lab setup? – karlmw Sep 23 '10 at 19:50
  • @karlmw This isn't the kind of thing you can test. Your best bet is to figure out how fast you can get to the colo with new hardware, and then *in the lab* figure out the relative recovery times. Not having R1 **will** mean downtime in the case of a bad drive. Using R1 greatly minimizes it; you'll still be vulnerable if the other disk goes before you can get to the colo to add the new drive, but that's a relatively small risk. – sysadmin1138 Sep 23 '10 at 19:55
  • @sysadmin1138 Oh OK, so my confusion lies in the failure scenario. If a drive fails, will the system remain functioning and switch over to the other drive on its own automatically (meaning the sites will stay up)? – karlmw Sep 23 '10 at 20:51
  • @karlmw That's it exactly. In R1 the server remains up when one drive fails; it fails over to the 2nd drive seamlessly. – sysadmin1138 Sep 23 '10 at 21:47
  • @sysadmin1138 Last question, I promise. How would one monitor (especially in a colo) for the initial drive failure in a hardware RAID scenario? – karlmw Sep 23 '10 at 21:57
  • @karlmw HP has a few agents that can do this. If you install their System Management Homepage and associated agents I *believe* there is a notification utility for firing off things like emails. – sysadmin1138 Sep 23 '10 at 22:03
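For anyone rolling their own check instead of (or alongside) HP's agents, a minimal polling sketch could shell out to hpacucli and email on any non-OK status. The SMTP host and addresses below are placeholders, and the parsing assumes hpacucli's usual `... Status: OK` lines, which vary by firmware:

```python
#!/usr/bin/env python3
"""Poll the Smart Array controller and email if anything is not OK."""
import smtplib
import subprocess
from email.message import EmailMessage

# Placeholders -- substitute your own relay and addresses.
SMTP_HOST = "smtp.example.com"
ALERT_FROM = "raid-monitor@example.com"
ALERT_TO = "you@example.com"

# hpacucli prints lines like "Controller Status: OK" for the controller,
# cache, and battery; anything else is worth an email.
status = subprocess.run(
    ["hpacucli", "ctrl", "all", "show", "status"],
    capture_output=True, text=True, check=True,
).stdout

problems = [
    line.strip()
    for line in status.splitlines()
    if "Status:" in line and not line.strip().endswith("OK")
]

if problems:
    msg = EmailMessage()
    msg["Subject"] = "RAID alert: " + "; ".join(problems)
    msg["From"] = ALERT_FROM
    msg["To"] = ALERT_TO
    msg.set_content(status)
    with smtplib.SMTP(SMTP_HOST) as smtp:
        smtp.send_message(msg)
```

Run something like that from cron every few minutes, and pair it with an external check that alerts when the box itself stops answering — a dead server can't email you.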