
Here is what happened, on an ML350 with a Perc 200i controller: RAID 5 with five disks.

One disk got corrupted and had its predictive-failure flag set. After the corrupted disk was replaced and the rebuild completed, the logical drives on the RAID controller were not re-enabled. Then another disk died.

Is there a way to obtain the data?
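For context on why two dead disks is fatal here: RAID 5 keeps one XOR parity block per stripe, so it can reconstruct any *one* missing block, but with two blocks gone the XOR math no longer has enough inputs. A minimal sketch in plain Python (illustrative only, not the controller's actual firmware logic):

```python
# Sketch of RAID 5 stripe parity: one XOR parity block per stripe,
# so the stripe survives the loss of any ONE disk.
import functools
import operator

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, column by column."""
    return bytes(functools.reduce(operator.xor, col) for col in zip(*blocks))

# A 5-disk stripe: four data blocks plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# One disk lost: rebuild its block from the surviving three data blocks
# plus parity.
rebuilt = xor_blocks([data[0], data[2], data[3], parity])
assert rebuilt == data[1]  # recovered

# Two disks lost in the same stripe: only three blocks remain, and no
# combination of XORs can produce either missing block -- the data is gone.
```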

EEAA
Cmosk
  • I would like to let you know that the problem is solved. I added one more drive to the controller, installed the OS, and while setup was checking the RAID 5 logical drive it also repaired the damage. Windows is now up and running. Thanks for the quick response. – Cmosk Feb 09 '12 at 06:31

1 Answer


Is there a way to obtain the data?

Restore from backup.

If that's not an option, pack up the drives and ship them to a data recovery shop along with a big fat check.

EEAA
  • Estimated check amount? – Cmosk Feb 09 '12 at 04:19
  • Can you suggest a data recovery shop? – Cmosk Feb 09 '12 at 04:21
  • In USD, it'll range anywhere from a few hundred, to tens of thousands, depending on the degree of damage, the type of data, and amount that needs to be recovered. – EEAA Feb 09 '12 at 04:22
  • No I cannot - service recommendations are off-topic here. I'm sure a Google search will help you locate one in your area. – EEAA Feb 09 '12 at 04:22
  • If you don't have backups, it's even possible that a big fat check to a recovery shop won't help. I hope the lesson learned was: on a RAID 5 failure, back up immediately! Drive #2 isn't far behind. – Jim B Feb 09 '12 at 04:23
  • @JimB - Agreed. And lesson #2, don't use RAID5. – EEAA Feb 09 '12 at 04:23
  • @ErikA, there isn't anything inherently wrong with RAID 5; like every RAID level, there are tradeoffs. – Jim B Feb 09 '12 at 04:28
  • One thing more: disk 2 was removed from the bay while the server was operating and replaced with another. The disk was not entirely dead but was flagged as a predicted failure. As soon as the disk was removed, Windows crashed. – Cmosk Feb 09 '12 at 04:30
  • As soon as I swapped the new disk back out for the old one, it tried to boot Windows, but I received an error that ntoskrnl.exe was corrupted. On attempts to repair the installation (W2K3), setup has been stuck for hours on 'checking disk c:'. – Cmosk Feb 09 '12 at 04:38
  • @JimB - of course there are tradeoffs. However, this question is a prime example of why not to use RAID5 - the risks of dual-disk failure during rebuild are quite high. – EEAA Feb 09 '12 at 04:40
  • @JimB The risk with both the "back up as soon as a disk fails" strategy and the normal "shove a new disk in and rebuild it from the other disks" strategy is the same - a heretofore undiscovered unrecoverable read error on a second disk. Back up *before* the first disk fails! ;) – Shane Madden Feb 09 '12 at 04:42
  • @ShaneMadden, I am not implying that regular backups shouldn't be done, but rather that, like any major change to the environment, a backup should be completed prior to changing the disk. There is no point in recovering last week's data when you need today's data. – Jim B Feb 09 '12 at 05:18
  • @ErikA, the chance that 1 disk will die is the same as 2, 3, or 4 disks dying, presuming all disks are of the same age and powered up. MTBF only requires that the drive be powered up. There is always a tradeoff between cost and acceptable risk. I'd love to have redundant SANs but it's just not financially practical for my level of risk. – Jim B Feb 09 '12 at 05:25
  • The risk of a second disk failure goes up with the increased load of the RAID rebuild; especially with modern large disks, you can be looking at rebuild times measured in days or weeks. – EEAA Feb 09 '12 at 05:30
  • @JimB A few sectors going bad is not considered a failure in manufacturers' MTBF metrics, it's expected - but a parity block (which is never read until a rebuild, and thus won't be discovered as bad until the rebuild is attempted) failing is an additional risk, and the odds of such a failure during a given rebuild correlate to the sizes of the disks in use. Mind that I'm not saying "never use RAID5"; I'm saying "back up early and often; any given RAID5 rebuild is a roll of the dice; hope for success but don't count on it". – Shane Madden Feb 09 '12 at 06:31
  • @ShaneMadden I think we're on the same page here; I'm simply looking at it from a pure operations perspective. Regardless of the technology involved, when the risk is "server dead" and/or "data gone", you back up before going any further. – Jim B Feb 09 '12 at 12:17
  • @ErikA Of course rebuild times matter, but there is no RAID level without some rebuild time; the question is how much time is acceptable. – Jim B Feb 09 '12 at 12:26
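The rebuild-risk discussion above can be put in rough numbers. A back-of-the-envelope sketch, assuming a spec-sheet unrecoverable-read-error (URE) rate of 1 in 10^14 bits read (a common consumer-drive figure; real drives, especially enterprise ones, differ), and that a RAID 5 rebuild must read every sector of every surviving disk:

```python
# Rough model: probability that a RAID 5 rebuild completes without
# hitting a single URE on any surviving disk. ASSUMED spec values;
# this is an illustration, not a prediction for any particular array.

URE_RATE = 1e-14  # assumed errors per bit read (1 in 10^14)

def rebuild_success_prob(disk_size_tb: float, surviving_disks: int) -> float:
    """P(no URE) when every bit of every surviving disk must be read."""
    bits_read = disk_size_tb * 1e12 * 8 * surviving_disks
    return (1 - URE_RATE) ** bits_read

# Five-disk RAID 5 with one failed disk: four disks are read in full.
for size_tb in (0.5, 1, 2):
    p = rebuild_success_prob(size_tb, 4)
    print(f"{size_tb} TB disks: ~{p:.0%} chance of a clean rebuild")
```

The point the commenters are making falls out of the exponent: the failure odds scale with total bits read, so bigger disks (and more of them) make any single rebuild noticeably less likely to finish cleanly.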