
I inherited a machine that was running Debian with a RAID 5 array. I installed a bunch of updates (1700 or so) that the OS recommended, and after rebooting, the RAID array did not mount. The device /dev/md0 no longer exists, and I do not know why.

The /etc/mdadm/mdadm.conf contains:

DEVICE partitions
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=138b0c65:20644731:39e394c4:192c7227

I tried running mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1. This creates a device md0, but it is listed as "degraded," and the last drive in the list is for some reason considered a "spare." I strongly suspect, though I cannot be sure, that it was sdb, sdc, and sdd that were involved in the RAID 5 array.

I tried all 6 possible orderings of the devices, but the last one would always come up spare. I also tried --spare-devices=0 --force, which successfully got all three drives into the array with a "clean" status, but I was unable to actually mount the device md0. When I run "file -s" on /dev/md0, I get GLS_BINARY_LSB_FIRST, which seems unhelpful.

I have no reason to believe any of the devices are faulty; all of this seems to stem from the recent upgrading. How can I resurrect the old RAID 5 array? Have my --create machinations somehow messed it up? Note that I have never successfully mounted md0.

Please advise. I know this is always the story, but I am in big trouble if I can't resurrect this thing, so anyone who helps has my eternal gratitude, for whatever it's worth.

2 Answers


I suspect you may be testing out your restore procedures in the near future.

Running --create on an existing array is... well, "misguided" is about as nice a spin as I can put on it. That's intended only to create a new array -- which you specifically don't want to do.

What you want is --assemble, or even better, work out why the system decided to stop automatically assembling the array on boot. The fact that you've been creating arrays all over the place has probably hosed any chance you had of the array ever working again, though.
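For reference, a minimal sketch of what the --assemble route looks like. The member names (sdb1, sdc1, sdd1) are an assumption taken from the question, and the function only prints the commands rather than running them. Once --create has rewritten the superblocks, this may no longer find the old array.

```shell
#!/bin/sh
# Sketch only: print the assemble commands instead of executing them.
# The member devices passed in below are assumptions based on the question.
assemble_cmds() {
    # Option 1: let mdadm locate members using the ARRAY lines in mdadm.conf.
    echo "mdadm --assemble --scan"
    # Option 2: name the array and its member partitions explicitly.
    echo "mdadm --assemble /dev/md0 $*"
}

assemble_cmds /dev/sdb1 /dev/sdc1 /dev/sdd1
```

Unlike --create, neither form writes new metadata to the member devices; they only read the existing superblocks and start the array if the members agree.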

By the by, the reason why your newly-created RAID5 array starts in degraded mode is given in the mdadm manpage (which would have been a good first thing to read):

When creating a RAID5 array, mdadm will automatically create a degraded array with an extra spare drive. This is because building the spare into a degraded array is in general faster than resyncing the parity on a non-degraded, but not clean, array. This feature can be overridden with the --force option.

As far as diagnosing why your RAID array wasn't assembling after the upgrade, a glance at dmesg probably would have put you right; unfortunately, that data is probably either (a) gone, or (b) at least, thoroughly irrelevant now.
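A rough sketch of that first look, for the next time this happens. The grep pattern is my own guess at what is worth keeping, not an official filter, and md_lines is just a helper name made up for this example.

```shell
#!/bin/sh
# Keep only kernel log lines that mention md devices or RAID.
# The pattern is an assumption; widen it if it hides something useful.
md_lines() {
    grep -i -E 'md[0-9]*:|raid'
}

# Kernel messages from the current boot (may need root on some systems).
dmesg 2>/dev/null | md_lines

# Arrays the kernel currently knows about, if the md driver is loaded.
if [ -r /proc/mdstat ]; then
    cat /proc/mdstat
fi
```

Both steps are read-only, so they are safe to run before touching the array.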

An mdadm ninja might be able to put things back to normal, if they had access to the machine in question and a lot of time and patience, but in your case, I'd recommend just taking the downtime hit and restoring from backup.

womble

Your create machinations have likely messed it up.

--create initializes a new array.

You wanted --assemble.

Did --create give you any warnings which you overrode?

Please paste the output of mdadm -E on each partition. It may be recoverable.
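Something along these lines would collect it all in one go. This is only a sketch: examine_all and the MDADM variable are names invented for the example, and which partitions to pass in is up to you.

```shell
#!/bin/sh
# Sketch: run "mdadm -E" on each partition given and label every result.
# MDADM can be overridden (e.g. for a dry run); it defaults to plain mdadm.
MDADM=${MDADM:-mdadm}

examine_all() {
    for part in "$@"; do
        echo "=== $part ==="
        # -E (--examine) only reads the md superblock; it does not modify it.
        "$MDADM" -E "$part" 2>&1
    done
}
```

Run as root, for example `examine_all /dev/sd[a-z][0-9]* > mdadm-examine.txt`, then post the contents of that file.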

MikeyB
  • Ah. I never overrode any warnings. The closest I came was trying "force" so that sdd did not get used as "spare." I can't fit the output of mdadm -E here. Where should I put it? – Diogenes Creosote Aug 13 '11 at 00:09
  • To summarize the outputs of mdadm -E, each "Array Size" is 465.76 GiB. The checksums are all different, but all listed as "correct." Array state is AAA. Is there other info I can give in this limited space that would be helpful? – Diogenes Creosote Aug 13 '11 at 00:16
  • Nope. Not helpful. Put the complete output of running it on all partitions (not just the ones you think are right - you listed at least 4 drives, I want to see them *all*) up on pastebin. – MikeyB Aug 13 '11 at 00:24
  • Here it is: http://pastebin.com/c2Gpn27y – Diogenes Creosote Aug 13 '11 at 00:32
  • My listing sda was a typo; sda is mounted as the root file system, so it is certainly not involved in the raid array – Diogenes Creosote Aug 13 '11 at 00:36
  • I was hoping for your sake you'd have other partitions that magically contained your data, but nope. Time to restore from backup. And next time, remember HHGTTG's first rule of data recovery: Don't Panic. – MikeyB Aug 13 '11 at 01:11
  • I inherited this box, and I was given no backup. I do not need everything on this drive. Even recovering half of it would help immensely. Surely this --create process can't have overwritten 500GB of data. How would you recommend trying to salvage what I can? – Diogenes Creosote Aug 13 '11 at 01:18
  • Don't put diagnostic data in a pastebin in a comment. **EDIT YOUR QUESTION**. Also, if you "inherit" a server, the first thing you do is a service assessment, and remedy any defects (such as, for instance, not having freaking backups). That goes double when you then proceed to do an OS upgrade... Finally, as far as salvaging data goes, find an MD ninja, give them a lot of money, and cross your fingers. – womble Aug 13 '11 at 01:38
  • What womble said. I wouldn't quite call myself a md ninja but everything I see here says that it's pretty screwed up. At the bare minimum it'll take someone who has intimate experience with md and data recovery to sit down and spend a LOT of time with this machine. Contact a professional data recovery firm and be prepared to shell out a hefty fee. – MikeyB Aug 13 '11 at 17:59