
I have (had) a RAID 1 array (2 disk mirror) and one of the disks, sda, failed. So I've replaced the bad disk with a new one, but seem to be stuck on how to get the second drive back up and running as part of the array.

The system is running Ubuntu Server 9.04 and was configured as follows:

MD0 => sda1,sdb1

MD1 => sda3,sdb3

MD2 => sda2,sdb2

 mdadm --detail /dev/md0

shows two drives:

0 /dev/sdb1 "Active Sync"

1 [nothing] "Removed"

MD1 and MD2 look the same.

The tutorial I found says to mark each partition as failed using the command:

mdadm --manage /dev/md0 --fail /dev/sda1

But, since the drive is not there, I get:

mdadm: cannot find /dev/sda1: No such file or directory

Can I skip the failing step? Or is there some other way to fail a partition that's no longer present? Or if I copy the partition table from the good old drive to the new one, will it automatically pick up that it's the replacement?

I'm new at this and don't want to screw it up. :)

Nick

2 Answers


You shouldn't need to fail them, since they were already failed when you first noticed the issue and the RAID members are now marked as removed. There are just a few steps to get the array back up and running.
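
For example, a quick way to confirm the current state before you start (just a sanity check; a degraded two-disk RAID 1 shows one missing member, e.g. `[U_]` or `[_U]`):

    cat /proc/mdstat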

  1. Set up partitions on the replacement disk. These partitions should be identical in size to those on the remaining active disk, and should be marked as partition type "Linux RAID Autodetect" (0xFD). You can simplify this by copying the partition table with sfdisk; a quick way to verify the result is shown after this list.

    sfdisk -d /dev/sdb | sfdisk /dev/sda
    
  2. If the disk has been used before, you may want to ensure that any existing software RAID (superblock) information is removed before you continue:

    mdadm --zero-superblock /dev/sda
    
  3. Install an MBR onto the new disk so that it is bootable. Do this from the GRUB shell; this assumes that /dev/sda is the first disk (hd0).

    root (hd0,0)
    setup (hd0)
    quit
    
  4. Add new partitions back to the arrays.

    mdadm --add /dev/md0 /dev/sda1
    mdadm --add /dev/md1 /dev/sda3
    mdadm --add /dev/md2 /dev/sda2
    
  5. Monitor the status of the reconstruction by viewing /proc/mdstat. You can automate this with:

    watch -n10 cat /proc/mdstat
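
As a sanity check after copying the partition table in step 1 (this simply repeats the verification suggested in the comments below, so treat it as an optional sketch), compare the two disks before adding the partitions back to the arrays:

    fdisk -l /dev/sdb /dev/sda

Both outputs should show the same partition sizes, each with type `fd` (Linux raid autodetect).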
    
Dan Carley
  • Thanks Dan - Output from the first command included: "Error: sector 0 does not have an msdos signature /dev/sda:unrecognised partition table type" Is that anything to worry about? – Nick Dec 27 '09 at 13:01
  • It just indicates that the new disk didn't have an existing partition table when `sfdisk` attempted to read from it. Nothing to worry about. Just double check that the two outputs of `fdisk -l /dev/sdb /dev/sda` look the same before you proceed. – Dan Carley Dec 27 '09 at 13:10
  • One small typo: "mdadm --add /dev/md2 /dev/sda1" should be "mdadm --add /dev/md2 /dev/sda2". It looks like it's working: MD0 is marked as [UU] and MD1 says "recovery = 4.2%". So once all are back to [UU] status everything is good? Are there any other steps needed to confirm normal functionality? Thanks! – Nick Dec 27 '09 at 13:12
  • Whoops - fixed that. Yep, once the arrays are done syncing and all marked `UU`, then you're back to normal. If the machine is taking a long time to sync and isn't doing anything important then you may wish to speed up the process with the tweaks described here - http://www.ducea.com/2006/06/25/increase-the-speed-of-linux-software-raid-reconstruction/ - otherwise just sit back and relax. – Dan Carley Dec 27 '09 at 13:19
  • Cool, thanks! I adjusted the speed limit/min a little, but I'm not sure it's going any faster. It's Sunday, so I'm in no hurry. :) If I wanted to add a third mirrored drive, can I just repeat the steps above, substituting sdc for sda? – Nick Dec 27 '09 at 13:51
  • A 3rd disk added with `--add` won't become an active mirror in this 2-device array; it would be used as a spare instead. – Dan Carley Dec 27 '09 at 16:52
  • Looks like the arrays are back in sync, but it won't boot off of the new drive. It goes to a single blinking cursor in the top left corner when grub is supposed to load. If I set the other drive as the first boot device it boots fine. Do I need to set a bootable flag on the new drive? – Nick Dec 27 '09 at 16:54
  • Well done on your 10K Dan, good timing :) best wishes, Phil. – Chopper3 Dec 28 '09 at 22:54
  • Nick> Ah, I've added step #3. That should do it. Phil> Thanks! :) – Dan Carley Dec 29 '09 at 10:09

Check http://techblog.tgharold.com/2009/01/removing-failed-non-existent-drive-from.shtml and use:

mdadm /dev/mdX -r detached
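
The `detached` keyword tells `mdadm --remove` to drop any array member whose underlying device is no longer present, which is exactly the situation in the question. A sketch applying it to the arrays above (assuming md0, md1 and md2 as in the question):

    mdadm /dev/md0 -r detached
    mdadm /dev/md1 -r detached
    mdadm /dev/md2 -r detached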

  • Welcome to Server Fault! Whilst this may theoretically answer the question, [it would be preferable](http://meta.stackexchange.com/q/8259) to include the essential parts of the answer here, and provide the link for reference. – Scott Pack Nov 01 '12 at 19:10
  • +1 - the proposed command solved my problem too – bbonev Nov 07 '14 at 04:03