
I got a disk failure on my CentOS Linux software RAID 5 array (mdadm). I replaced one of the disks and started rebuilding the array. The next time I checked the status, the rebuild had failed.

This is the status right now:

[root@localhost ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : inactive sdc1[3](S) sdd1[2] sdb1[0]
      4883277760 blocks

unused devices: <none>


[root@localhost ~]# mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Mon Aug 23 22:37:36 2010
     Raid Level : raid5
  Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Jan  1 23:30:32 2002
          State : active, degraded, Not Started
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 6af06755:6fc93cba:c083764e:1e719c94
         Events : 0.27470

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       0        0        1      removed
       2       8       49        2      active sync   /dev/sdd1

       3       8       33        -      spare   /dev/sdc1

/dev/sdc is the brand new drive. If I remove it and add it again, it still stays as a spare. How should I try to start the rebuild?
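For reference, the remove/re-add sequence I have been trying is roughly the following (a sketch using the partition names from the output above):

mdadm /dev/md0 --remove /dev/sdc1    # take the new disk out of the array
mdadm /dev/md0 --add /dev/sdc1       # add it back; it still only shows up as a spare
cat /proc/mdstat                     # check whether a rebuild has started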

devha
  • Could you try removing failed/detached devices with 'mdadm /dev/md0 --remove failed' and 'mdadm /dev/md0 --remove detached', and then dump the details again? – dsmsk80 Feb 24 '14 at 19:33
  • I tried removing the failed and detached disks, but there was no change to the md0 details. By the way, I'm on a read-only filesystem; that shouldn't affect this, should it? – devha Feb 24 '14 at 20:01
  • So the line with the "removed" device is still there? – dsmsk80 Feb 24 '14 at 20:06
  • Yes, no change to mdadm --detail /dev/md0 output. – devha Feb 24 '14 at 20:10

1 Answer


How should I try to start the rebuild?

That depends on whether you care about the data. Assuming you do, there is a guide over here. Note that RAID-5 has some issues, which were enough for me to convert most of our environment to RAID-10 or plain mirrors.

I'm not sure what /dev/md0 holds, but if your root (/) filesystem is on it (or any filesystem your distribution needs, such as /usr for the utilities), then I would suggest you get a live CD for your Linux distribution and boot off of that before attempting a repair.

Once you've rebooted off the live CD, run the following to find your array:

mdadm --assemble --scan

From there, you can either follow the guide above to attempt a safe recovery of the RAID-5 array (a typical sequence is sketched below), or you can nuke the array and rebuild it from scratch.
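As a rough illustration of what such a recovery usually looks like in a situation like yours (two good members plus a new disk stuck as a spare), the sequence below stops the inactive array, force-assembles it from the two drives that are still in sync, and re-adds the new disk so the rebuild can start. The device names are taken from your mdadm --detail output; treat this as a sketch of the general approach, not a guaranteed recipe, and read the guide before running it:

mdadm --stop /dev/md0                                  # stop the inactive array
mdadm --assemble --force /dev/md0 /dev/sdb1 /dev/sdd1  # force-assemble from the two in-sync members
mdadm /dev/md0 --add /dev/sdc1                         # re-add the new disk so it starts rebuilding
cat /proc/mdstat                                       # watch the rebuild progress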

Note that you might want to consider a more redundant configuration, such as RAID-1 with a spare (or, if you can get a fourth drive, RAID-10).
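For illustration only, creating those layouts with mdadm looks roughly like this (the device names are placeholders for whatever drives you end up using):

mdadm --create /dev/md0 --level=1 --raid-devices=2 --spare-devices=1 /dev/sdb1 /dev/sdc1 /dev/sdd1   # two-disk mirror plus a hot spare
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1          # RAID-10 across four drives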

Signal15