Prelude
I had the following devices in my /dev/md0 RAID 6: /dev/sd[abcdef]
The following drives were also present, unrelated to the RAID: /dev/sd[gh]
The following drives were part of a card reader that was connected, again, unrelated: /dev/sd[ijkl]
Analysis
sdf's SATA cable went bad (you could say it was unplugged while in use), and sdf was subsequently rejected from the /dev/md0 array. I replaced the cable and the drive was back, now at /dev/sdm. Please do not challenge my diagnosis, there is no problem with the drive.
mdadm --detail /dev/md0 showed sdf(F), i.e., that sdf was faulty. So I used mdadm --manage /dev/md0 --remove faulty to remove the faulty drives.
Now mdadm --detail /dev/md0 showed "removed" in the space where sdf used to be.
root@galaxy:~# mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Wed Jul 30 13:17:25 2014
     Raid Level : raid6
     Array Size : 15627548672 (14903.59 GiB 16002.61 GB)
  Used Dev Size : 3906887168 (3725.90 GiB 4000.65 GB)
   Raid Devices : 6
  Total Devices : 5
    Persistence : Superblock is persistent
  Intent Bitmap : Internal
    Update Time : Tue Mar 17 21:16:14 2015
          State : active, degraded
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
         Layout : left-symmetric
     Chunk Size : 512K
           Name : eclipse:0
           UUID : cc7dac66:f6ac1117:ca755769:0e59d5c5
         Events : 67205
    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       32        1      active sync   /dev/sdc
       4       0        0        4      removed
       3       8       48        3      active sync   /dev/sdd
       4       8       64        4      active sync   /dev/sde
       5       8       16        5      active sync   /dev/sdb
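A device table like the one above can be scanned for RaidDevice numbers that occur on more than one row. The following is only a minimal sketch: duplicate_raid_slots is a hypothetical helper name, and it assumes the column order Number, Major, Minor, RaidDevice that mdadm --detail prints here. It reports what the listing says and does not explain why mdadm reports it.

```shell
# Hypothetical helper: print every RaidDevice number that appears on
# more than one row of an `mdadm --detail` device table read from stdin.
# Assumes the column order Number, Major, Minor, RaidDevice, State.
duplicate_raid_slots() {
    awk '
        # Keep only rows whose first four fields are all integers.
        # This skips the header, the "Key : value" lines, and spare
        # rows, which show "-" in the RaidDevice column.
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ &&
        $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ {
            seen[$4]++
        }
        END {
            for (slot in seen)
                if (seen[slot] > 1)
                    print slot
        }
    ' | sort -n
}

# Device table copied from the output above.
duplicate_raid_slots <<'EOF'
    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       32        1      active sync   /dev/sdc
       4       0        0        4      removed
       3       8       48        3      active sync   /dev/sdd
       4       8       64        4      active sync   /dev/sde
       5       8       16        5      active sync   /dev/sdb
EOF
# prints 4
```

On a live system the same function could be fed directly, e.g. mdadm --detail /dev/md0 | duplicate_raid_slots.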
For some reason the RaidDevice of the "removed" device now matches one that is active. Anyway, let's try adding the previous device (now known as /dev/sdm), because that was the original intent:
root@galaxy:~# mdadm --add /dev/md0 /dev/sdm
mdadm: added /dev/sdm
root@galaxy:~# mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Wed Jul 30 13:17:25 2014
     Raid Level : raid6
     Array Size : 15627548672 (14903.59 GiB 16002.61 GB)
  Used Dev Size : 3906887168 (3725.90 GiB 4000.65 GB)
   Raid Devices : 6
  Total Devices : 6
    Persistence : Superblock is persistent
  Intent Bitmap : Internal
    Update Time : Tue Mar 17 21:19:30 2015
          State : active, degraded
 Active Devices : 5
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 1
         Layout : left-symmetric
     Chunk Size : 512K
           Name : eclipse:0
           UUID : cc7dac66:f6ac1117:ca755769:0e59d5c5
         Events : 67623
    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       32        1      active sync   /dev/sdc
       4       0        0        4      removed
       3       8       48        3      active sync   /dev/sdd
       4       8       64        4      active sync   /dev/sde
       5       8       16        5      active sync   /dev/sdb
       6       8      192        -      spare   /dev/sdm
As you can see, the device shows up as a spare and refuses to sync with the rest of the array:
root@galaxy:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdm[6](S) sdb[5] sda[0] sde[4] sdd[3] sdc[1]
      15627548672 blocks super 1.2 level 6, 512k chunk, algorithm 2 [6/5] [UU_UUU]
      bitmap: 17/30 pages [68KB], 65536KB chunk
unused devices:
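In the status string at the end of the md0 line, each character stands for one array slot in order: U is a working member and _ is a slot with no working member. Read that way, [UU_UUU] points at slot 2, which also fits the bracketed indices on the member list (0, 1, 3, 4, 5 are present), even though mdadm --detail lists the removed entry with RaidDevice 4. The sketch below decodes such a string. missing_slots is a hypothetical helper name, and the function uses POSIX sh only.

```shell
# Hypothetical helper: print the zero-based slot numbers marked "_"
# in an mdstat status string such as "[UU_UUU]", separated by spaces.
missing_slots() {
    status=${1#\[}           # drop the leading "["
    status=${status%\]}      # drop the trailing "]"
    slot=0
    out=""
    while [ -n "$status" ]; do
        rest=${status#?}             # everything after the first character
        first=${status%"$rest"}      # the first character itself
        if [ "$first" = "_" ]; then
            out="${out:+$out }$slot"
        fi
        status=$rest
        slot=$((slot + 1))
    done
    printf '%s\n' "$out"
}

missing_slots "[UU_UUU]"    # prints 2
```

This only reads the status string; it does not query the array, so it is safe to run on pasted output.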
I have also tried using mdadm --zero-superblock /dev/sdm before adding, with the same result.
The reason I am using RAID 6 is to provide high availability. I will not accept stopping /dev/md0 and re-assembling it with --assume-clean or similar as workarounds to resolve this. This needs to be resolved online, otherwise I don't see the point of using mdadm.