I'm replacing the drives in a RAID5. sdb failed and I replaced it; that went well. Now I'm replacing sda, but after being added to the array it shows as a spare and doesn't sync.
Steps followed:
- mdadm --manage /dev/md127 --fail /dev/sda1
- mdadm --manage /dev/md127 --remove /dev/sda1
- Turn off the system. Replace the sda drive.
- Partition as GPT using parted, with the raid flag set.
- mdadm --manage /dev/md127 --add /dev/sda1
- cat /proc/mdstat to check that we're syncing. We're not.
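The steps above can be sketched as a dry run. This is only a minimal sketch: it prints the commands from the list instead of running them, so the sequence can be reviewed before touching the array. print_replace_steps is a hypothetical helper name (not part of mdadm), and the defaults are the array and member names from this post.

```shell
#!/bin/sh
# Dry-run sketch: prints the replacement sequence, runs nothing.
# ARRAY and MEMBER default to the names used in this post.
ARRAY="${ARRAY:-/dev/md127}"
MEMBER="${MEMBER:-/dev/sda1}"

# print_replace_steps is a hypothetical helper, not an mdadm feature.
print_replace_steps() {
    array=$1
    member=$2
    echo "mdadm --manage $array --fail $member"
    echo "mdadm --manage $array --remove $member"
    echo "# power off, swap the drive, partition it as GPT with the raid flag"
    echo "mdadm --manage $array --add $member"
    echo "cat /proc/mdstat"
}

print_replace_steps "$ARRAY" "$MEMBER"
```

Overriding ARRAY or MEMBER in the environment prints the same sequence for a different array or partition.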
I can't figure out why we're not syncing. Any help would be appreciated. Output is shown below.
RAID details:
[me@me /]# mdadm --detail /dev/md127
/dev/md127:
Version : 1.1
Creation Time : Mon Oct 22 16:20:37 2012
Raid Level : raid5
Array Size : 1953518592 (1863.02 GiB 2000.40 GB)
Used Dev Size : 976759296 (931.51 GiB 1000.20 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Mon Nov 7 18:33:10 2016
State : clean, degraded
Active Devices : 3
Working Devices : 4
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 512K
Delta Devices : 1, (3->4)
Name : meme:0
UUID : 28cf18b6:b05b9701:5d28754b:c387cb95
Events : 247368
Number   Major   Minor   RaidDevice   State
   3       8      49         0        active sync   /dev/sdd1
   2       0       0         2        removed
   4       8      33         2        active sync   /dev/sdc1
   6       8      17         3        active sync   /dev/sdb1
   5       8       1         -        spare         /dev/sda1
MD stat:
[me@me /]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sda1[5](S) sdc1[4] sdd1[3] sdb1[6]
1953518592 blocks super 1.1 level 5, 512k chunk, algorithm 2 [4/3] [U_UU]
bitmap: 4/8 pages [16KB], 65536KB chunk
unused devices: <none>
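For checking the same thing without reading the output by eye, here is a rough sketch that summarises /proc/mdstat-style text. It assumes the layout shown above: an array header line with members such as sda1[5](S), followed by a status line containing a counter like [4/3]. mdstat_summary is a hypothetical helper name, and other kernel versions may format these lines differently.

```shell
#!/bin/sh
# mdstat_summary (hypothetical helper) reads /proc/mdstat-style text on stdin
# and prints one line per array: name, expected and active member counts,
# the number of members flagged (S) as spare, and ok/degraded.
mdstat_summary() {
    awk '
        # Array header, e.g. "md127 : active raid5 sda1[5](S) sdc1[4] ..."
        /^md[0-9]+ :/ {
            name = $1
            spares = 0
            for (i = 3; i <= NF; i++)
                if ($i ~ /\(S\)$/) spares++
        }
        # Status line carries "[4/3]": 4 expected members, 3 active.
        name != "" && match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            state = (n[1] == n[2]) ? "ok" : "degraded"
            printf "%s expected=%s active=%s spares=%d %s\n", name, n[1], n[2], spares, state
            name = ""
        }
    '
}

# Example with the two relevant lines from the output in this post.
mdstat_summary <<'EOF'
md127 : active raid5 sda1[5](S) sdc1[4] sdd1[3] sdb1[6]
      1953518592 blocks super 1.1 level 5, 512k chunk, algorithm 2 [4/3] [U_UU]
EOF
# prints: md127 expected=4 active=3 spares=1 degraded
```

On a live system the equivalent call would be mdstat_summary < /proc/mdstat. A line that stays at spares=1 with a degraded state is the situation described in this question.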
Update
There is data on the RAID. It's currently mounted and being used. I don't mind disabling it while I'm replacing the drives, but I'd like to preserve the data on the array.
Update
Worked around the headache: rsynced the contents of the degraded array to a backup drive, replaced the RAID with fresh-baked drives, then rsynced back. In the process, we learned the value of scheduling automated incremental backups (as opposed to the schedule being based on "Hey, when was the last time we backed up the system? Oh God, do it now!").