
I find that, when I start or restart my system, my MD array often comes up without all of its drives, even though the device names haven't changed at all.

I have to stop the array and reassemble it, and then it starts and mounts as it should. Why is md so willing to start the array without all of the devices? It's been this way for about a year (the system only gets restarted every month or two).
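For reference, the stop-and-reassemble step looks roughly like the sketch below. It is not a definitive procedure: the member list `/dev/sd[e-i]1` is taken from the boot log further down, the mount point is an assumption, and the `RUN_MDADM` guard is only a copy-paste safety so the commands don't fire on the wrong machine.

```shell
#!/bin/sh
# Manual recovery sketch (run as root). Member names /dev/sd[e-i]1 come from
# the boot log in the question; adjust for your system.
# RUN_MDADM=1 arms the commands; by default this script does nothing.
if [ "${RUN_MDADM:-0}" = "1" ]; then
    mdadm --stop /dev/md0                      # tear down the partial array
    mdadm --assemble /dev/md0 /dev/sd[e-i]1    # reassemble from superblocks
    mount /dev/md0 /mnt/raid                   # mount point is an assumption
fi
echo "recovery sketch loaded (set RUN_MDADM=1 to arm)"
```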

There doesn't appear to be any MD-related logging from before the shutdown, but the following appears to be from during the startup:

May 24 11:48:20 dustinhub mdadm[1843]: DeviceDisappeared event detected on md device /dev/md/0
May 24 11:54:38 dustinhub kernel: md: md0 stopped.
May 24 11:54:38 dustinhub kernel: md: unbind<sdh1>
May 24 11:54:38 dustinhub kernel: md: export_rdev(sdh1)
May 24 11:54:38 dustinhub kernel: md: unbind<sdi1>
May 24 11:54:38 dustinhub kernel: md: export_rdev(sdi1)
May 24 11:54:38 dustinhub kernel: md: unbind<sdc1>
May 24 11:54:38 dustinhub kernel: md: export_rdev(sdc1)
May 24 11:54:38 dustinhub kernel: md: unbind<sdb1>
May 24 11:54:38 dustinhub kernel: md: export_rdev(sdb1)
May 24 11:54:38 dustinhub kernel: md: unbind<sdd1>
May 24 11:54:38 dustinhub kernel: md: export_rdev(sdd1)
May 24 11:54:39 dustinhub kernel: md: md0 stopped.
May 24 11:54:39 dustinhub kernel: md: bind<sdf1>
May 24 11:54:39 dustinhub kernel: md: bind<sdg1>
May 24 11:54:39 dustinhub kernel: md: bind<sdh1>
May 24 11:54:39 dustinhub kernel: md: bind<sdi1>
May 24 11:54:39 dustinhub kernel: md: bind<sde1>
May 24 11:54:39 dustinhub kernel: md/raid:md0: device sde1 operational as raid disk 0
May 24 11:54:39 dustinhub kernel: md/raid:md0: device sdi1 operational as raid disk 4
May 24 11:54:39 dustinhub kernel: md/raid:md0: device sdh1 operational as raid disk 3
May 24 11:54:39 dustinhub kernel: md/raid:md0: device sdg1 operational as raid disk 2
May 24 11:54:39 dustinhub kernel: md/raid:md0: device sdf1 operational as raid disk 1
May 24 11:54:39 dustinhub kernel: md/raid:md0: allocated 0kB
May 24 11:54:39 dustinhub kernel: md/raid:md0: raid level 5 active with 5 out of 5 devices, algorithm 2
May 24 11:54:39 dustinhub kernel: md0: detected capacity change from 0 to 12002359508992
May 24 11:54:39 dustinhub mdadm[1843]: NewArray event detected on md device /dev/md0
May 24 11:54:39 dustinhub kernel:  md0: unknown partition table
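The lines above were pulled from the kernel log. A hedged sketch of the kind of query that surfaces them (journalctl is assumed, with `dmesg` as a fallback; the grep pattern is just an approximation of the md message prefixes):

```shell
#!/bin/sh
# Pull md-related kernel messages for the current boot. journalctl is assumed;
# fall back to dmesg where there is no journal. Output may be empty in a
# container or on a system without md arrays, which is fine.
{ journalctl -k -b 2>/dev/null || dmesg 2>/dev/null; } \
    | grep -Ei 'md[0-9/:]|mdadm' || true
echo "md log scan finished"
```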
Dustin Oprea
  • Any chance that on shutdown/restart the md array does not get stopped cleanly? Any errors on screen? Any changes made to the init scripts? – Dan May 24 '15 at 18:05
  • @Dan Are you asking whether it was a hard shutdown, whether it was unmounted cleanly, or whether there was some uncleanliness after unmount and before stop? I can't say for sure. Just this last night my system was shut down unexpectedly (I suspect the UPS has started to fail in the last day) but all of the other times I've only done proper shutdowns. I *did* find some MD logging during the startup. I added it above. – Dustin Oprea May 24 '15 at 18:20
  • A log when the md does not start with all drives would be nice – Dan May 25 '15 at 19:15
  • I included all of the pertinent, MD-related logs in the question, above. As mentioned, that's from the startup when the array was automatically, but incompletely, assembled. – Dustin Oprea May 25 '15 at 23:43
  • Is your array listed in /etc/mdadm/mdadm.conf? Are the partitions which compose the raid of type 0xfd (linux raid autodetect)? – Dan May 27 '15 at 17:55
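The checks Dan suggests in the last comment can be run with something like the sketch below. The `mdadm.conf` paths are the two common locations (Debian-style and the older default), and the commands are guarded so they are harmless on machines without mdadm. Note that partition type 0xfd only matters for in-kernel autodetection of 0.90-metadata arrays; arrays with 1.x metadata are assembled from userspace instead.

```shell
#!/bin/sh
# Check whether the array is declared in mdadm.conf (two common paths tried)
# and list what mdadm can see in the member superblocks.
for f in /etc/mdadm/mdadm.conf /etc/mdadm.conf; do
    [ -r "$f" ] && grep -H '^ARRAY' "$f"
done
# 0xfd (linux raid autodetect) is only used for in-kernel autodetect of
# 0.90-metadata arrays; 1.x-metadata arrays are assembled by userspace tools.
if command -v mdadm >/dev/null 2>&1; then
    mdadm --examine --scan 2>/dev/null
fi
echo "config check finished"
```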

0 Answers