Lost RAID

After rebooting my AWS instance, I lost the RAID (mdadm), although the disks themselves seem to be fine.

Symptoms

mdadm not working

After a machine restart (stop and start on Amazon AWS), the device /dev/md0 is not working:

[11:52:17 root :)  ]$ cat /proc/mdstat 
Personalities : 
unused devices: <none>

[12:03:09 root :) ]$ mdadm -A /dev/md0 
mdadm: no devices found for /dev/md0
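(For reference, assembly can also be attempted with an explicit member list, which bypasses the ARRAY definition in mdadm.conf; a minimal sketch, using the device names from the --create attempt below:)

# Name the member disks directly instead of relying on mdadm.conf.
mdadm -A /dev/md0 /dev/sd[defghijk]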

Disks are still considered part of a RAID

But all the RAID disks do seem to be part of an array:

[12:05:24 root :) ]$ mdadm -Q /dev/sdk
/dev/sdk: is not an md array
/dev/sdk: device 7 in 8 device undetected raid0 /dev/md0.  
          Use mdadm --examine for more detail.
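(As the message suggests, --examine reads the md superblock directly off a member disk and prints, among other fields, the array UUID recorded there; a minimal sketch, output omitted:)

# Dump the superblock stored on /dev/sdk, including the array UUID.
mdadm --examine /dev/sdk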

And an attempted --create warns that every disk already carries RAID metadata:

[11:51:40 root :) ]$ mdadm --create /dev/md0 --level=0 --raid-devices=8 \
              --chunk=1024 /dev/sd[defghijk]
mdadm: /dev/sdd appears to be part of a raid array:
    level=raid0 devices=8 ctime=Wed Oct 27 10:38:53 2010
mdadm: /dev/sde appears to be part of a raid array:
    level=raid0 devices=8 ctime=Wed Oct 27 10:38:53 2010
mdadm: /dev/sdf appears to be part of a raid array:
    level=raid0 devices=8 ctime=Wed Oct 27 10:38:53 2010
mdadm: /dev/sdg appears to be part of a raid array:
    level=raid0 devices=8 ctime=Wed Oct 27 10:38:53 2010
mdadm: /dev/sdh appears to be part of a raid array:
    level=raid0 devices=8 ctime=Wed Oct 27 10:38:53 2010
mdadm: /dev/sdi appears to be part of a raid array:
    level=raid0 devices=8 ctime=Wed Oct 27 10:38:53 2010
mdadm: /dev/sdj appears to be part of a raid array:
    level=raid0 devices=8 ctime=Wed Oct 27 10:38:53 2010
mdadm: /dev/sdk appears to be part of a raid array:
    level=raid0 devices=8 ctime=Wed Oct 27 10:38:53 2010
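(A safer alternative to --create for probing existing members is a superblock scan, which prints ready-made ARRAY lines with the UUID the disks actually carry; a minimal sketch, output omitted:)

# Scan all DEVICE entries and print an ARRAY line per detected array.
mdadm --examine --scan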

mdadm.conf (for reference)

[11:53:10 root ]$ cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 level=raid0 num-devices=8 UUID=a6c665f4:650c70af:7c32f52b:1d49233e
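(For reference, an ARRAY line matching the running array can be regenerated rather than edited by hand; a minimal sketch, assuming the stale /dev/md0 entry is deleted first and the array is assembled:)

# Append an ARRAY line with the UUID of the currently assembled array.
mdadm --detail --scan >> /etc/mdadm/mdadm.conf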

1 Answer

OK, some hotshot from my office outwitted me.

Here goes:

The problem was caused by a wrong configuration:

The file /etc/mdadm/mdadm.conf specifies the array UUID:

ARRAY /dev/md0 level=raid0 num-devices=8 
               UUID=a6c665f4:650c70af:7c32f52b:1d49233e

I checked the actual UUID of the devices (which is different):

[12:15:30 root :) ]$ vol_id /dev/sde | grep ID_FS_UUID=
ID_FS_UUID=575fee91:786ac78e:8ffa4ee6:5eade1eb
[12:17:11 root :) ]$ vol_id /dev/sdf | grep ID_FS_UUID=
ID_FS_UUID=575fee91:786ac78e:8ffa4ee6:5eade1eb
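(Note: vol_id has since been dropped from udev; on newer systems the same UUID can be read with blkid or with mdadm itself. A minimal sketch:)

# Equivalent checks where vol_id is unavailable:
blkid /dev/sde                        # libblkid view of the member disk
mdadm --examine /dev/sde | grep UUID  # UUID from the md superblock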

After changing the UUID in the configuration file, run mdadm in assemble mode:

[12:13:01 root :) ]$ mdadm -A /dev/md0 
mdadm: /dev/md0 has been started with 8 drives.
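(Once assembly succeeds, the array state can be double-checked; a minimal sketch, output omitted:)

# Confirm the array is active with all 8 members.
cat /proc/mdstat
mdadm --detail /dev/md0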