
I have a RAID1 array created with mdadm; it was rebuilding when there was a power failure. On coming back up, the array appears to be dead.

# cat /proc/mdstat
Personalities : [raid1]
md0 : inactive sdb1[2](S) sda1[3](S)
15627751424 blocks super 1.2

# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
# mdadm --assemble /dev/md1 /dev/sda1 /dev/sdb1
mdadm: /dev/md1 assembled from 0 drives and 1 rebuilding - not enough to start the array.

I don't know why it says 0 drives; I expected the 1 rebuilding, but the other drive should have been fine. I really don't want to do anything destructive by accident. What should I do to get the array back together into a state where I can run fsck on it?

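For reference, superblock dumps like the ones below come from mdadm's examine mode; a minimal sketch of the command, assuming both members are inspected (the exact invocation isn't shown in the post):

# mdadm --examine /dev/sda1 /dev/sdb1
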
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x3
     Array UUID : c4ea2289:c63bc8ce:e6fe5806:5bebe020
           Name : ******:0  (local to host ******)
  Creation Time : Thu Aug 20 20:48:53 2020
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 15627718656 (7451.88 GiB 8001.39 GB)
     Array Size : 7813859328 (7451.88 GiB 8001.39 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
Recovery Offset : 7331083264 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : ae984a41:f3e421f4:f10e1fac:d7955178

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Oct 31 17:13:53 2020
  Bad Block Log : 512 entries available at offset 40 sectors
       Checksum : 4be47968 - correct
         Events : 1079809


   Device Role : Active device 0
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : c4ea2289:c63bc8ce:e6fe5806:5bebe020
           Name : ******:0  (local to host ******)
  Creation Time : Thu Aug 20 20:48:53 2020
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 15627784192 (7451.91 GiB 8001.43 GB)
     Array Size : 7813859328 (7451.88 GiB 8001.39 GB)
  Used Dev Size : 15627718656 (7451.88 GiB 8001.39 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=65536 sectors
          State : active
    Device UUID : 0abf70e2:7e0e43ca:a22548a2:ef87e9c0

Internal Bitmap : 8 sectors from superblock
    Update Time : Sat Oct 31 17:13:31 2020
  Bad Block Log : 512 entries available at offset 40 sectors
       Checksum : 4cf65df4 - correct
         Events : 1079799


   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)

Looking at the drives with xxd, both do still seem to contain data. Is this something I could reasonably manage myself, or is it time for a data recovery company?

Edit: Wazoox's answer worked.

#  mdadm --assemble --scan --force
mdadm: forcing event count in /dev/sda1(1) from 1079799 upto 1079809
mdadm: /dev/md0 has been started with 1 drive (out of 2) and 1 rebuilding.
# cat /proc/mdstat
Personalities : [raid1]
md0 : active (auto-read-only) raid1 sdb1[3] sda1[2]
      7813859328 blocks super 1.2 [2/1] [_U]
      bitmap: 59/59 pages [236KB], 65536KB chunk
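Note that the array came back auto-read-only, which keeps the rebuild paused; the flag clears on the first write, or it can be cleared explicitly. A minimal follow-up sketch, assuming you want the resync to resume right away:

# mdadm --readwrite /dev/md0    # clear auto-read-only so the rebuild can continue
# cat /proc/mdstat              # a recovery progress line should now appear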
user5416

1 Answer


Both your drives appear to be marked as "spare". Stop the array, then try reassembling it with the scan option:

  mdadm --stop /dev/md0
  mdadm --assemble --scan

If that doesn't work, try forcing the assembly:

  mdadm --stop /dev/md0
  mdadm --assemble --scan --force

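If --scan doesn't find the members (for example, if there is no matching entry in /etc/mdadm.conf), the same thing can be done by naming the devices explicitly; a sketch assuming the device names from the question:

  mdadm --stop /dev/md0
  mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1
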
Check that everything is back online. Also, please report what /proc/mdstat looks like afterwards.
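
For example, a quick way to verify the result (assuming the array comes up as /dev/md0):

  cat /proc/mdstat
  mdadm --detail /dev/md0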

If you want to be on the safe side, disconnect one of the drives beforehand and start the array with only one drive. There are only 10 events of difference between the two drives, so they're probably both fine apart from needing a quick fsck.
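
As a software-only variant of that (a sketch, not necessarily the exact procedure intended above): assemble from just one named member and keep everything read-only while checking. This assumes /dev/sdb1 is the fully synced member (it has no Recovery Offset in the --examine output) and that the filesystem sits directly on /dev/md0:

  mdadm --stop /dev/md0
  mdadm --assemble --run --force /dev/md0 /dev/sdb1   # start degraded from a single member
  mdadm --readonly /dev/md0                           # keep it read-only while checking
  fsck -n /dev/md0                                    # non-destructive filesystem check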

wazoox
  • Thanks. Is there any possibility of this being a destructive action if there is any weird state going on? If so I might just give up and contact a data recovery company. – user5416 Nov 02 '20 at 14:32
  • @user5416 very improbable. There are only 10 events of difference between the two drives; both are almost certainly in a usable state, needing only a fsck. – wazoox Nov 02 '20 at 14:35
  • @user5416 try not to yank the power while it rebuilds this time :) – wazoox Nov 02 '20 at 14:54
  • Yeah, the problem was that the UPS ran out during the shutdown. I've since reduced the load on the UPS and made the shutdown more trigger-happy. – user5416 Nov 02 '20 at 15:00