
I mirrored two disks in a USB enclosure on my Arch box, created a filesystem, mounted it, copied a bunch of data over. Everything was great. Then I waited until the md sync was done and rebooted...

# lsblk
sdd 8:48 0 7.3T 0 disk
`-sdd1 8:49 0 7.3T 0 part
  `-md127 9:127 0 2.7T 0 raid1
sdf 8:80 0 2.7T 0 disk
`-sdf1 8:81 0 2.7T 0 part
  `-md127 9:127 0 2.7T 0 raid1
sdg 8:96 0 2.7T 0 disk
`-sdg1 8:97 0 2.7T 0 part

(irrelevant drives omitted)

Yesterday, the mirrored devices were sdd (2.7T) and sde (2.7T). After the reboot, the system reordered the devices so that the drives in the USB enclosure are now sdf (2.7T) and sdg (2.7T), but mdraid decided to use sdd (7.3T) and sdf (2.7T).
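One thing I can check is which partitions actually carry the array's superblock. If I'm reading the mdadm man page right, --examine against the candidate members should print each one's Array UUID (device names here are from this boot's ordering):

# mdadm --examine /dev/sdd1 /dev/sdf1 /dev/sdg1 | grep -E '^/dev/|Array UUID'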

One search result (here) suggested rebuilding the initramfs, but its example commands don't apply to Arch. My best guess at the equivalent, found in the Arch wiki, was "mkinitcpio -p linux". After running that and rebooting, nothing changed. I'm a bit iffy on this procedure, though, so I may have performed it incorrectly, or it might not be the problem.
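For the record, my understanding of the Arch procedure (this is my reading of the wiki's RAID page, not gospel): the mdadm_udev hook is what pulls md assembly (and mdadm.conf, if present) into the initramfs, so it has to appear in HOOKS, and mkinitcpio -P regenerates the images for all installed presets:

$ grep ^HOOKS /etc/mkinitcpio.conf
# mkinitcpio -P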

What's the best way of telling mdadm which partitions it should assemble at every boot? I'm assuming that on each reboot the USB enclosure will present a different device order for whatever reason, so I can't just hard-code the /dev/sd* devices. I thought the UUID listed in mdadm.conf and the superblocks on the mirrored drives would magically handle changes like this, but I'm obviously missing an important step somewhere along the way.
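In principle the UUID should be enough on its own. If I'm reading mdadm(8) correctly, I can even pin a manual reassembly to it, since --uuid excludes any listed partition whose superblock doesn't match (a sketch only; the device list is a guess based on this boot's lsblk):

# mdadm --stop /dev/md127
# mdadm --assemble /dev/md/x --uuid=94dfdaa8:c6a75958:5dedf8ff:6b1926bb /dev/sd[dfg]1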

LOTS MORE DETAIL BELOW

$ uname -a
Linux zot 5.3.10-arch1-1 #1 SMP PREEMPT Sun, 10 Nov 2019 11:29:38 +0000 x86_64 GNU/Linux

$ lsb_release -a
LSB Version: 1.4
Distributor ID: Arch
Description: Arch Linux
Release: rolling
Codename: n/a

Created a mirrored volume with the following commands:

# mdadm --misc --zero-superblock /dev/sdd1
# mdadm --misc --zero-superblock /dev/sde1
# mdadm --create --verbose --level=1 --metadata=1.2 --raid-devices=2 /dev/md/x /dev/sdd1 /dev/sde1
# mdadm --detail --scan >> /etc/mdadm.conf
# mdadm --assemble --scan
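(In hindsight, I could have used stable names instead of raw /dev/sd* nodes when creating the array; the /dev/disk/by-id/ symlinks don't change with probe order. Something like this shows which symlinks pointed at the member partitions at the time:)

$ ls -l /dev/disk/by-id/ | grep -E 'sdd1|sde1'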

Broke my brain a bit trying to do the math for the filesystem; I probably should have ignored that part of the Arch wiki:

# mkfs.ext4 -v -L x -b 4096 -E stride=16,stripe-width=32 /dev/md/x
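(As far as I can tell now, the stride/stripe-width math only matters for striped levels like RAID0/5/6; a RAID1 mirror has no striping, so the plain invocation should have been equivalent:)

# mkfs.ext4 -v -L x /dev/md/x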

Mounted Drive X:

# mount /dev/md/x /mnt/x

While it was syncing, I did a test copy of a vast chunk of my data to /mnt/x.

Everything was fine; I went to sleep while it spent a few hours finishing the sync, checked that it was okay, and rebooted. Now the array exists and seems healthy, but there is no longer a mountable filesystem.

$ grep -v "^[#]" /etc/mdadm.conf
DEVICE partitions
ARRAY /dev/md/x metadata=1.2 name=zot:x UUID=94dfdaa8:c6a75958:5dedf8ff:6b1926bb
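As a sanity check, the runtime scan should agree with that ARRAY line:

# mdadm --detail --scan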

$ dmesg | grep md[0-9]
[ 19.975336] md/raid1:md127: active with 2 out of 2 mirrors
[ 20.017939] md127: detected capacity change from 0 to 3000456183808

# cat /proc/mdstat
Personalities : [raid1]
md127 : active raid1 sdf1[1] sdd1[0]
2930132992 blocks super 1.2 [2/2] [UU]
bitmap: 0/22 pages [0KB], 65536KB chunk

unused devices: <none>
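To see whether any filesystem signature survives on the assembled array, blkid and dumpe2fs look like the right probes (both read-only, as far as I know):

# blkid /dev/md127
# dumpe2fs -h /dev/md127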

# cat /proc/partitions
major minor #blocks name
<SNIPPED>
9 127 2930132992 md127

# mdadm --detail /dev/md/x
/dev/md/x:
Version : 1.2
Creation Time : Sun Nov 17 12:54:34 2019
Raid Level : raid1
Array Size : 2930132992 (2794.39 GiB 3000.46 GB)
Used Dev Size : 2930132992 (2794.39 GiB 3000.46 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent

Intent Bitmap : Internal

Update Time : Mon Nov 18 01:37:17 2019
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

Consistency Policy : bitmap

Name : zot:x (local to host zot)
UUID : 94dfdaa8:c6a75958:5dedf8ff:6b1926bb
Events : 15186

Number Major Minor RaidDevice State
0 8 49 0 active sync /dev/sdd1
1 8 81 1 active sync /dev/sdf1

# parted /dev/md/x print
Error: /dev/md/x: unrecognised disk label
Model: Linux Software RAID Array (md)
Disk /dev/md/x: 3000GB
Sector size (logical/physical): 512B/4096B
Partition Table: unknown
Disk Flags:
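If only the primary ext4 superblock got clobbered, the backups might still be intact. My understanding is that mke2fs -n is a dry run that merely prints where the backup superblocks would live for these parameters, and e2fsck -n -b can then probe one read-only (the 32768 below is the usual first backup location for 4096-byte blocks, not something I've verified on this array):

# mke2fs -n -b 4096 /dev/md/x
# e2fsck -n -b 32768 /dev/md/x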

# mount /dev/md/x /mnt/x
mount: /mnt/x: wrong fs type, bad option, bad superblock on /dev/md127, missing codepage or helper program, or other error.
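The kernel usually logs why ext4 refused the mount; checking right after the failed attempt should show the specific complaint:

# dmesg | tail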

The applicable processes currently running are md and md127_raid1. (/dev/md/x is equivalent to /dev/md127.)

(It was at this point that I thought to run lsblk, hence the output at the top of this post.)
