
I created a server setup with UEFI boot and encrypted LVM partitions, as you can see here:

root@debian:~# lsblk -o name,uuid,type,size
NAME                UUID                                   TYPE   SIZE
sda                                                        disk    52G
|-sda1              6EE2-5855                              part   512M
|-sda2              2dcfe94a-3c21-e08f-805c-8ef32eabf50f   part   954M
| `-md128           8e000041-b831-4aea-b46d-85efdcdd5371   raid1  953M
`-sda3              a6172091-d026-463f-ac1b-8edfe2419cb8   part  50.5G
  `-md129           429397a0-8620-4b87-ac2d-aec37bd26b36   raid1 50.1G
    `-md129_crypt   YR7iyx-6ELU-Vg6F-bPtB-mSOA-LYJw-JQYPpK crypt   50G
      |-vgroot-main ddf156a7-3b64-4c9e-a9cc-eb15755f1995   lvm     14G
      |-vgroot-swap 0a1af000-c578-4d94-a90a-2cd685409f99   lvm    1.9G
      `-vgroot-vm0                                         lvm     14G
sdb                                                        disk    52G
|-sdb1              03F8-4D52                              part   512M
|-sdb2              2dcfe94a-3c21-e08f-805c-8ef32eabf50f   part   954M
| `-md128           8e000041-b831-4aea-b46d-85efdcdd5371   raid1  953M
`-sdb3              a6172091-d026-463f-ac1b-8edfe2419cb8   part  50.5G
  `-md129           429397a0-8620-4b87-ac2d-aec37bd26b36   raid1 50.1G
    `-md129_crypt   YR7iyx-6ELU-Vg6F-bPtB-mSOA-LYJw-JQYPpK crypt   50G
      |-vgroot-main ddf156a7-3b64-4c9e-a9cc-eb15755f1995   lvm     14G
      |-vgroot-swap 0a1af000-c578-4d94-a90a-2cd685409f99   lvm    1.9G
      `-vgroot-vm0                                         lvm     14G
sdc                                                        disk     8G
`-sdc1              236c1834-cc33-9303-0905-1bd4de8c1399   part     8G
  `-md127           25cdc3e1-d3cf-4592-9db9-538e0c9205bd   raid1    8G
sdd                                                        disk     8G
`-sdd1              236c1834-cc33-9303-0905-1bd4de8c1399   part     8G
  `-md127           25cdc3e1-d3cf-4592-9db9-538e0c9205bd   raid1    8G
sr0                                                        rom   1024M

The setup is based on Debian 10. The partitions are mounted as follows:

root@debian:~# df
Filesystem              1K-blocks    Used Available Use% Mounted on
udev                      2003472       0   2003472   0% /dev
tmpfs                      404052    5564    398488   2% /run
/dev/mapper/vgroot-main  14647296 1041884  11566020   9% /
tmpfs                     2020252       0   2020252   0% /dev/shm
tmpfs                        5120       0      5120   0% /run/lock
tmpfs                     2020252       0   2020252   0% /sys/fs/cgroup
/dev/md128                 960504   51472    860240   6% /boot
/dev/sdb1                  523248   10432    512816   2% /boot/efi
tmpfs                      404048       0    404048   0% /run/user/0

The root filesystem lives on vgroot-main. It must be decrypted with LUKS at boot and is also mirrored with RAID1.

root@debian:~# mdadm --detail /dev/md/129
/dev/md/129:
           Version : 1.2
     Creation Time : Thu Oct  3 18:53:03 2019
        Raid Level : raid1
        Array Size : 52487168 (50.06 GiB 53.75 GB)
     Used Dev Size : 52487168 (50.06 GiB 53.75 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

       Update Time : Fri Oct  4 21:55:24 2019
             State : clean 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : resync

              Name : debian:129  (local to host debian)
              UUID : a6172091:d026463f:ac1b8edf:e2419cb8
            Events : 517

    Number   Major   Minor   RaidDevice State
       3       8        3        0      active sync   /dev/sda3
       2       8       19        1      active sync   /dev/sdb3

The setup works fine. But when I remove one disk, the system can't mount the root filesystem. After a timeout it drops to the initramfs shell. Surprisingly, md129 is then shown as inactive and as raid0.

mdadm output

Does anybody know why the kernel assembles the RAID as level 0, although the partition sda3 itself is displayed as raid1?

boot output

user543229

2 Answers


This is probably an instance of the same problem described in this Bugzilla report. It is due to a bad interaction between dracut and the systemd unit that should assemble the root array.

Be sure to update your system, as the problem can/should be fixed by a dracut update. As a workaround, try passing rd.retry=30 on the kernel command line (at the GRUB prompt).
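For example, to make the workaround persistent, the parameter can be added to the default GRUB command line (a sketch assuming GRUB2 on Debian; note that rd.retry is a dracut parameter, so it has no effect if the initramfs was built with initramfs-tools instead of dracut):

```shell
# One-time test: at the GRUB menu press 'e', append rd.retry=30
# to the line starting with "linux", then boot with Ctrl-x.

# Persistent: edit /etc/default/grub and extend the default cmdline:
GRUB_CMDLINE_LINUX_DEFAULT="quiet rd.retry=30"

# Then regenerate the GRUB configuration:
update-grub
```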

Additional details, taken from the Bugzilla report, explaining the event sequence:

  • mdadm --incremental will not start/run an array which is unexpectedly found degraded;
  • dracut should force-start the array after 2/3 of the timeout value has passed. With the current RHEL default, this amounts to 180/3*2 = 120s;
  • systemd expects to mount the root filesystem within at most 90s. If it does not succeed, it aborts the dracut script and drops to an emergency shell. As 90s is lower than the dracut timeout, dracut never gets a chance to force-start the array.

Lowering the rd.retry timeout (setting it as the man page suggests) lets dracut force-start the array in time, allowing the root mount to succeed.

shodanshok

In the end I assembled the RAID manually in the initramfs shell. After exiting the shell, the kernel continued the boot. I couldn't find a fix for the initial behavior, but I now know how to handle it.
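For reference, the manual recovery looks roughly like this (a sketch run at the (initramfs) prompt, using the array, crypt, and VG names from the question; your setup may need slightly different names):

```shell
# Force-run the degraded array, which was assembled but left inactive:
mdadm --run /dev/md129

# Unlock the LUKS container sitting on top of the array:
cryptsetup luksOpen /dev/md129 md129_crypt

# Activate the LVM volume group that holds the root LV:
vgchange -ay vgroot

# Leave the initramfs shell; init then continues the normal boot:
exit
```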