
Recently, after a reboot, the system landed in Emergency Mode. The cause appears to be that the system is failing to mount some of the disks defined in /etc/fstab. These disks are LVM Logical Volumes.

Of the 6 Logical Volumes on the server, 3 are working fine and 3 are not starting up. All of these are on the same PV (Physical Volume) and in the same Volume Group.

Some relevant errors from `journalctl -xb` (repeated for each of the 3 failed LVs):

```
Job dev-mapper-cloudlinux\x2dvar.device/start timed out.
Timed out waiting for device dev-mapper-cloudlinux\x2dtmp.device
```
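
For context, systemd derives those `.device` unit names from the `/dev/mapper` paths that /etc/fstab points at. A rough illustration follows; the actual fstab contents, mount points and filesystem types are assumptions, since they are not shown in the question:

```
# The \x2d in the unit name is just an escaped hyphen; systemd-escape shows
# how /dev/mapper/cloudlinux-var maps to the unit that timed out:
systemd-escape -p /dev/mapper/cloudlinux-var   # -> dev-mapper-cloudlinux\x2dvar

# Hypothetical fstab entries behind those units (filesystem type and
# options are assumptions):
grep cloudlinux /etc/fstab
# /dev/mapper/cloudlinux-var  /var  xfs  defaults  0 0
# /dev/mapper/cloudlinux-tmp  /tmp  xfs  defaults  0 0
```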

The `lvscan` and `lvdisplay` commands show these LVs as "NOT available" (screenshot: pv, vg and lv status).

Running `lvchange -ay` on an affected LV produces no error (or any output at all), but the LV remains NOT available. Similarly, running `vgchange -ay cloudlinux` only reports `2 logical volume(s) in volume group "cloudlinux" now active` (screenshot: output from lvchange and vgchange).
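
For the record, the status checks and activation attempts above look roughly like this on the console. `var` stands in for any of the failed LVs named in the journal errors; the third failed LV is not named in the question, and the paths follow the standard /dev/<vg>/<lv> naming:

```
# Status as seen in the screenshots: PV / VG / LV overview
pvs
vgs cloudlinux
lvscan                         # failed LVs show as "inactive"
lvdisplay /dev/cloudlinux/var  # reports "LV Status   NOT available"

# Activation attempts: a single LV, then the whole VG
lvchange -ay cloudlinux/var    # produced no output, LV stayed inactive
vgchange -ay cloudlinux        # reported only some LVs as active
lvscan                         # re-check which LVs actually came up
```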

Booting from a CentOS Live Disk into recovery mode mounts the volumes with no issue and all files are present. `fsck` reports no errors on the volumes (screenshot: lvscan output from recovery mode).
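
Roughly what the recovery-mode check looked like; the device paths are assumed from the standard naming, and since the filesystem type isn't stated, a plain read-only fsck is shown (an XFS filesystem would use xfs_repair -n instead):

```
# From the CentOS live environment: find and activate the volume group
vgscan
vgchange -ay cloudlinux
lvscan                          # all LVs show as ACTIVE here

# Read-only check and a test mount of one previously failing LV
fsck -n /dev/cloudlinux/var
mkdir -p /mnt/var
mount /dev/cloudlinux/var /mnt/var
```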

I also tried booting into an older kernel from the boot menu. This did not help (the LVs still could not be activated).

  • Check `dmesg` for anything interesting or unusual. – Michael Hampton Oct 16 '20 at 01:25
  • I can't see anything useful in `dmesg`. I have loaded the output [here](http://dev2.webbird.net.au/dmesg2.txt). Note that all Logical Volumes are on the partition `xvda2`. – Pierowheelz Oct 16 '20 at 01:57
  • After a second look, perhaps this line is relevant: `[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.10.0-962.3.2.lve1.5.27.el7.x86_64 root=/dev/mapper/cloudlinux-root ro crashkernel=auto rd.lvm.lv=cloudlinux/root rd.lvm.lv=cloudlinux/swap rd.lvm.lv=cloudlinux/usr rhgb quiet LANG=en_GB.UTF-8` The three LVs mentioned here are the working ones. I have no idea where this comes from, though, or what it does. – Pierowheelz Oct 16 '20 at 02:01
  • OK, so adding the other 3 LVs to the `GRUB_CMDLINE_LINUX` parameter in `/etc/default/grub`, then running `grub2-mkconfig -o /boot/grub2/grub.cfg`, allows the server to boot normally and all LVs are connected (see the sketch after these comments). Strangely, other working servers do not have all LVs listed in their GRUB parameters. I guess this is sort of solved, but I would still like to know what went wrong before we put the server back into production. – Pierowheelz Oct 16 '20 at 02:40
  • You should just be able to use `rd.auto` to autodetect all storage each boot, but this adds a few seconds to the boot time. – Michael Hampton Oct 16 '20 at 02:45
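
For completeness, a sketch of the workaround described in the comments above, plus the `rd.auto` alternative. The existing parameters are reconstructed from the kernel command line quoted above; only `var` and `tmp` are known failed LV names from the journal errors, so the third failed LV would need to be added the same way:

```
# /etc/default/grub: append the missing LVs to the kernel command line
# (third failed LV not named in the question; add it like var and tmp)
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=cloudlinux/root rd.lvm.lv=cloudlinux/swap rd.lvm.lv=cloudlinux/usr rd.lvm.lv=cloudlinux/var rd.lvm.lv=cloudlinux/tmp rhgb quiet"

# Alternatively, replace the rd.lvm.lv entries with rd.auto so dracut
# autodetects all storage each boot (at the cost of a few extra seconds).

# Then regenerate the GRUB configuration (note the plain hyphen in -o)
grub2-mkconfig -o /boot/grub2/grub.cfg
```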
