I just upgraded from Debian 9 to Debian 10 and have run into a weird issue. It is probably not too weird to people who do more intensive sysadmin work, which is why I am bringing it here:

During the whole upgrade procedure, I also upgraded my kernel to this:

$ dpkg -l | grep linux-image
ii  linux-image-4.19.0-12-amd64       4.19.152-1                                       amd64        Linux 4.19 for 64-bit PCs (signed)
ii  linux-image-amd64                 4.19+105+deb10u7                                 amd64        Linux for 64-bit PCs (meta-package)

However, when rebooting, the server just doesn't come online and instead triggers Kimsufi's (OVH redistributor/reseller) automated warning that there is an issue with the server (as if I wouldn't know that after waiting an hour for a reboot...). Their automated response reboots the server into a "rescue mode", or more specifically, into this:

$ uname -r
4.19.62-mod-std-ipv6-64-rescue

Obviously, the kernel I have installed (4.19.0) is actually older than their rescue kernel (4.19.62). But I doubt this is the issue. ...or is it?

How can I, while booted into the rescue kernel, figure out what prevented my machine from booting? /var/log/boot.log doesn't exist, and messages only has log entries from the rescue kernel boots - none hinting at an attempt to boot mine.

For completeness, here is a gist with the GRUB config that was generated: https://gist.github.com/IngwiePhoenix/315df5d75551ce1f4d5f61e34fdb9956

Because the partition layout matters for booting:

Device     Boot      Start        End    Sectors  Size Id Type
/dev/sda1  *          4096    1050623    1046528  511M 83 Linux
/dev/sda2          1050624 3863281250 3862230627  1.8T 83 Linux
/dev/sda4       3905972224 3907018751    1046528  511M 82 Linux swap / Solaris
  • Have you tried mounting your filesystem to check for `/var/log/boot.log`? – Ginnungagap Nov 21 '20 at 01:04
  • You need to look at the console but I don't know if Kimsufi gives you access to that. – Michael Hampton Nov 21 '20 at 01:15
  • @Ginnungagap I did. After booting, since the main drive is meant to be the mounted device anyway, I looked for `/var/log/boot.log` but couldn't find it. I did however find out what caused the issue... – Ingwie Phoenix Nov 21 '20 at 07:12
  • @MichaelHampton No "remote hands" kinda feature from them - just that you can literally boot with your root drive hooked and a different kernel. The booting part was what caused the issue, actually. See my answer below. – Ingwie Phoenix Nov 21 '20 at 07:13
  • "Obviously, the kernel I have installed (4.19.0) is actually older ". Actually it's a 4.19.152 kernel. Debian names it like this for ABI compatibility purposes (-12 is the 12th (or 11th) ABI change), so upgrading the kernel doesn't always require upgrading or recompiling external kernel modules. – A.B Nov 21 '20 at 12:06

1 Answer

After a lot of looking around, I kept singling out GRUB more and more - and eventually came across the /etc/default/grub file and realized: GRUB had saved a default boot entry - one that was gone!
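For anyone checking the same thing, here is a minimal sketch of how the configured default could be read back. The function name `grub_default_setting` is made up for illustration, and the answer does not record which value was actually stored, so treat this as one way to look rather than the exact steps taken.

```shell
# Hypothetical helper: print the active GRUB_DEFAULT value from a GRUB
# defaults file (commented-out lines are ignored).
# Usage: grub_default_setting [FILE]
grub_default_setting() {
    file="${1:-/etc/default/grub}"
    # Keep the last uncommented GRUB_DEFAULT= line, then strip any quotes.
    sed -n 's/^[[:space:]]*GRUB_DEFAULT=//p' "$file" | tail -n 1 | tr -d "\"'"
}

# On the affected system, something like:
#   grub_default_setting    # e.g. a number, a menu entry title, or "saved"
#   grub-editenv list       # if the value is "saved", shows the stored saved_entry
```

If the printed value names a menu entry that no longer appears in the generated GRUB config, that would match the situation described here.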

After removing the "ovhkernel", I had effectively stripped GRUB of its default boot option. So the server didn't boot at all; it just sat at a confused GRUB screen (which I couldn't see, as Kimsufi doesn't have a "remote kvm" feature).

By using their option to boot with a different kernel and use my main drive as the root drive, I was able to just re-configure GRUB and set a new default. Guess what - that was the fix.
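As a rough sketch of what "set a new default" can look like on Debian: the helper name `set_grub_default` is hypothetical, and the value `0` (first menu entry) is only an assumption, since the answer doesn't say which default was chosen.

```shell
# Hypothetical helper: set GRUB_DEFAULT in a GRUB defaults file.
# Usage: set_grub_default VALUE [FILE]
# Only meant for simple values such as 0 or saved (no "|" or "&" characters).
set_grub_default() {
    value="$1"
    file="${2:-/etc/default/grub}"
    if grep -q '^GRUB_DEFAULT=' "$file"; then
        # Replace the existing setting in place (GNU sed, as shipped by Debian).
        sed -i "s|^GRUB_DEFAULT=.*|GRUB_DEFAULT=$value|" "$file"
    else
        # No setting present yet: append one.
        printf 'GRUB_DEFAULT=%s\n' "$value" >> "$file"
    fi
}

# On the real system, as root, something like:
#   set_grub_default 0    # assumption: boot the first menu entry
#   update-grub           # regenerate /boot/grub/grub.cfg from the new setting
```

Editing /etc/default/grub alone changes nothing at boot time; the generated config only picks up the new value once `update-grub` has been run.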

So, if you update your kernel, you might want to check that GRUB's default still points at something that exists, and set a new one if not. It may save you some frustration. I wasted about 5-7 hours on this - time I could've spent much more productively - for something that was effectively fixed by running a single command...