
I am trying to upgrade several physical servers from RHEL 7.9 to RHEL 8.6 offline. So far the process has succeeded on one server. On the server with the issue, every step appears to work, but when I reboot into the upgrade the server drops to emergency mode. Rebooting from emergency mode takes me back to the server's current operating system (RHEL 7.9).
Running:
leapp preupgrade --no-rhsm --enablerepo local1 --enablerepo local2
comes back with no errors.
After that I run:
leapp upgrade --no-rhsm --enablerepo local1 --enablerepo local2
and it completes without any issues. I then reboot, and after I choose "Upgrade RHEL 8 initramfs" the system tries to run the upgrade and fails.
Here is the last part of leapp-upgrade.log:

Sep 05 04:15:52 localhost systemd[1]: Reached target System Upgrade.
Sep 05 04:15:52 localhost systemd[1]: Starting System Upgrade...
Sep 05 04:15:52 localhost upgrade[1543]: starting upgrade hook
Sep 05 04:15:52 localhost upgrade[1543]: /bin/upgrade: line 19: /sysroot/var/tmp/system-upgrade.state: No such file or directory
Sep 05 04:15:52 localhost upgrade[1546]:   WARNING: locking_type (4) is deprecated, using --sysinit --readonly.
Sep 05 04:15:52 localhost upgrade[1546]:   Allowing activation with --readonly --sysinit.
Sep 05 04:15:52 localhost upgrade[1546]:   WARNING: Couldn't find device with uuid qGbnBb-LLqH-seg7-NEX3-NPTD-veJk-KLA3SK.
Sep 05 04:15:52 localhost upgrade[1546]:   WARNING: VG rhel is missing PV qGbnBb-LLqH-seg7-NEX3-NPTD-veJk-KLA3SK (last written to /dev/sdi1).
Sep 05 04:15:52 localhost upgrade[1546]:   Refusing activation of partial LV rhel/root.  Use '--activationmode partial' to override.
Sep 05 04:15:52 localhost upgrade[1546]:   Refusing activation of partial LV rhel/data.  Use '--activationmode partial' to override.
Sep 05 04:15:52 localhost upgrade[1546]:   0 logical volume(s) in volume group "rhel" now active
Sep 05 04:15:52 localhost upgrade[1546]:   Allowing activation with --readonly --sysinit.
Sep 05 04:15:52 localhost upgrade[1546]:   2 logical volume(s) in volume group "vg01" now active
Sep 05 04:15:52 localhost upgrade[1567]: Spawning container sysroot on /sysroot.
Sep 05 04:15:52 localhost upgrade[1567]: Press ^] three times within 1s to kill container.
Sep 05 04:15:52 localhost kernel: EXT4-fs (md0): mounting ext2 file system using the ext4 subsystem
Sep 05 04:15:52 localhost kernel: EXT4-fs (md0): mounted filesystem without journal. Opts: (null)
Sep 05 04:15:52 localhost kernel: XFS (dm-1): Mounting V5 Filesystem
Sep 05 04:15:52 localhost kernel: XFS (dm-1): Ending clean mount
Sep 05 04:15:52 localhost upgrade[1570]: mount: special device /dev/mapper/rhel-data does not exist
Sep 05 04:15:52 localhost kernel: scsi 11:0:0:0: Direct-Access     CiscoVD  Hypervisor            PQ: 0 ANSI: 6
Sep 05 04:15:52 localhost kernel: sd 11:0:0:0: Attached scsi generic sg8 type 0
Sep 05 04:15:52 localhost kernel: sd 11:0:0:0: [sdi] 124727295 512-byte logical blocks: (63.9 GB/59.5 GiB)
Sep 05 04:15:52 localhost kernel: sd 11:0:0:0: [sdi] Write Protect is off
Sep 05 04:15:52 localhost kernel: sd 11:0:0:0: [sdi] Mode Sense: 17 00 00 00
Sep 05 04:15:52 localhost kernel: sd 11:0:0:0: [sdi] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Sep 05 04:15:52 localhost kernel:  sdi: sdi1
Sep 05 04:15:52 localhost kernel: sd 11:0:0:0: [sdi] Attached SCSI removable disk
Sep 05 04:15:59 localhost upgrade[1585]: ==> Processing phase `InitRamStart`
Sep 05 04:15:59 localhost upgrade[1585]: ====> * remove_upgrade_boot_entry
Sep 05 04:15:59 localhost upgrade[1585]:         Remove boot entry for Leapp provided initramfs.
Sep 05 04:15:59 localhost upgrade[2048]: Process Process-192:
Sep 05 04:15:59 localhost upgrade[2048]: Traceback (most recent call last):
Sep 05 04:15:59 localhost upgrade[2048]:   File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
Sep 05 04:15:59 localhost upgrade[2048]:     self.run()
Sep 05 04:15:59 localhost upgrade[2048]:   File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
Sep 05 04:15:59 localhost upgrade[2048]:     self._target(*self._args, **self._kwargs)
Sep 05 04:15:59 localhost upgrade[2048]:   File "/usr/lib/python2.7/site-packages/leapp/repository/actor_definition.py", line 72, in _do_run
Sep 05 04:15:59 localhost upgrade[2048]:     actor_instance.run(*args, **kwargs)
Sep 05 04:15:59 localhost upgrade[2048]:   File "/usr/lib/python2.7/site-packages/leapp/actors/__init__.py", line 290, in run
Sep 05 04:15:59 localhost upgrade[2048]:     self.process(*args)
Sep 05 04:15:59 localhost upgrade[2048]:   File "/usr/share/leapp-repository/repositories/system_upgrade/common/actors/removeupgradebootentry/actor.py", line 20, in process
Sep 05 04:15:59 localhost upgrade[2048]:     remove_boot_entry()
Sep 05 04:15:59 localhost upgrade[2048]:   File "/usr/share/leapp-repository/repositories/system_upgrade/common/actors/removeupgradebootentry/libraries/removeupgradebootentry.py", line 41, in remove_boot_entry
Sep 05 04:15:59 localhost upgrade[2048]:     '/bin/mount', '-a'
Sep 05 04:15:59 localhost upgrade[2048]:   File "/usr/lib/python2.7/site-packages/leapp/libraries/stdlib/__init__.py", line 188, in run
Sep 05 04:15:59 localhost upgrade[2048]:     result=result
Sep 05 04:15:59 localhost upgrade[2048]: CalledProcessError: Command ['/bin/mount', '-a'] failed with exit code 32.
Sep 05 04:15:59 localhost upgrade[1585]: ==========================================================================================================
Sep 05 04:15:59 localhost upgrade[1585]: Actor remove_upgrade_boot_entry unexpectedly terminated with exit code: 1 - Please check the above details
Sep 05 04:15:59 localhost upgrade[1585]: ==========================================================================================================
Sep 05 04:15:59 localhost upgrade[1585]: Debug output written to /var/log/leapp/leapp-upgrade.log
Sep 05 04:15:59 localhost upgrade[1585]: ============================================================
Sep 05 04:15:59 localhost upgrade[1585]:                            REPORT
Sep 05 04:15:59 localhost upgrade[1585]: ============================================================
Sep 05 04:15:59 localhost upgrade[1585]: A report has been generated at /var/log/leapp/leapp-report.json
Sep 05 04:15:59 localhost upgrade[1585]: A report has been generated at /var/log/leapp/leapp-report.txt
Sep 05 04:15:59 localhost upgrade[1585]: ============================================================
Sep 05 04:15:59 localhost upgrade[1585]:                        END OF REPORT
Sep 05 04:15:59 localhost upgrade[1585]: ============================================================
Sep 05 04:15:59 localhost upgrade[1585]: Answerfile has been generated at /var/log/leapp/answerfile
Sep 05 04:15:59 localhost kernel: XFS (dm-1): Unmounting Filesystem
Sep 05 04:15:59 localhost upgrade[1567]: Container sysroot failed with error code 1.

All the physical drives, volume groups and logical volumes report OK after a normal boot.
Even the "missing" PV qGbnBb-..... is present after a normal boot. Could you please help me overcome this issue, as it is critical for my client?
If you need any further log files or the output of any command, I will be happy to provide it.

[Amendment] The result of pvs -o +uuid

  /dev/md1   vg01 lvm2 a--  <438.52g <348.52g o2821Q-5Re1-UiCZ-V2LJ-f7Qg-hwpX-CqD1ib
  /dev/sdc1  rhel lvm2 a--  <447.13g       0  Iq1PHS-zOk5-2uCw-Ga9A-uo4F-DjkQ-WqpU5S
  /dev/sdd1  rhel lvm2 a--  <447.13g       0  5co7Pi-YiaO-wPyH-Fd3N-rDyC-UPjc-1FJNYx
  /dev/sde1  rhel lvm2 a--  <447.13g       0  46oNJ3-sf3n-5Lqc-6ZZv-dN3U-guzA-5s7ZRa
  /dev/sdf1  rhel lvm2 a--  <447.13g       0  U2iWT4-c7lP-7Zp8-ZNTJ-EtF3-mcyQ-nDgU6A
  /dev/sdg1  rhel lvm2 a--  <447.13g       0  R9vj56-XNvi-xEUu-DXLs-fPU3-MnIC-gtbZXY
  /dev/sdh1  rhel lvm2 a--  <447.13g       0  WCcfjx-Ffzp-OwLz-BHGX-HIpo-qtFC-HuMPVp
  /dev/sdi1  rhel lvm2 a--   <59.47g       0  qGbnBb-LLqH-seg7-NEX3-NPTD-veJk-KLA3SK
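
One way to confirm that the UUID reported as missing in the upgrade log is the same PV that shows up after a normal boot is to compare a saved copy of the log against a saved copy of the pvs output. The following is only a sketch: the function names missing_pv_uuids and uuid_known are made up for illustration, and both work on saved text files rather than on live LVM state.

```shell
#!/bin/sh
# Hypothetical helpers; they only read saved text files.

# Print each PV UUID that the upgrade log reports as missing (one per line).
# $1: path to a saved copy of leapp-upgrade.log
missing_pv_uuids() {
    sed -n "s/.*Couldn't find device with uuid \([A-Za-z0-9-]*\)\..*/\1/p" "$1" | sort -u
}

# Succeed if UUID $1 appears in the saved `pvs -o +uuid` output at path $2.
uuid_known() {
    grep -q -- "$1" "$2"
}

# Example usage (file names are assumptions):
#   for u in $(missing_pv_uuids leapp-upgrade.log); do
#       uuid_known "$u" pvs.txt && echo "$u is present after a normal boot"
#   done
```

If the UUID is found in the pvs output, the PV itself is intact and the problem is more likely that the device was not yet visible when the upgrade initramfs activated the volume groups.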

[Amendment 2]
The results of vgs -o +devices

  VG   #PV #LV #SN Attr   VSize    VFree    Devices
  rhel   7   2   0 wz--n-   <2.68t       0  /dev/sdi1(2424)
  rhel   7   2   0 wz--n-   <2.68t       0  /dev/sdc1(0)
  rhel   7   2   0 wz--n-   <2.68t       0  /dev/sdd1(0)
  rhel   7   2   0 wz--n-   <2.68t       0  /dev/sde1(0)
  rhel   7   2   0 wz--n-   <2.68t       0  /dev/sdf1(0)
  rhel   7   2   0 wz--n-   <2.68t       0  /dev/sdg1(0)
  rhel   7   2   0 wz--n-   <2.68t       0  /dev/sdh1(0)
  rhel   7   2   0 wz--n-   <2.68t       0  /dev/sdi1(0)
  vg01   1   2   0 wz--n- <438.52g <348.52g /dev/md1(0)
  vg01   1   2   0 wz--n- <438.52g <348.52g /dev/md1(12800)

[Amendment 3]
The results of lvs --all -o +devices

  LV   VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
  data rhel -wi-ao---- <2.63t                                                     /dev/sdc1(0)
  data rhel -wi-ao---- <2.63t                                                     /dev/sdd1(0)
  data rhel -wi-ao---- <2.63t                                                     /dev/sde1(0)
  data rhel -wi-ao---- <2.63t                                                     /dev/sdf1(0)
  data rhel -wi-ao---- <2.63t                                                     /dev/sdg1(0)
  data rhel -wi-ao---- <2.63t                                                     /dev/sdh1(0)
  data rhel -wi-ao---- <2.63t                                                     /dev/sdi1(0)
  root rhel -wi-a----- 50.00g                                                     /dev/sdi1(2424)
  root vg01 -wi-ao---- 50.00g                                                     /dev/md1(0)
  var  vg01 -wi-ao---- 40.00g                                                     /dev/md1(12800)
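
To see at a glance which physical devices back a given logical volume, the lvs output above can be reduced to a device list. This is a minimal sketch that parses a saved copy of the `lvs --all -o +devices` output; the function name lv_devices and the file name in the usage comment are assumptions.

```shell
#!/bin/sh
# Hypothetical helper; it only reads a saved text file.

# Print the devices backing logical volume $2 in volume group $1.
# $3: path to a saved copy of `lvs --all -o +devices` output
lv_devices() {
    awk -v vg="$1" -v lv="$2" '
        $1 == lv && $2 == vg {
            d = $NF              # last column, e.g. /dev/sdi1(2424)
            sub(/\(.*/, "", d)   # drop the "(extent)" suffix
            print d
        }' "$3" | sort -u
}

# Example usage (file name is an assumption):
#   lv_devices rhel root lvs.txt
```

Run against the output above, rhel/root resolves to /dev/sdi1, the same device that the upgrade log reports as the missing PV.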

[Amendment 4]
The results of the command lvscan

  ACTIVE            '/dev/rhel/root' [50.00 GiB] inherit
  ACTIVE            '/dev/rhel/data' [<2.63 TiB] inherit
  ACTIVE            '/dev/vg01/root' [50.00 GiB] inherit
  ACTIVE            '/dev/vg01/var' [40.00 GiB] inherit
Chris Hatzis

2 Answers


Some mount points are missing; check why.

Sep 05 04:15:52 localhost upgrade[1570]: mount: special device /dev/mapper/rhel-data does not exist

And

Sep 05 04:15:59 localhost upgrade[2048]: CalledProcessError: Command ['/bin/mount', '-a'] failed with exit code 32.
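
Exit code 32 from mount means a mount failure. To narrow down which fstab entry is responsible, something like the following could list the entries whose source device does not exist at that moment. It is a rough sketch: the function name fstab_missing_devices is made up, and it only looks at entries written as /dev/... paths, so UUID= and LABEL= entries are skipped.

```shell
#!/bin/sh
# Hypothetical helper; it reads fstab and tests whether device paths exist.

# List fstab entries whose source is a /dev/... path that does not exist.
# $1: path to an fstab file (defaults to /etc/fstab)
fstab_missing_devices() {
    awk '!/^[[:space:]]*(#|$)/ && $1 ~ /^\/dev\// { print $1, $2 }' "${1:-/etc/fstab}" |
    while read -r dev mnt; do
        [ -e "$dev" ] || printf '%s (for %s)\n' "$dev" "$mnt"
    done
}

# Example usage:
#   fstab_missing_devices /etc/fstab
```

In the situation from the log, an entry for /dev/mapper/rhel-data would be expected to show up here, since that device did not exist when the upgrade ran mount -a.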
asktyagi
  • The result of ```mount | grep mapper```: ```/dev/mapper/vg01-root on / type xfs (rw,relatime,attr2,inode64,noquota) /dev/mapper/vg01-var on /var type xfs (rw,relatime,attr2,inode64,noquota) /dev/mapper/rhel-data on /data type ext4 (rw,relatime,data=ordered)``` To me, they are all there. – Chris Hatzis Sep 06 '22 at 04:20
  • Try `mount -a` in single-user mode to validate after the upgrade. Also share whether PV `qGbnBb-LLqH-seg7-NEX3-NPTD-veJk-KLA3SK` actually exists and is usable. Please add these details to your question. – asktyagi Sep 06 '22 at 04:23
  • I cannot run ```mount -a``` at this stage, as this is a production server and I need to do it after hours. I just posted the output of ```pvs -o +uuid```, which shows that the "missing" PV is available. – Chris Hatzis Sep 06 '22 at 04:39
  • Can you add the output of `vgs --options +devices` and `lvs --all --options +devices` to the question? – asktyagi Sep 06 '22 at 04:44

After a lot of googling and testing, I found that I was hitting the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=1927688 The workaround got me past the issue, but it led to a new one: a faulty GRUB 2 configuration. Here is what I did to resolve both.
After running the upgrade command leapp upgrade --no-rhsm --enablerepo local1 --enablerepo local2, I rebooted as requested to complete the upgrade process.
A) Upon rebooting and entering the GRUB 2 menu, I pressed e while the RHEL 8 Upgrade Initramfs entry was selected.
B) At the end of the kernel command line (CMDLINUX), I added the rd.break=upgrade option, then saved and continued booting.
C) After entering single-user mode, I executed the commands described in the link above:

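# Temporarily switch LVM from read-only locking (4) to local file-based locking (1)
sed -i 's/locking_type = 4/locking_type = 1/' /etc/lvm/lvm.conf
# Activate all volume groups with writable locking
lvm vgchange -ay --config ' global {locking_type=1} '
# Update the metadata of the rhel volume group
lvm vgck --updatemetadata rhel
# Restore read-only locking before continuing the upgrade
sed -i 's/locking_type = 1/locking_type = 4/' /etc/lvm/lvm.conf
exit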

D) The upgrade initramfs then booted as expected and the upgrade completed successfully.
However, on the next reboot I landed in GRUB rescue mode. To resolve this, I booted from the RHEL 8.6 media that I had for performing the offline upgrade.
E) After booting from that media, I chose to repair an installed OS and followed the prompts to get shell access. Doing so automatically selected where my GRUB was installed (in my case /dev/md), and from there I repaired GRUB with:

grub2-mkconfig -o /boot/grub2/grub.cfg
grub2-install /dev/md
reboot

And there it was: my brand-new RHEL 8.6 boot option. I got into the system as usual.
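
A note on step C: the two sed commands only match the exact spelling "locking_type = 4" (or "= 1"), so it can be worth checking what lvm.conf actually contains before and after editing it. The following is a small sketch for that; the function name lvm_locking_type is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper; it only reads the configuration file.

# Print the active (uncommented) locking_type value from an lvm.conf file.
# $1: path to lvm.conf (defaults to /etc/lvm/lvm.conf)
lvm_locking_type() {
    sed -n 's/^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' \
        "${1:-/etc/lvm/lvm.conf}"
}

# Example usage inside the rd.break=upgrade shell:
#   lvm_locking_type    # check before the first sed
#   lvm_locking_type    # check again after the last sed
```

If the function prints nothing, the file has no uncommented locking_type line and the sed commands would not change anything.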

Chris Hatzis