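I am trying to upgrade several physical servers offline from RHEL 7.9 to RHEL 8.6. So far the process has completed successfully on 1 server. On the server where I have the problem, everything appears to go fine, but when I reboot into the upgrade, the server drops into emergency mode. After rebooting from emergency mode, I can get back to the server's current OS (RHEL 7.9).
Running:
leapp preupgrade --no-rhsm --enablerepo local1 --enablerepo local2
returns with no errors.
After that I run:
leapp upgrade --no-rhsm --enablerepo local1 --enablerepo local2
and it completes without any problems. I then reboot and select the "RHEL 8 Upgrade Initramfs" entry to carry out the upgrade, but it fails.
Here is the last part of leapp-upgrade.log: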
Sep 05 04:15:52 localhost systemd[1]: Reached target System Upgrade.
Sep 05 04:15:52 localhost systemd[1]: Starting System Upgrade...
Sep 05 04:15:52 localhost upgrade[1543]: starting upgrade hook
Sep 05 04:15:52 localhost upgrade[1543]: /bin/upgrade: line 19: /sysroot/var/tmp/system-upgrade.state: No such file or directory
Sep 05 04:15:52 localhost upgrade[1546]: WARNING: locking_type (4) is deprecated, using --sysinit --readonly.
Sep 05 04:15:52 localhost upgrade[1546]: Allowing activation with --readonly --sysinit.
Sep 05 04:15:52 localhost upgrade[1546]: WARNING: Couldn't find device with uuid qGbnBb-LLqH-seg7-NEX3-NPTD-veJk-KLA3SK.
Sep 05 04:15:52 localhost upgrade[1546]: WARNING: VG rhel is missing PV qGbnBb-LLqH-seg7-NEX3-NPTD-veJk-KLA3SK (last written to /dev/sdi1).
Sep 05 04:15:52 localhost upgrade[1546]: Refusing activation of partial LV rhel/root. Use '--activationmode partial' to override.
Sep 05 04:15:52 localhost upgrade[1546]: Refusing activation of partial LV rhel/data. Use '--activationmode partial' to override.
Sep 05 04:15:52 localhost upgrade[1546]: 0 logical volume(s) in volume group "rhel" now active
Sep 05 04:15:52 localhost upgrade[1546]: Allowing activation with --readonly --sysinit.
Sep 05 04:15:52 localhost upgrade[1546]: 2 logical volume(s) in volume group "vg01" now active
Sep 05 04:15:52 localhost upgrade[1567]: Spawning container sysroot on /sysroot.
Sep 05 04:15:52 localhost upgrade[1567]: Press ^] three times within 1s to kill container.
Sep 05 04:15:52 localhost kernel: EXT4-fs (md0): mounting ext2 file system using the ext4 subsystem
Sep 05 04:15:52 localhost kernel: EXT4-fs (md0): mounted filesystem without journal. Opts: (null)
Sep 05 04:15:52 localhost kernel: XFS (dm-1): Mounting V5 Filesystem
Sep 05 04:15:52 localhost kernel: XFS (dm-1): Ending clean mount
Sep 05 04:15:52 localhost upgrade[1570]: mount: special device /dev/mapper/rhel-data does not exist
Sep 05 04:15:52 localhost kernel: scsi 11:0:0:0: Direct-Access CiscoVD Hypervisor PQ: 0 ANSI: 6
Sep 05 04:15:52 localhost kernel: sd 11:0:0:0: Attached scsi generic sg8 type 0
Sep 05 04:15:52 localhost kernel: sd 11:0:0:0: [sdi] 124727295 512-byte logical blocks: (63.9 GB/59.5 GiB)
Sep 05 04:15:52 localhost kernel: sd 11:0:0:0: [sdi] Write Protect is off
Sep 05 04:15:52 localhost kernel: sd 11:0:0:0: [sdi] Mode Sense: 17 00 00 00
Sep 05 04:15:52 localhost kernel: sd 11:0:0:0: [sdi] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Sep 05 04:15:52 localhost kernel: sdi: sdi1
Sep 05 04:15:52 localhost kernel: sd 11:0:0:0: [sdi] Attached SCSI removable disk
Sep 05 04:15:59 localhost upgrade[1585]: ==> Processing phase `InitRamStart`
Sep 05 04:15:59 localhost upgrade[1585]: ====> * remove_upgrade_boot_entry
Sep 05 04:15:59 localhost upgrade[1585]: Remove boot entry for Leapp provided initramfs.
Sep 05 04:15:59 localhost upgrade[2048]: Process Process-192:
Sep 05 04:15:59 localhost upgrade[2048]: Traceback (most recent call last):
Sep 05 04:15:59 localhost upgrade[2048]: File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
Sep 05 04:15:59 localhost upgrade[2048]: self.run()
Sep 05 04:15:59 localhost upgrade[2048]: File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
Sep 05 04:15:59 localhost upgrade[2048]: self._target(*self._args, **self._kwargs)
Sep 05 04:15:59 localhost upgrade[2048]: File "/usr/lib/python2.7/site-packages/leapp/repository/actor_definition.py", line 72, in _do_run
Sep 05 04:15:59 localhost upgrade[2048]: actor_instance.run(*args, **kwargs)
Sep 05 04:15:59 localhost upgrade[2048]: File "/usr/lib/python2.7/site-packages/leapp/actors/__init__.py", line 290, in run
Sep 05 04:15:59 localhost upgrade[2048]: self.process(*args)
Sep 05 04:15:59 localhost upgrade[2048]: File "/usr/share/leapp-repository/repositories/system_upgrade/common/actors/removeupgradebootentry/actor.py", line 20, in process
Sep 05 04:15:59 localhost upgrade[2048]: remove_boot_entry()
Sep 05 04:15:59 localhost upgrade[2048]: File "/usr/share/leapp-repository/repositories/system_upgrade/common/actors/removeupgradebootentry/libraries/removeupgradebootentry.py", line 41, in remove_boot_entry
Sep 05 04:15:59 localhost upgrade[2048]: '/bin/mount', '-a'
Sep 05 04:15:59 localhost upgrade[2048]: File "/usr/lib/python2.7/site-packages/leapp/libraries/stdlib/__init__.py", line 188, in run
Sep 05 04:15:59 localhost upgrade[2048]: result=result
Sep 05 04:15:59 localhost upgrade[2048]: CalledProcessError: Command ['/bin/mount', '-a'] failed with exit code 32.
Sep 05 04:15:59 localhost upgrade[1585]: ==========================================================================================================
Sep 05 04:15:59 localhost upgrade[1585]: Actor remove_upgrade_boot_entry unexpectedly terminated with exit code: 1 - Please check the above details
Sep 05 04:15:59 localhost upgrade[1585]: ==========================================================================================================
Sep 05 04:15:59 localhost upgrade[1585]: Debug output written to /var/log/leapp/leapp-upgrade.log
Sep 05 04:15:59 localhost upgrade[1585]: ============================================================
Sep 05 04:15:59 localhost upgrade[1585]: REPORT
Sep 05 04:15:59 localhost upgrade[1585]: ============================================================
Sep 05 04:15:59 localhost upgrade[1585]: A report has been generated at /var/log/leapp/leapp-report.json
Sep 05 04:15:59 localhost upgrade[1585]: A report has been generated at /var/log/leapp/leapp-report.txt
Sep 05 04:15:59 localhost upgrade[1585]: ============================================================
Sep 05 04:15:59 localhost upgrade[1585]: END OF REPORT
Sep 05 04:15:59 localhost upgrade[1585]: ============================================================
Sep 05 04:15:59 localhost upgrade[1585]: Answerfile has been generated at /var/log/leapp/answerfile
Sep 05 04:15:59 localhost kernel: XFS (dm-1): Unmounting Filesystem
Sep 05 04:15:59 localhost upgrade[1567]: Container sysroot failed with error code 1.
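After a normal boot, all physical drives, all volume groups, and all logical volumes report as healthy.
Even the PV that is reported missing (qGbnBb-.....) is present after a normal boot. Could you help me solve this problem? It is very critical for my customer.
If you need any further log files or the output of any command, I will be happy to provide them.
[Edit] Output of pvs -o +uuid: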
/dev/md1 vg01 lvm2 a-- <438.52g <348.52g o2821Q-5Re1-UiCZ-V2LJ-f7Qg-hwpX-CqD1ib
/dev/sdc1 rhel lvm2 a-- <447.13g 0 Iq1PHS-zOk5-2uCw-Ga9A-uo4F-DjkQ-WqpU5S
/dev/sdd1 rhel lvm2 a-- <447.13g 0 5co7Pi-YiaO-wPyH-Fd3N-rDyC-UPjc-1FJNYx
/dev/sde1 rhel lvm2 a-- <447.13g 0 46oNJ3-sf3n-5Lqc-6ZZv-dN3U-guzA-5s7ZRa
/dev/sdf1 rhel lvm2 a-- <447.13g 0 U2iWT4-c7lP-7Zp8-ZNTJ-EtF3-mcyQ-nDgU6A
/dev/sdg1 rhel lvm2 a-- <447.13g 0 R9vj56-XNvi-xEUu-DXLs-fPU3-MnIC-gtbZXY
/dev/sdh1 rhel lvm2 a-- <447.13g 0 WCcfjx-Ffzp-OwLz-BHGX-HIpo-qtFC-HuMPVp
/dev/sdi1 rhel lvm2 a-- <59.47g 0 qGbnBb-LLqH-seg7-NEX3-NPTD-veJk-KLA3SK
[Edit 2] Output of vgs -o +devices:
VG #PV #LV #SN Attr VSize VFree Devices
rhel 7 2 0 wz--n- <2.68t 0 /dev/sdi1(2424)
rhel 7 2 0 wz--n- <2.68t 0 /dev/sdc1(0)
rhel 7 2 0 wz--n- <2.68t 0 /dev/sdd1(0)
rhel 7 2 0 wz--n- <2.68t 0 /dev/sde1(0)
rhel 7 2 0 wz--n- <2.68t 0 /dev/sdf1(0)
rhel 7 2 0 wz--n- <2.68t 0 /dev/sdg1(0)
rhel 7 2 0 wz--n- <2.68t 0 /dev/sdh1(0)
rhel 7 2 0 wz--n- <2.68t 0 /dev/sdi1(0)
vg01 1 2 0 wz--n- <438.52g <348.52g /dev/md1(0)
vg01 1 2 0 wz--n- <438.52g <348.52g /dev/md1(12800)
[Edit 3] Output of lvs --all -o +devices:
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Devices
data rhel -wi-ao---- <2.63t /dev/sdc1(0)
data rhel -wi-ao---- <2.63t /dev/sdd1(0)
data rhel -wi-ao---- <2.63t /dev/sde1(0)
data rhel -wi-ao---- <2.63t /dev/sdf1(0)
data rhel -wi-ao---- <2.63t /dev/sdg1(0)
data rhel -wi-ao---- <2.63t /dev/sdh1(0)
data rhel -wi-ao---- <2.63t /dev/sdi1(0)
root rhel -wi-a----- 50.00g /dev/sdi1(2424)
root vg01 -wi-ao---- 50.00g /dev/md1(0)
var vg01 -wi-ao---- 40.00g /dev/md1(12800)
[Edit 4] Output of lvscan:
ACTIVE '/dev/rhel/root' [50.00 GiB] inherit
ACTIVE '/dev/rhel/data' [<2.63 TiB] inherit
ACTIVE '/dev/vg01/root' [50.00 GiB] inherit
ACTIVE '/dev/vg01/var' [40.00 GiB] inherit
Answer 1
Some mount points are missing in the upgrade environment; find out why. The relevant log lines are:
Sep 05 04:15:52 localhost upgrade[1570]: mount: special device /dev/mapper/rhel-data does not exist
and
Sep 05 04:15:59 localhost upgrade[2048]: CalledProcessError: Command ['/bin/mount', '-a'] failed with exit code 32.
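One way to narrow down which entry makes mount -a fail is to list the fstab entries whose device node does not exist. The following is only a minimal sketch: the function name missing_fstab_devices is made up for illustration, and it only checks entries written as /dev/... paths (UUID= and LABEL= entries are skipped).

```shell
# Hypothetical helper (not part of leapp or LVM): print every fstab entry
# whose /dev/... device path does not currently exist.
missing_fstab_devices() {
    fstab="${1:-/etc/fstab}"
    # Skip comment lines; keep only entries whose first field is a /dev/ path.
    awk '$1 !~ /^#/ && $1 ~ /^\/dev\// { print $1, $2 }' "$fstab" |
    while read -r dev mnt; do
        # Report the entry if the device node is absent.
        [ -e "$dev" ] || echo "missing: $dev (mount point $mnt)"
    done
}
```

Run as missing_fstab_devices with no argument to check /etc/fstab. Given the log above, /dev/mapper/rhel-data would presumably show up if this were run inside the upgrade environment, since the rhel volume group was not activated there.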
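Answer 2
After a lot of googling and testing, I found that I was hitting the following bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1927688
The workaround described there fixed the problem, but it led to a new one: a broken GRUB 2 configuration. Here is how I got through both.
After running the upgrade command, leapp upgrade --no-rhsm --enablerepo local1 --enablerepo local2, I rebooted as instructed to complete the upgrade process.
A) After rebooting, in the GRUB 2 menu, I highlighted the RHEL 8 Upgrade Initramfs entry and pressed e.
B) On the linux command line, I appended the rd.break=upgrade option, then saved and continued booting.
C) Once I was dropped into the shell, I ran the commands given in the bug report linked above: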
sed -i 's/locking_type = 4/locking_type = 1/' /etc/lvm/lvm.conf
lvm vgchange -ay --config ' global {locking_type=1} '
lvm vgck --updatemetadata rhel
sed -i 's/locking_type = 1/locking_type = 4/' /etc/lvm/lvm.conf
exit
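D) The system then booted the upgrade initramfs as expected, and the upgrade completed successfully.
However, on the next reboot I landed in GRUB rescue mode. To fix this, I booted from the RHEL 8.6 media that I had used for the offline upgrade.
E) After booting from that media, I chose the option to repair the installed operating system and followed the prompts to get proper shell access. Doing so automatically detected where my GRUB was installed (/dev/md in my case), and I then repaired GRUB with: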
grub2-mkconfig -o /boot/grub2/grub.cfg
grub2-install /dev/md
reboot
After that I had my brand-new RHEL 8.6 boot entry, and I booted into the system as usual.
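For step C, it may help to confirm what locking_type is set to before and after the sed edits, so that the file really ends up back at its original value. This is only a sketch under assumptions: the function name lvm_locking_type is hypothetical, and it assumes the setting appears as an uncommented "locking_type = N" line, which is the form the sed commands above rely on.

```shell
# Hypothetical helper: print the value of each uncommented locking_type
# assignment found in an lvm.conf file (default: /etc/lvm/lvm.conf).
lvm_locking_type() {
    conf="${1:-/etc/lvm/lvm.conf}"
    # Lines starting with '#' do not match, because the pattern only allows
    # whitespace before the key.
    sed -n 's/^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf"
}
```

Calling lvm_locking_type before the first sed and again after the last one should print the same value both times if the revert worked.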