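As part of moving to the latest Ubuntu LTS, we recently upgraded our KVM server from Ubuntu 12.04 to 14.04, but now its VMs won't boot. I'm worried that continuing the upgrade while the system is in a broken state could make things worse.
The upgrade itself went smoothly, apart from some issues with LVM snapshots (which I have resolved).
When a VM crashes, nothing noteworthy shows up in the host logs.
I tried booting a bare VM with no NIC or hard disk, just a live Ubuntu ISO, but that doesn't work either. Windows VMs fail to boot as well.
The only suspicious thing in the host logs is the following in dmesg: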
[ 86.809494] cgroup: systemd-logind (1168) created nested cgroup for controller "memory" which has incomplete hierarchy support. Nested cgroups may change behavior in the future.
[ 86.809499] cgroup: "memory" requires setting use_hierarchy to 1 on the root.
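I managed to capture the Ubuntu guest's output over the serial interface, up to the point where it crashes. This is the output leading up to the kernel panic: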
[ 1.323260] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 1.324122] serio: i8042 AUX port at 0x60,0x64 irq 12
[ 1.325102] mousedev: PS/2 mouse device common for all mice
[ 1.326357] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input1
[ 1.328120] rtc_cmos 00:00: RTC can wake from S4
[ 1.329176] rtc_cmos 00:00: rtc core: registered rtc_cmos as rtc0
[ 1.330329] rtc_cmos 00:00: alarms up to one day, 114 bytes nvram
[ 1.331360] i2c /dev entries driver
[ 1.331991] device-mapper: uevent: version 1.0.3
[ 1.332854] device-mapper: ioctl: 4.34.0-ioctl (2015-10-28) initialised: dm-devel@redhat.com
[ 1.334288] ledtrig-cpu: registered to indicate activity on CPUs
[ 1.335634] NET: Registered protocol family 10
[ 1.336650] NET: Registered protocol family 17
[ 1.337440] Key type dns_resolver registered
[ 1.338291] microcode: AMD CPU family 0xf not supported
[ 1.339288] registered taskstats version 1
[ 1.339970] Loading compiled-in X.509 certificates
[ 1.342042] Loaded X.509 cert 'Build time autogenerated kernel key: a4cff54e0cc9179d8dcdcbb195a5dba42f5df569'
[ 1.343736] zswap: loaded using pool lzo/zbud
[ 1.344712] PANIC: double fault, error_code: 0x0
[ 1.345495] CPU: 0 PID: 67 Comm: modprobe Not tainted 4.4.0-70-generic #91-Ubuntu
[ 1.346758] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 1.348018] task: ffff880000a30cc0 ti: ffff88001e814000 task.ti: ffff88001e814000
[ 1.348468] RIP: 0010:[<0000000000000000>] [< (null)>] (null)
[ 1.348468] RSP: 0018:00007ffe29ea5018 EFLAGS: 00010246
[ 1.348468] RAX: 000000000000000c RBX: 0000000000000001 RCX: 00007f14331fa2e9
[ 1.348468] RDX: 00007ffe29ea5280 RSI: 000055a0b7bd1310 RDI: 0000000000000000
[ 1.348468] RBP: 000055a0b7bce040 R08: 0000000000000001 R09: 00007ffe29ea52a9
[ 1.348468] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000009
[ 1.348468] R13: 00007f14331e19a0 R14: 0000000000000001 R15: 0000000000001000
[ 1.348468] FS: 0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
[ 1.348468] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1.348468] CR2: 00007ffe29ea4ff8 CR3: 000000001efdf000 CR4: 00000000000006f0
[ 1.348468] Stack:
[ 1.348468] 00007f14331f9305 00007ffe29ea5299 00007ffe29eae000 0001010000000000
[ 1.348468] 0001037f00000064 0000000000000000 00000000078bfbfd 0000000000000000
[ 1.348468] 000055a0b7bd1310 0000000000000000 0000000000000001 00007ffe29ea5100
[ 1.348468] Call Trace:
[ 1.348468] <UNK>
[ 1.348468] Code: Bad RIP value.
[ 1.348468] Kernel panic - not syncing: Machine halted.
[ 1.348468] CPU: 0 PID: 67 Comm: modprobe Not tainted 4.4.0-70-generic #91-Ubuntu
[ 1.348468] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 1.348468] 0000000000000086 00000000e094cae7 ffff88001fc04e80 ffffffff813f82b3
[ 1.348468] ffffffff81cb7ebc ffff88001fc04f18 ffff88001fc04f08 ffffffff8118d367
[ 1.348468] 0000000000000008 ffff88001fc04f18 ffff88001fc04eb0 00000000e094cae7
[ 1.348468] Call Trace:
[ 1.348468] <#DF> [<ffffffff813f82b3>] dump_stack+0x63/0x90
[ 1.348468] [<ffffffff8118d367>] panic+0xd3/0x215
[ 1.348468] [<ffffffff81060d1d>] df_debug+0x2d/0x30
[ 1.348468] [<ffffffff8102fb8c>] do_double_fault+0x7c/0xf0
[ 1.348468] [<ffffffff8183e1f8>] double_fault+0x28/0x30
[ 1.348468] <<EOE>> <UNK>
[ 1.348468] Kernel Offset: disabled
[ 1.348468] ---[ end Kernel panic - not syncing: Machine halted.
(This is not the full log, because StackExchange flagged it as spam.)
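To compare captures from several guests without reading each trace by hand, here is a minimal sketch that pulls the first panic line, the faulting process and the guest kernel version out of a serial console log. The function name `summarize_panic` and the regular expressions are my own assumptions, written only against the log format shown above; other kernels may format these lines differently.

```python
import re


def summarize_panic(log_text):
    """Return the first panic line, faulting process and kernel version
    found in a console log (hypothetical helper, not part of any tool)."""
    summary = {"panic": None, "comm": None, "kernel": None}
    for line in log_text.splitlines():
        # Strip the leading "[ 1.344712] " timestamp, if present.
        body = re.sub(r"^\[\s*\d+\.\d+\]\s*", "", line.strip())
        # Remember only the first line that announces a panic.
        if summary["panic"] is None and re.match(r"(PANIC|Kernel panic)", body):
            summary["panic"] = body
        # The "CPU: ... Comm: <process> ... <kernel version> #<build>" line.
        match = re.search(r"Comm: (\S+) .*? (\d+\.\d+\.\d+-\S+) #", body)
        if match and summary["comm"] is None:
            summary["comm"] = match.group(1)
            summary["kernel"] = match.group(2)
    return summary


sample = """\
[ 1.343736] zswap: loaded using pool lzo/zbud
[ 1.344712] PANIC: double fault, error_code: 0x0
[ 1.345495] CPU: 0 PID: 67 Comm: modprobe Not tainted 4.4.0-70-generic #91-Ubuntu
"""

print(summarize_panic(sample))
```

In practice `sample` would be replaced by the contents of whatever file the serial output was saved to.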
Package versions:
- Kernel: 3.13.0-169-generic #219-Ubuntu
- qemu-kvm:2.0.0+dfsg-2ubuntu1.45
- libvirt-bin:1.2.2-0ubuntu13.1.27
Command executed by libvirt:
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice \
  -name borkum -S \
  -machine pc-i440fx-trusty,accel=kvm,usb=off \
  -cpu Opteron_G3 -m 512 -realtime mlock=off \
  -smp 1,sockets=1,cores=1,threads=1 \
  -uuid acee95a1-b2e2-470c-85d7-89d85aceddc4 \
  -no-user-config -nodefaults \
  -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/borkum.monitor,server,nowait \
  -mon chardev=charmonitor,id=monitor,mode=control \
  -rtc base=utc,driftfix=slew \
  -global kvm-pit.lost_tick_policy=discard \
  -no-hpet -no-shutdown -boot strict=on \
  -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5 \
  -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,addr=0x6 \
  -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x7 \
  -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x8 \
  -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x9 \
  -drive file=/dev/images/borkum,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native \
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0xa,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
  -drive if=none,id=drive-ide0-0-0,readonly=on,format=raw \
  -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
  -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 \
  -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:aa:55:a8,bus=pci.0,addr=0x3 \
  -chardev pty,id=charserial0 \
  -device isa-serial,chardev=charserial0,id=serial0 \
  -chardev spicevmc,id=charchannel0,name=vdagent \
  -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 \
  -vnc 127.0.0.1:0 \
  -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 \
  -device intel-hda,id=sound0,bus=pci.0,addr=0x4 \
  -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xb
VM configuration file:
<domain type='kvm'>
<name>borkum</name>
<uuid>acee95a1-b2e2-470c-85d7-89d85aceddc4</uuid>
<memory unit='KiB'>524288</memory>
<currentMemory unit='KiB'>524288</currentMemory>
<vcpu placement='static'>1</vcpu>
<os>
<type arch='x86_64' machine='pc-i440fx-trusty'>hvm</type>
<boot dev='hd'/>
</os>
<features>
<acpi/>
<apic/>
</features>
<cpu mode='custom' match='exact'>
<model fallback='allow'>Opteron_G3</model>
</cpu>
<clock offset='utc'>
<timer name='rtc' tickpolicy='catchup'/>
<timer name='pit' tickpolicy='delay'/>
<timer name='hpet' present='no'/>
</clock>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>restart</on_crash>
<devices>
<emulator>/usr/bin/kvm-spice</emulator>
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none' io='native'/>
<source dev='/dev/images/borkum'/>
<target dev='vda' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
</disk>
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<target dev='hda' bus='ide'/>
<readonly/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<controller type='usb' index='0' model='ich9-ehci1'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</controller>
<controller type='usb' index='0' model='ich9-uhci1'>
<master startport='0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</controller>
<controller type='usb' index='0' model='ich9-uhci2'>
<master startport='2'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</controller>
<controller type='usb' index='0' model='ich9-uhci3'>
<master startport='4'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
</controller>
<controller type='ide' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
</controller>
<controller type='virtio-serial' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</controller>
<controller type='pci' index='0' model='pci-root'/>
<interface type='bridge'>
<mac address='52:54:00:aa:55:a8'/>
<source bridge='br0'/>
<model type='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
<serial type='pty'>
<target port='0'/>
</serial>
<console type='pty'>
<target type='serial' port='0'/>
</console>
<channel type='spicevmc'>
<target type='virtio' name='com.redhat.spice.0'/>
<address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>
<input type='mouse' bus='ps2'/>
<input type='keyboard' bus='ps2'/>
<graphics type='vnc' port='-1' autoport='yes'/>
<sound model='ich6'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</sound>
<video>
<model type='cirrus' vram='9216' heads='1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
<memballoon model='virtio'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
</memballoon>
</devices>
</domain>
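Since every guest on this host fails the same way, it may help to list the fields that tie each definition to the host's QEMU build (machine type, CPU model, emulator path) side by side. The following is a rough sketch using Python's standard `xml.etree.ElementTree`; the function name `domain_summary` is hypothetical, and the embedded XML is a trimmed copy of the definition above rather than a complete domain.

```python
import xml.etree.ElementTree as ET


def domain_summary(xml_text):
    """Extract a few host-dependent fields from a libvirt domain XML string
    (hypothetical helper for illustration)."""
    root = ET.fromstring(xml_text)
    os_type = root.find("./os/type")
    return {
        "name": root.findtext("name"),
        # The machine type is an attribute of <os><type>.
        "machine": os_type.get("machine") if os_type is not None else None,
        # findtext returns None when the element is missing.
        "cpu_model": root.findtext("./cpu/model"),
        "emulator": root.findtext("./devices/emulator"),
    }


# Trimmed copy of the definition shown above.
sample_xml = """
<domain type='kvm'>
  <name>borkum</name>
  <os>
    <type arch='x86_64' machine='pc-i440fx-trusty'>hvm</type>
  </os>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>Opteron_G3</model>
  </cpu>
  <devices>
    <emulator>/usr/bin/kvm-spice</emulator>
  </devices>
</domain>
"""

print(domain_summary(sample_xml))
```

The same function could be fed the output of `virsh dumpxml borkum` for each guest.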
In /etc/libvirt, libvirt.conf and qemu.conf contain no adjusted settings; everything is at its defaults. libvirtd.conf is stock, save for the following lines:
unix_sock_group = "libvirtd"
unix_sock_ro_perms = "0777"
unix_sock_rw_perms = "0770"
auth_unix_ro = "none"
auth_unix_rw = "none"
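So, how do I get these VMs booting again so that we can continue the upgrade?
Edit: I ended up installing a fresh Ubuntu 18.04 over the broken 14.04 installation, keeping the VM disks. That solved the problem, and the VMs now boot.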