KVM guests crash after upgrading Ubuntu to 14.04

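As part of moving to the latest Ubuntu LTS, we recently upgraded our KVM server from Ubuntu 12.04 to 14.04, and now its virtual machines no longer boot. I'm worried that continuing the upgrade while the system is in this broken state could make things worse.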

The upgrade itself went smoothly, apart from some problems with LVM snapshots (which I have since resolved).

When a VM crashes, nothing noteworthy is written to the host logs.

I tried starting a bare VM with no NIC or hard disk, booting only a live Ubuntu ISO, but that fails as well. Windows VMs do not boot either.

The only suspicious entries in the host logs are these lines from dmesg:

[   86.809494] cgroup: systemd-logind (1168) created nested cgroup for controller "memory" which has incomplete hierarchy support. Nested cgroups may change behavior in the future.
[   86.809499] cgroup: "memory" requires setting use_hierarchy to 1 on the root.

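I managed to capture the output of an Ubuntu guest over the serial interface up to the point where it crashes. This is the output leading up to the kernel panic: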

[    1.323260] serio: i8042 KBD port at 0x60,0x64 irq 1
[    1.324122] serio: i8042 AUX port at 0x60,0x64 irq 12
[    1.325102] mousedev: PS/2 mouse device common for all mice
[    1.326357] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input1
[    1.328120] rtc_cmos 00:00: RTC can wake from S4
[    1.329176] rtc_cmos 00:00: rtc core: registered rtc_cmos as rtc0
[    1.330329] rtc_cmos 00:00: alarms up to one day, 114 bytes nvram
[    1.331360] i2c /dev entries driver
[    1.331991] device-mapper: uevent: version 1.0.3
[    1.332854] device-mapper: ioctl: 4.34.0-ioctl (2015-10-28) initialised: dm-devel@redhat.com
[    1.334288] ledtrig-cpu: registered to indicate activity on CPUs
[    1.335634] NET: Registered protocol family 10
[    1.336650] NET: Registered protocol family 17
[    1.337440] Key type dns_resolver registered
[    1.338291] microcode: AMD CPU family 0xf not supported
[    1.339288] registered taskstats version 1
[    1.339970] Loading compiled-in X.509 certificates
[    1.342042] Loaded X.509 cert 'Build time autogenerated kernel key: a4cff54e0cc9179d8dcdcbb195a5dba42f5df569'
[    1.343736] zswap: loaded using pool lzo/zbud
[    1.344712] PANIC: double fault, error_code: 0x0
[    1.345495] CPU: 0 PID: 67 Comm: modprobe Not tainted 4.4.0-70-generic #91-Ubuntu
[    1.346758] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[    1.348018] task: ffff880000a30cc0 ti: ffff88001e814000 task.ti: ffff88001e814000
[    1.348468] RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
[    1.348468] RSP: 0018:00007ffe29ea5018  EFLAGS: 00010246
[    1.348468] RAX: 000000000000000c RBX: 0000000000000001 RCX: 00007f14331fa2e9
[    1.348468] RDX: 00007ffe29ea5280 RSI: 000055a0b7bd1310 RDI: 0000000000000000
[    1.348468] RBP: 000055a0b7bce040 R08: 0000000000000001 R09: 00007ffe29ea52a9
[    1.348468] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000009
[    1.348468] R13: 00007f14331e19a0 R14: 0000000000000001 R15: 0000000000001000
[    1.348468] FS:  0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
[    1.348468] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    1.348468] CR2: 00007ffe29ea4ff8 CR3: 000000001efdf000 CR4: 00000000000006f0
[    1.348468] Stack:
[    1.348468]  00007f14331f9305 00007ffe29ea5299 00007ffe29eae000 0001010000000000
[    1.348468]  0001037f00000064 0000000000000000 00000000078bfbfd 0000000000000000
[    1.348468]  000055a0b7bd1310 0000000000000000 0000000000000001 00007ffe29ea5100
[    1.348468] Call Trace:
[    1.348468]  <UNK> 
[    1.348468] Code:  Bad RIP value.
[    1.348468] Kernel panic - not syncing: Machine halted.
[    1.348468] CPU: 0 PID: 67 Comm: modprobe Not tainted 4.4.0-70-generic #91-Ubuntu
[    1.348468] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[    1.348468]  0000000000000086 00000000e094cae7 ffff88001fc04e80 ffffffff813f82b3
[    1.348468]  ffffffff81cb7ebc ffff88001fc04f18 ffff88001fc04f08 ffffffff8118d367
[    1.348468]  0000000000000008 ffff88001fc04f18 ffff88001fc04eb0 00000000e094cae7
[    1.348468] Call Trace:
[    1.348468]  <#DF>  [<ffffffff813f82b3>] dump_stack+0x63/0x90
[    1.348468]  [<ffffffff8118d367>] panic+0xd3/0x215
[    1.348468]  [<ffffffff81060d1d>] df_debug+0x2d/0x30
[    1.348468]  [<ffffffff8102fb8c>] do_double_fault+0x7c/0xf0
[    1.348468]  [<ffffffff8183e1f8>] double_fault+0x28/0x30
[    1.348468]  <<EOE>>  <UNK> 
[    1.348468] Kernel Offset: disabled
[    1.348468] ---[ end Kernel panic - not syncing: Machine halted.

(This is not the full log, because StackExchange flagged it as spam.)
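For triage, the fields that matter most in a trace like the one above are the panic reason, the process that was running, and the guest kernel version. The following is a minimal sketch for pulling those out of a saved serial log. It assumes the log is plain text in the same format as the excerpt; the function name `summarize_panic` is hypothetical and not part of any tool mentioned here.

```python
import re

# Matches the leading "[    1.344712] " kernel timestamp on a log line.
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")

# Matches e.g. "CPU: 0 PID: 67 Comm: modprobe Not tainted 4.4.0-70-generic #91-Ubuntu".
_CPU_LINE = re.compile(
    r"CPU: \d+ PID: (\d+) Comm: (\S+) .*? (\d+\.\d+\.\d+\S*) #"
)


def summarize_panic(log_text):
    """Return the first panic reason, PID, process name and kernel version found."""
    summary = {"reason": None, "pid": None, "comm": None, "kernel": None}
    for raw in log_text.splitlines():
        line = _TIMESTAMP.sub("", raw).strip()
        if summary["reason"] is None and line.startswith("PANIC:"):
            summary["reason"] = line[len("PANIC:"):].strip()
        match = _CPU_LINE.match(line)
        if match and summary["pid"] is None:
            summary["pid"] = int(match.group(1))
            summary["comm"] = match.group(2)
            summary["kernel"] = match.group(3)
    return summary
```

Run against the excerpt above, this would report a double fault in `modprobe` (PID 67) on guest kernel 4.4.0-70-generic, which is the kernel of the live ISO and not the host's.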

Package versions:

  • Kernel: 3.13.0-169-generic #219-Ubuntu
  • qemu-kvm: 2.0.0+dfsg-2ubuntu1.45
  • libvirt-bin: 1.2.2-0ubuntu13.1.27

Command executed by libvirt:

LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm-spice -name borkum -S -machine pc-i440fx-trusty,accel=kvm,usb=off -cpu Opteron_G3 -m 512 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid acee95a1-b2e2-470c-85d7-89d85aceddc4 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/borkum.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x7 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x8 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x9 -drive file=/dev/images/borkum,if=none,id=drive-virtio-disk0,format=raw,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0xa,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,id=drive-ide0-0-0,readonly=on,format=raw -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:aa:55:a8,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xb

VM configuration file:

<domain type='kvm'>
  <name>borkum</name>
  <uuid>acee95a1-b2e2-470c-85d7-89d85aceddc4</uuid>
  <memory unit='KiB'>524288</memory>
  <currentMemory unit='KiB'>524288</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-trusty'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>Opteron_G3</model>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/kvm-spice</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/images/borkum'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hda' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </controller>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <interface type='bridge'>
      <mac address='52:54:00:aa:55:a8'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
    <sound model='ich6'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </sound>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
    </memballoon>
  </devices>
</domain>
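When several guests fail in the same way, it can help to compare the handful of settings that tie a guest to the host's QEMU build and CPU: the machine type, the CPU model and the emulator binary. Below is a minimal sketch that extracts them from a domain definition like the one above using only the Python standard library. The function name `guest_profile` is hypothetical, and the choice of which fields matter is an assumption, not a diagnosis.

```python
import xml.etree.ElementTree as ET


def guest_profile(domain_xml):
    """Extract the machine type, CPU model and emulator from libvirt domain XML."""
    root = ET.fromstring(domain_xml)
    os_type = root.find("./os/type")
    cpu_model = root.find("./cpu/model")
    return {
        "name": root.findtext("name"),
        # Attributes of <type arch='...' machine='...'>; None if the element is absent.
        "arch": os_type.get("arch") if os_type is not None else None,
        "machine": os_type.get("machine") if os_type is not None else None,
        # Text of <cpu><model>; None when no explicit CPU model is configured.
        "cpu_model": cpu_model.text if cpu_model is not None else None,
        "emulator": root.findtext("./devices/emulator"),
    }
```

For the definition above this would yield machine `pc-i440fx-trusty`, CPU model `Opteron_G3` and emulator `/usr/bin/kvm-spice`. The XML for a defined guest can be obtained with `virsh dumpxml borkum`.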

In /etc/libvirt, libvirt.conf and qemu.conf contain no customized settings; everything is at its default. libvirtd.conf is stock except for the following lines:

unix_sock_group = "libvirtd"
unix_sock_ro_perms = "0777"
unix_sock_rw_perms = "0770"
auth_unix_ro = "none"
auth_unix_rw = "none"

So, how can I get these VMs booting again so that we can continue the upgrade?

Edit: I ended up installing a fresh Ubuntu 18.04 over the broken 14.04 installation, keeping the VM disks. That solved the problem, and the VMs now boot.
