Why is an oVirt VM with a ceph disk stuck in "Wait for launch"?

My setup consists of ceph mimic (CentOS 7, deployed with ceph-ansible), a cinder/keystone combination on the Pike release, and oVirt 4.2.5.1.

The external Cinder provider is set up and I can create disks.

When I create a VM and start it, the VM just shows up as "Wait for launch" in the oVirt dashboard.

On the oVirt node that is supposed to run the VM, I checked libvirt:

# virsh --readonly list

 Id    Name                           State
----------------------------------------------------
14    testceph                       paused

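Since the domain is paused rather than running, it can help to ask libvirt why and to peek at the guest's qemu log. This is only a small sketch, assuming the domain name testceph from the listing above and libvirt's default log location:

# virsh --readonly domstate testceph --reason
# tail -n 20 /var/log/libvirt/qemu/testceph.log
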
Inspecting the domain configuration looks fine too... most importantly, the ceph mons are listed in the disk configuration.

# virsh --readonly dumpxml

<domain type='kvm' id='15'>
  <name>testceph</name>
  <uuid>036a2385-2b4f-48f9-bcf9-8f2882ecde36</uuid>
  <metadata xmlns:ns0="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
    <ns0:qos/>
    <ovirt-vm:vm xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
    <ovirt-vm:clusterVersion>4.2</ovirt-vm:clusterVersion>
    <ovirt-vm:destroy_on_reboot type="bool">False</ovirt-vm:destroy_on_reboot>
    <ovirt-vm:launchPaused>false</ovirt-vm:launchPaused>
    <ovirt-vm:memGuaranteedSize type="int">2730</ovirt-vm:memGuaranteedSize>
    <ovirt-vm:minGuaranteedMemoryMb type="int">2730</ovirt-vm:minGuaranteedMemoryMb>
    <ovirt-vm:resumeBehavior>auto_resume</ovirt-vm:resumeBehavior>
    <ovirt-vm:startTime type="float">1535016868.02</ovirt-vm:startTime>
    <ovirt-vm:device mac_address="00:1a:4a:16:01:78">
        <ovirt-vm:specParams/>
        <ovirt-vm:vm_custom/>
    </ovirt-vm:device>
</ovirt-vm:vm>
  </metadata>
  <maxMemory slots='16' unit='KiB'>16777216</maxMemory>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static' current='1'>16</vcpu>
  <iothreads>1</iothreads>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>oVirt</entry>
      <entry name='product'>oVirt Node</entry>
      <entry name='version'>7-5.1804.el7.centos.2</entry>
      <entry name='serial'>49434D53-0200-9031-2500-31902500FB7F</entry>
      <entry name='uuid'>036a2385-2b4f-48f9-bcf9-8f2882ecde36</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
  </features>
  <cpu mode='custom' match='exact' check='partial'>
    <model fallback='forbid'>Nehalem</model>
    <topology sockets='16' cores='1' threads='1'/>
    <numa>
      <cell id='0' cpus='0' memory='4194304' unit='KiB'/>
    </numa>
  </cpu>
  <clock offset='variable' adjustment='0' basis='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' error_policy='report'/>
      <source file='/rhev/data-center/mnt/192.168.10.6:_media_ovirt-cd-images/c394242c-81ae-4d6a-a193-65157cc84702/images/11111111-1111-1111-1111-111111111111/ubuntu-server-18.04.iso' startupPolicy='optional'/>
      <backingStore/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <boot order='2'/>
      <alias name='ua-fcca0dff-d833-4f28-b782-78ce0b016afe'/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <disk type='network' device='disk' snapshot='no'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads'/>
      <auth username='cinder'>
        <secret type='ceph' uuid='c6020051-6cd3-4ddf-982e-3d94c080de9c'/>
      </auth>
      <source protocol='rbd' name='volumes/volume-9b95b28c-9eec-4110-9973-88c161d3503f'>
        <host name='192.168.20.21' port='6789'/>
        <host name='192.168.20.22' port='6789'/>
        <host name='192.168.20.23' port='6789'/>
      </source>
      <target dev='sda' bus='scsi'/>
      <serial>9b95b28c-9eec-4110-9973-88c161d3503f</serial>
      <boot order='1'/>
      <alias name='ua-9b95b28c-9eec-4110-9973-88c161d3503f'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='usb' index='0' model='piix3-uhci'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='scsi' index='0'>
      <alias name='scsi0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='00:1a:4a:16:01:78'/>
      <source bridge='ovirtmgmt'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <filterref filter='vdsm-no-mac-spoofing'/>
      <link state='up'/>
      <mtu size='1500'/>
      <alias name='ua-446090cf-6758-4b3d-bd87-eb8b61442a46'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/036a2385-2b4f-48f9-bcf9-8f2882ecde36.ovirt-guest-agent.0'/>
      <target type='virtio' name='ovirt-guest-agent.0'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/036a2385-2b4f-48f9-bcf9-8f2882ecde36.org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <alias name='channel1'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <alias name='channel2'/>
      <address type='virtio-serial' controller='0' bus='0' port='3'/>
    </channel>
    <input type='mouse' bus='ps2'>
      <alias name='input0'/>
    </input>
    <input type='keyboard' bus='ps2'>
      <alias name='input1'/>
    </input>
    <graphics type='spice' port='5900' tlsPort='5901' autoport='yes' listen='192.168.10.11' passwdValidTo='1970-01-01T00:00:01'>
      <listen type='network' address='192.168.10.11' network='vdsm-ovirtmgmt'/>
      <channel name='main' mode='secure'/>
      <channel name='display' mode='secure'/>
      <channel name='inputs' mode='secure'/>
      <channel name='cursor' mode='secure'/>
      <channel name='playback' mode='secure'/>
      <channel name='record' mode='secure'/>
      <channel name='smartcard' mode='secure'/>
      <channel name='usbredir' mode='secure'/>
    </graphics>
    <graphics type='vnc' port='5902' autoport='yes' listen='192.168.10.11' keymap='en-us' passwdValidTo='1970-01-01T00:00:01'>
      <listen type='network' address='192.168.10.11' network='vdsm-ovirtmgmt'/>
    </graphics>
    <video>
      <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
      <alias name='ua-31537f3a-f1cf-4269-839c-bc82721ff7f3'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <stats period='5'/>
      <alias name='ua-beeefd9c-c2bd-4836-83f0-a28657219b3e'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
    <rng model='virtio'>
      <backend model='random'>/dev/urandom</backend>
      <alias name='ua-7d25e1b5-4d03-4bc3-80f6-83a80d69b391'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </rng>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c565,c625</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c565,c625</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+107:+107</label>
    <imagelabel>+107:+107</imagelabel>
  </seclabel>
</domain>

And yes, I can see the oVirt node and the ceph node talking to each other. But the oVirt node only talks to the ceph monitors; no OSDs are ever involved. Both nodes can reach each other, jumbo frames are enabled, and the network works fine. In the output below, the ceph monitor is 192.168.20.21 and the oVirt node is 192.168.20.11.

[root@ceph1 ~]# tcpflow -c -i enp3s0 src or dst host 192.168.20.11
tcpflow: listening on enp3s0
192.168.020.021.06789-192.168.020.011.44734: ceph v027
192.168.020.011.44734-192.168.020.021.06789: ceph v027
192.168.020.011.44734-192.168.020.021.06789: *D
192.168.020.011.44734-192.168.020.021.06789:
192.168.020.021.06789-192.168.020.011.44734:
192.168.020.021.06789-192.168.020.011.44736: ceph v027
192.168.020.011.44736-192.168.020.021.06789: ceph v027
192.168.020.011.44736-192.168.020.021.06789: *D
192.168.020.021.06789-192.168.020.011.44736:
192.168.020.021.06789-192.168.020.011.44738: ceph v027
192.168.020.011.44738-192.168.020.021.06789: ceph v027
192.168.020.011.44738-192.168.020.021.06789: *D!
192.168.020.021.06789-192.168.020.011.44738:
192.168.020.021.06789-192.168.020.011.44740: ceph v027
192.168.020.011.44740-192.168.020.021.06789: ceph v027
192.168.020.011.44740-192.168.020.021.06789: *D"
192.168.020.021.06789-192.168.020.011.44740:
192.168.020.021.06789-192.168.020.011.44742: ceph v027
192.168.020.011.44742-192.168.020.021.06789: ceph v027
192.168.020.011.44742-192.168.020.021.06789: *D#
192.168.020.021.06789-192.168.020.011.44742:
192.168.020.021.06789-192.168.020.011.44754: ceph v027
192.168.020.011.44754-192.168.020.021.06789: ceph v027
192.168.020.011.44754-192.168.020.021.06789: *D)
192.168.020.021.06789-192.168.020.011.44754:

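To rule out a plain network or firewall problem towards the OSDs, one could probe ceph's default OSD port range (6800-7300) directly from the oVirt node. This is only a sketch: it assumes the OSDs are colocated with the monitor at 192.168.20.21 and uses bash's built-in /dev/tcp so no extra tools are needed.

# timeout 3 bash -c '< /dev/tcp/192.168.20.21/6800' && echo "port open" || echo "closed or filtered"
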
This keeps going until the qemu log on the oVirt node finally reports:

2018-08-23T09:34:31.233493Z qemu-kvm: -drive file=rbd:volumes/volume-9b95b28c-9eec-4110-9973-88c161d3503f:id=cinder:auth_supported=cephx\;none:mon_host=192.168.20.21\:6789\;192.168.20.22\:6789\;192.168.20.23\:6789,file.password-secret=ua-9b95b28c-9eec-4110-9973-88c161d3503f-secret0,format=raw,if=none,id=drive-ua-9b95b28c-9eec-4110-9973-88c161d3503f,serial=9b95b28c-9eec-4110-9973-88c161d3503f,cache=none,werror=stop,rerror=stop,aio=threads: 'serial' is deprecated, please use the corresponding option of '-device' instead
2018-08-23T09:39:31.281126Z qemu-kvm: -drive file=rbd:volumes/volume-9b95b28c-9eec-4110-9973-88c161d3503f:id=cinder:auth_supported=cephx\;none:mon_host=192.168.20.21\:6789\;192.168.20.22\:6789\;192.168.20.23\:6789,file.password-secret=ua-9b95b28c-9eec-4110-9973-88c161d3503f-secret0,format=raw,if=none,id=drive-ua-9b95b28c-9eec-4110-9973-88c161d3503f,serial=9b95b28c-9eec-4110-9973-88c161d3503f,cache=none,werror=stop,rerror=stop,aio=threads: 
error connecting: Connection timed out
2018-08-23 09:39:31.291+0000: shutting down, reason=failed

So what is keeping oVirt and ceph from communicating? Creating the disk image through cinder worked, but somehow oVirt cannot reach the ceph OSDs...
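
One way to narrow this down outside of qemu would be to talk to the cluster from the oVirt node with the rbd CLI, using the same cinder identity and monitors that appear in the domain XML. A sketch; it assumes ceph-common is installed on the node, and the keyring path below is an assumption that needs adjusting to wherever the cinder key actually lives:

# note: the keyring path below is an assumption; adjust it to the node's actual cinder keyring
# rbd --id cinder --keyring /etc/ceph/ceph.client.cinder.keyring \
      -m 192.168.20.21,192.168.20.22,192.168.20.23 ls volumes

If this also hangs or times out, the problem is on the node itself (library or network) rather than in the VM definition.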

Answer 1

The problem seems to be the librbd1 library shipped with CentOS 7, which oVirt 4.2.x is based on. It is too old to work with ceph v13.x (aka mimic), and possibly also with v12 (luminous).
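
To confirm this on the affected node, one could check which librbd1 is actually installed and that qemu-kvm links against it (the stock CentOS 7 base repository ships a 0.94.x, hammer-era librbd1):

# rpm -q librbd1 librados2
# ldd /usr/libexec/qemu-kvm | grep -E 'librbd|librados'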

This thread from the oVirt users forum has the discussion about the issue.

There is a way to upgrade the library.
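
One possible route, sketched below, is to pull newer librbd1/librados2 packages from the CentOS Storage SIG repository and then start the VM again so that a fresh qemu process picks up the updated library. The exact release package named here is an assumption; the forum thread above describes the approach that was actually taken.

# note: centos-release-ceph-luminous (Storage SIG release package) is an assumption
# yum install -y centos-release-ceph-luminous
# yum update -y librbd1 librados2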
