Azure accelerated networking interface missing at VM initialization

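I run my application on an Azure Ubuntu 16.04 VM (packaged as a VHD image). On several occasions I have found that the SR-IOV VF interfaces are missing when the VM boots. Note that the problem is not consistently reproducible, but it has occurred on several of the Standard_F8s_v2 instances we deployed in our lab.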

The VM has 3 network interfaces attached, 2 of which have accelerated networking enabled. The instances are deployed with Terraform scripts.

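Below is a serial console snippet in which the "enp2s2" and "enp3s3" VF interfaces, which DPDK net_failsafe relies on, are missing, followed by a console snippet from another reboot in which they are present. I suspect the problem is related to the Mellanox kernel module initialization sequence, but although I collected statistics over many VM boots, I cannot confirm that this is the root cause. I also tried stopping/restarting the VM from the Azure portal and rebooting it over ssh, and the problem still shows up.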
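For anyone who wants to detect this condition from inside the guest, here is a minimal Python sketch. It assumes that a VF carries the same MAC address as its synthetic (hv_netvsc) interface, which is what the working boot in this post shows (enp2s2 shares a MAC with eth1, enp3s3 with eth2). The function names and the choice of `eth1`/`eth2` as the accelerated NICs are my own illustration, not part of any Azure tooling.

```python
from pathlib import Path


def read_interfaces(sysfs_root="/sys/class/net"):
    """Return {interface name: MAC address} for every entry under sysfs_root."""
    result = {}
    for entry in sorted(Path(sysfs_root).iterdir()):
        address_file = entry / "address"
        if address_file.is_file():
            result[entry.name] = address_file.read_text().strip().lower()
    return result


def synthetic_without_vf(interfaces, accelerated):
    """Return the accelerated synthetic interfaces that have no other
    interface sharing their MAC address, i.e. whose VF did not show up.

    `interfaces` maps name -> MAC; `accelerated` is the set of synthetic
    interface names expected to have a VF (an assumption of this sketch).
    """
    missing = []
    for name in sorted(accelerated):
        mac = interfaces.get(name)
        peers = [other for other, other_mac in interfaces.items()
                 if other != name and other_mac == mac]
        if mac is None or not peers:
            missing.append(name)
    return missing
```

Calling `synthetic_without_vf(read_interfaces(), {"eth1", "eth2"})` early in boot (for example from a systemd unit ordered before the DPDK application) would return an empty list on a good boot and the affected interface names on a bad one.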

Here is the snippet with the missing interfaces (their absence is visible in the "ci-info: Net device info" list):

[ 1.961519] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[ 1.961992] random: udevadm: uninitialized urandom read (16 bytes read)
[ 1.966624] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[ 2.013574] hv_vmbus: registering driver hv_netvsc
[ 2.017652] hv_utils: Registering HyperV Utility Driver
[ 2.021791] hv_vmbus: registering driver hv_util
[ 2.026534] hidraw: raw HID events driver (C) Jiri Kosina
[ 2.031197] hv_vmbus: registering driver hyperv_keyboard
[ 2.035793] hv_vmbus: registering driver hid_hyperv
[ 2.041572] hv_vmbus: registering driver hyperv_fb
[ 2.048342] AVX2 version of gcm_enc/dec engaged.
[ 2.052076] AES CTR mode by8 optimization enabled
[ 4.344565] hv_utils: Shutdown IC version 3.0
[ 4.348061] input: AT Translated Set 2 keyboard as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/d34b2567-b9b6-42b9-8778-0a4ec0b955bf/serio2/input/input3
[ 4.358187] input: Microsoft Vmbus HID-compliant Mouse as /devices/0006:045E:0621.0001/input/input4
[ 4.364222] hid 0006:045E:0621.0001: input: <UNKNOWN> HID v0.01 Mouse [Microsoft Vmbus HID-compliant Mouse] on
[ 4.371502] hyperv_fb: Screen resolution: 1152x864, Color depth: 32
[ 4.378396] Console: switching to colour frame buffer device 144x54
[ 4.383835] hv_utils: VSS IC version 5.0
[ 5.560104] hv_utils: Heartbeat IC version 3.0
Begin: Loading essential drivers ...
[ 6.624015] raid6: sse2x1 gen() 11383 MB/s
[ 6.672008] raid6: sse2x1 xor() 8608 MB/s
[ 6.720013] raid6: sse2x2 gen() 14144 MB/s
[ 6.768008] raid6: sse2x2 xor() 9824 MB/s
[ 6.816009] raid6: sse2x4 gen() 15697 MB/s
[ 6.864007] raid6: sse2x4 xor() 11120 MB/s
[ 6.912010] raid6: avx2x1 gen() 20281 MB/s
[ 6.960008] raid6: avx2x1 xor() 16369 MB/s
[ 7.008009] raid6: avx2x2 gen() 25744 MB/s
[ 7.056007] raid6: avx2x2 xor() 17842 MB/s
[ 7.104010] raid6: avx2x4 gen() 27707 MB/s
[ 7.152008] raid6: avx2x4 xor() 19589 MB/s
[ 7.200010] raid6: avx512x1 gen() 25916 MB/s
[ 7.248010] raid6: avx512x1 xor() 15400 MB/s
[ 7.296008] raid6: avx512x2 gen() 31486 MB/s
[ 7.344010] raid6: avx512x2 xor() 19259 MB/s
[ 7.392009] raid6: avx512x4 gen() 32802 MB/s
[ 7.440008] raid6: avx512x4 xor() 19748 MB/s
[ 7.442786] raid6: using algorithm avx512x4 gen() 32802 MB/s
[ 7.446455] raid6: .... xor() 19748 MB/s, rmw enabled
[ 7.449751] raid6: using avx512x2 recovery algorithm
[ 7.454113] xor: automatically using best checksumming function avx
[ 7.459720] async_tx: api initialized (async)
done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ...
[ 7.500972] Btrfs loaded, crc32c=crc32c-intel
Scanning for Btrfs filesystems
done.
Warning: fsck not present, so skipping root file system
[ 7.621580] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
done.
Begin: Running /scripts/local-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
[ 7.716053] random: crng init done
[ 7.718135] random: 7 urandom warning(s) missed due to ratelimiting
[ 7.754028] EXT4-fs (sda1): re-mounted. Opts: discard
Cloud-init v. 19.2-24-ge7881d5c-0ubuntu1~16.04.1 running 'init-local' at Sun, 06 Oct 2019 11:06:31 +0000. Up 8.04 seconds.
cloud-init-nonet[8.59]: static networking is now up
Cloud-init v. 19.2-24-ge7881d5c-0ubuntu1~16.04.1 running 'init' at Sun, 06 Oct 2019 11:06:32 +0000. Up 8.80 seconds.
ci-info: ++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
ci-info: | eth0 | True | 10.29.50.32 | 255.255.255.0 | global | 00:0d:3a:38:c1:f7 |
ci-info: | eth0 | True | fe80::20d:3aff:fe38:c1f7/64 | . | link | 00:0d:3a:38:c1:f7 |
ci-info: | eth1 | True | 10.29.115.10 | 255.255.255.0 | global | 00:0d:3a:38:c3:06 |
ci-info: | eth1 | True | fe80::20d:3aff:fe38:c306/64 | . | link | 00:0d:3a:38:c3:06 |
ci-info: | eth2 | True | 10.29.211.10 | 255.255.255.0 | global | 00:0d:3a:38:c9:4a |
ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . |
ci-info: | lo | True | ::1/128 | . | host | . |
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+

Another snippet, from a boot in which the interfaces are present:

[ 1.990765] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[ 1.991261] random: udevadm: uninitialized urandom read (16 bytes read)
[ 1.995831] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[ 2.042179] hv_vmbus: registering driver hv_netvsc
[ 2.042182] hv_utils: Registering HyperV Utility Driver
[ 2.050908] hv_vmbus: registering driver hv_util
[ 2.055113] hv_vmbus: registering driver hyperv_keyboard
[ 2.059757] hidraw: raw HID events driver (C) Jiri Kosina
[ 2.066167] hv_vmbus: registering driver hid_hyperv
[ 2.073141] hv_vmbus: registering driver hyperv_fb
[ 2.082086] AVX2 version of gcm_enc/dec engaged.
[ 2.086060] AES CTR mode by8 optimization enabled
[ 3.200601] hv_utils: Heartbeat IC version 3.0
[ 3.204903] input: AT Translated Set 2 keyboard as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/d34b2567-b9b6-42b9-8778-0a4ec0b955bf/serio2/input/input3
[ 3.219127] input: Microsoft Vmbus HID-compliant Mouse as /devices/0006:045E:0621.0001/input/input4
[ 3.226095] hid 0006:045E:0621.0001: input: <UNKNOWN> HID v0.01 Mouse [Microsoft Vmbus HID-compliant Mouse] on
[ 3.234239] hyperv_fb: Screen resolution: 1152x864, Color depth: 32
[ 3.241442] Console: switching to colour frame buffer device 144x54
[ 3.388646] hv_utils: Shutdown IC version 3.0
[ 3.391537] hv_utils: VSS IC version 5.0
Begin: Loading essential drivers ...
[ 4.688013] raid6: sse2x1 gen() 11377 MB/s
[ 4.736006] raid6: sse2x1 xor() 8608 MB/s
[ 4.784008] raid6: sse2x2 gen() 14143 MB/s
[ 4.832012] raid6: sse2x2 xor() 9799 MB/s
[ 4.880013] raid6: sse2x4 gen() 15731 MB/s
[ 4.928012] raid6: sse2x4 xor() 11084 MB/s
[ 4.976012] raid6: avx2x1 gen() 20223 MB/s
[ 5.024009] raid6: avx2x1 xor() 16357 MB/s
[ 5.072009] raid6: avx2x2 gen() 25798 MB/s
[ 5.120011] raid6: avx2x2 xor() 17831 MB/s
[ 5.168009] raid6: avx2x4 gen() 27715 MB/s
[ 5.216010] raid6: avx2x4 xor() 18555 MB/s
[ 5.264012] raid6: avx512x1 gen() 27142 MB/s
[ 5.312011] raid6: avx512x1 xor() 15413 MB/s
[ 5.360011] raid6: avx512x2 gen() 31448 MB/s
[ 5.408010] raid6: avx512x2 xor() 19311 MB/s
[ 5.456012] raid6: avx512x4 gen() 32843 MB/s
[ 5.504011] raid6: avx512x4 xor() 20650 MB/s
[ 5.506896] raid6: using algorithm avx512x4 gen() 32843 MB/s
[ 5.510878] raid6: .... xor() 20650 MB/s, rmw enabled
[ 5.514425] raid6: using avx512x2 recovery algorithm
[ 5.519051] xor: automatically using best checksumming function avx
[ 5.525076] async_tx: api initialized (async)
done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ...
[ 5.563208] Btrfs loaded, crc32c=crc32c-intel
Scanning for Btrfs filesystems
done.
Warning: fsck not present, so skipping root file[ 5.680481] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
system
done.
Begin: Running /scripts/local-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
[ 5.803547] EXT4-fs (sda1): re-mounted. Opts: discard
[ 5.825023] Adding 614396k swap on /var/cache/swap/swapfile. Priority:-2 extents:8 across:1114108k FS
[ 5.980054] random: crng init done
[ 5.982620] random: 7 urandom warning(s) missed due to ratelimiting
Cloud-init v. 19.2-24-ge7881d5c-0ubuntu1~16.04.1 running 'init-local' at Sun, 06 Oct 2019 12:06:17 +0000. Up 6.10 seconds.
cloud-init-nonet[6.73]: static networking is now up
Cloud-init v. 19.2-24-ge7881d5c-0ubuntu1~16.04.1 running 'init' at Sun, 06 Oct 2019 12:06:18 +0000. Up 6.97 seconds.
ci-info: ++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
ci-info: | **enp2s2** | **True** | . | . | . | 00:0d:3a:38:c3:06 |
ci-info: | **enp3s3** | **True** | . | . | . | 00:0d:3a:38:c9:4a |
ci-info: | eth0 | True | 10.29.50.32 | 255.255.255.0 | global | 00:0d:3a:38:c1:f7 |
ci-info: | eth1 | True | 10.29.115.10 | 255.255.255.0 | global | 00:0d:3a:38:c3:06 |
ci-info: | eth1 | True | fe80::20d:3aff:fe38:c306/64 | . | link | 00:0d:3a:38:c3:06 |
ci-info: | eth2 | True | 10.29.211.10 | 255.255.255.0 | global | 00:0d:3a:38:c9:4a |
ci-info: | eth2 | True | fe80::20d:3aff:fe38:c94a/64 | . | link | 00:0d:3a:38:c9:4a |
ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . |
ci-info: | lo | True | ::1/128 | . | host | . |
ci-info: +--------+------+-----------------------------+---------------+--------+-------------------+
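
Since the only visible difference between the two boots is in the ci-info table, comparing many captured serial logs can be automated. The following is a rough Python sketch that extracts the device/MAC pairs from the ci-info rows and reports which accelerated MAC addresses appear on only one device. The regular expression is written against the row layout shown above and may need adjusting for other cloud-init versions; the function names are hypothetical.

```python
import re

# Matches one ci-info row and captures the device name and hardware address.
# Optional '*' characters allow for the bold markers used in this post.
ROW = re.compile(
    r"ci-info: \|\s*\**([\w.-]+)\**\s*"          # device name
    r"\|\s*\**(?:True|False)\**\s*"              # Up column
    r"\|[^|]*\|[^|]*\|[^|]*"                     # Address, Mask, Scope
    r"\|\s*((?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s*\|"  # Hw-Address
)


def devices_by_mac(console_text):
    """Map each hardware address to the set of device names that report it."""
    table = {}
    for name, mac in ROW.findall(console_text):
        table.setdefault(mac, set()).add(name)
    return table


def macs_missing_vf(console_text, accelerated_macs):
    """Return accelerated MACs seen on fewer than two devices in this boot."""
    table = devices_by_mac(console_text)
    return sorted(mac for mac in accelerated_macs
                  if len(table.get(mac, ())) < 2)
```

Feeding each saved console log to `macs_missing_vf` with the two accelerated MAC addresses (`00:0d:3a:38:c3:06` and `00:0d:3a:38:c9:4a` here) gives a per-boot pass/fail result that can be correlated with driver load timestamps from the same log.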
