ESXi 7.0 with a mounted NFSv4 datastore fails to provision OVA; NFSv3 works fine

We have an Ubuntu 20.04 host using ZFS, with the following sharenfs options:

root@host:~# zfs get sharenfs pool/enc/esxi
NAME         PROPERTY  VALUE                                                                              SOURCE
pool/enc/esxi  sharenfs  rw=x.x.x.x,no_subtree_check,async,anonuid=0,anongid=0,all_squash  local

root@host:~# exportfs -v | grep esxi
/pool/enc/esxi    x.x.x.x(rw,async,wdelay,root_squash,all_squash,no_subtree_check,mountpoint,anonuid=0,anongid=0,sec=sys,rw,secure,root_squash,all_squash)

When we try to create a new VM from an OVA, the operation fails:

The Web UI shows "Failed to deploy VM: postNFCData failed:". It never starts uploading the disks; it appears to fail at the creation stage.

vmkernel.log shows:

2022-05-04T09:33:29.859Z cpu7:1051648 opID=85e2477a)NFS41: NFS41_VSIMountSet:405: Mount server: nfshost, port: 2049, path: /pool/enc/esxi, label: NFS, security: 1 user: , options: <none>
2022-05-04T09:33:29.859Z cpu7:1051648 opID=85e2477a)StorageApdHandler: 966: APD Handle  Created with lock[StorageApd-0x4313e6003970]
2022-05-04T09:33:29.859Z cpu7:1051648 opID=85e2477a)NFS41: NFS41_ConnectionLookup:804: Created new connection for address tcp nfshost.8.1
2022-05-04T09:33:29.860Z cpu10:1049211)NFS41: NFS41ProcessExidResult:2314: clientid 4f2a53628e14edb1 roles 0x20000
2022-05-04T09:33:29.860Z cpu10:1049213)NFS41: NFS41ProcessSessionUp:2380: Cluster 0x4313e6004a40[2] clidValid:0 clusterAPDState:0 received clientID 4f2a53628e14edb1
2022-05-04T09:33:29.860Z cpu10:1049213)NFS41: NFS41ProcessSessionUp:2393: Cluster 0x4313e6004a40[2] set with new valid clientID 4f2a53628e14edb1
2022-05-04T09:33:29.860Z cpu10:1049213)NFS41: NFS41ProcessClusterProbeResult:4186: Reclaiming state, cluster 0x4313e6004a40 [2]
2022-05-04T09:33:29.872Z cpu7:1051648 opID=85e2477a)NFS41: NFS41FSCompleteMount:3966: Lease time: 90
2022-05-04T09:33:29.872Z cpu7:1051648 opID=85e2477a)NFS41: NFS41FSCompleteMount:3967: Max read xfer size: 0x3fc00
2022-05-04T09:33:29.872Z cpu7:1051648 opID=85e2477a)NFS41: NFS41FSCompleteMount:3968: Max write xfer size: 0x3fc00
2022-05-04T09:33:29.872Z cpu7:1051648 opID=85e2477a)NFS41: NFS41FSCompleteMount:3969: Max file size: 0x7fffffffffffffff
2022-05-04T09:33:29.872Z cpu7:1051648 opID=85e2477a)NFS41: NFS41FSCompleteMount:3970: Max file name: 255
2022-05-04T09:33:29.872Z cpu7:1051648 opID=85e2477a)WARNING: NFS41: NFS41FSCompleteMount:3975: The max file name size (255) of file system is larger than that of FSS (128)
2022-05-04T09:33:29.873Z cpu7:1051648 opID=85e2477a)NFS41: NFS41FSAPDNotify:6188: Restored connection to the server nfshost mount point NFS, mounted as 507f1811-40137e33-0000-000000000000 ("/pool/enc/esxi")
2022-05-04T09:33:29.873Z cpu7:1051648 opID=85e2477a)NFS41: NFS41_VSIMountSet:417: NFS mounted successfully
2022-05-04T09:35:05.436Z cpu3:1048746)StorageDevice: 7059: End path evaluation for device t10.NVMe____WDC_CL_SN720_XXXXXXXXXXXXXXXXX__________XXXXXX448XXXXXXX
2022-05-04T09:35:05.437Z cpu3:1048746)StorageDevice: 7059: End path evaluation for device t10.NVMe____WDC_CL_SN720_XXXXXXXXXXXXXXXXX__________XXXXXX448XXXXXXX
2022-05-04T09:38:18.355Z cpu3:1051646 opID=a65fad89)World: 12075: VC opID esxui-8e02-4c35 maps to vmkernel opID a65fad89
2022-05-04T09:38:18.355Z cpu3:1051646 opID=a65fad89)WARNING: NFS41: NFS41FileDoCloseFile:3128: file handle close on obj 0x4305bc5cad10 failed: Stale file handle
2022-05-04T09:38:18.355Z cpu3:1051646 opID=a65fad89)WARNING: NFS41: NFS41FileOpCloseFile:3718: NFS41FileCloseFile failed: Stale file handle
2022-05-04T09:38:18.411Z cpu3:1051646 opID=a65fad89)WARNING: NFS41: NFS41FileDoCloseFile:3128: file handle close on obj 0x4305bc5aef70 failed: Stale file handle
2022-05-04T09:38:18.411Z cpu3:1051646 opID=a65fad89)WARNING: NFS41: NFS41FileOpCloseFile:3718: NFS41FileCloseFile failed: Stale file handle
2022-05-04T09:38:19.909Z cpu1:1054212 opID=6d39243b)World: 12075: VC opID esxui-e417-4c55 maps to vmkernel opID 6d39243b
2022-05-04T09:38:19.909Z cpu1:1054212 opID=6d39243b)VmMemXfer: vm 1054212: 2465: Evicting VM with path:/vmfs/volumes/507f1811-40137e33-0000-000000000000/x/x.vmx
2022-05-04T09:38:19.909Z cpu1:1054212 opID=6d39243b)VmMemXfer: 209: Creating crypto hash
2022-05-04T09:38:19.909Z cpu1:1054212 opID=6d39243b)VmMemXfer: vm 1054212: 2479: Could not find MemXferFS region for /vmfs/volumes/507f1811-40137e33-0000-000000000000/x/x.vmx
2022-05-04T09:38:19.929Z cpu1:1054212 opID=6d39243b)VmMemXfer: vm 1054212: 2465: Evicting VM with path:/vmfs/volumes/507f1811-40137e33-0000-000000000000/x/x.vmx
2022-05-04T09:38:19.929Z cpu1:1054212 opID=6d39243b)VmMemXfer: 209: Creating crypto hash
2022-05-04T09:38:19.930Z cpu1:1054212 opID=6d39243b)VmMemXfer: vm 1054212: 2479: Could not find MemXferFS region for /vmfs/volumes/507f1811-40137e33-0000-000000000000/x/x.vmx
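Most of the log above is routine mount chatter. As a rough sketch, the warnings from the NFS 4.1 client can be isolated with a plain grep. The function name `nfs41_warnings` is made up for illustration, and the log path in the comment is an assumption based on where ESXi normally keeps vmkernel.log:

```shell
# Keep only NFS 4.1 client warnings from vmkernel log text read on stdin.
# nfs41_warnings is a hypothetical helper name, not an ESXi tool.
nfs41_warnings() {
    grep 'WARNING: NFS41:'
}

# Assumed usage on the ESXi host (log path is an assumption):
# nfs41_warnings < /var/log/vmkernel.log
```

Applied to the excerpt above, this would leave the max-file-name warning from mount time and the four "Stale file handle" close failures logged at 09:38:18, which is when the OVA deployment was attempted.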

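Everything else works: the system has been running multiple VMs on NFS for some time without any problems. We can work around the OVA failure by provisioning to a local non-NFS datastore and then copying the resulting VM from the local datastore to the NFS datastore, after which it boots without issue.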

Even so, I would like to find the root cause.

So far I have tried the following (rebooting ESXi each time):

  • Setting the NFS share to sync / async
  • Setting the NFS share to no_wdelay / wdelay
  • Combinations of the above

None of these resolved the issue.
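For reference, a minimal sketch of the combinations tried, assuming the options were changed through the ZFS sharenfs property shown earlier (the post does not say how they were actually applied). It only prints the commands; `x.x.x.x` is the redacted client address from the output above, and `sharenfs_cmd` is a hypothetical helper name:

```shell
# Dry run: prints the zfs commands instead of executing them.
DATASET="pool/enc/esxi"
# Options common to every variant, taken from the sharenfs value above
# with "async" removed so it can be varied.
BASE="rw=x.x.x.x,no_subtree_check,anonuid=0,anongid=0,all_squash"

sharenfs_cmd() {
    # $1: sync or async; $2: wdelay or no_wdelay
    printf 'zfs set sharenfs=%s,%s,%s %s\n' "$BASE" "$1" "$2" "$DATASET"
}

for sync_mode in sync async; do
    for delay_mode in wdelay no_wdelay; do
        sharenfs_cmd "$sync_mode" "$delay_mode"
    done
done
```

After each change, `exportfs -v | grep esxi` on the Ubuntu host shows whether the new options actually took effect.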

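I then tried removing the NFS datastore and re-adding it, this time selecting NFS v3, and provisioned the OVA again. It worked: after a short wait the OVA upload completed successfully, and the new VM booted fine.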

I rebooted ESXi to verify that this was not a fluke, and OVA provisioning still worked.

I then removed the NFS datastore and re-added it, this time selecting v4 as before, and the problem returned.
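The same switch can probably be reproduced from the ESXi shell. The post does not say whether the Web UI or the command line was used, so the following is only a sketch. The server name, share path, and datastore label are taken from the mount line in vmkernel.log; `remount_cmds` is a hypothetical helper that prints the esxcli commands rather than running them:

```shell
# Dry run: prints the esxcli commands instead of executing them.
NFS_HOST="nfshost"          # server name from the vmkernel.log mount line
NFS_SHARE="/pool/enc/esxi"  # exported path from the same line
DS_NAME="NFS"               # datastore label from the same line

remount_cmds() {
    # $1: namespace the datastore is currently mounted with (nfs or nfs41)
    # $2: namespace to remount it with (nfs = NFSv3, nfs41 = NFS 4.1)
    printf 'esxcli storage %s remove -v %s\n' "$1" "$DS_NAME"
    printf 'esxcli storage %s add -H %s -s %s -v %s\n' "$2" "$NFS_HOST" "$NFS_SHARE" "$DS_NAME"
}

remount_cmds nfs41 nfs   # NFS 4.1 -> NFSv3 (provisioning worked)
remount_cmds nfs nfs41   # NFSv3 -> NFS 4.1 (provisioning failed again)
```

`esxcli storage nfs list` and `esxcli storage nfs41 list` show which protocol version the datastore is currently mounted with.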

So, for whatever reason, provisioning works on NFSv3 but not on NFSv4.

How can I make OVA provisioning work on an NFSv4 ESXi datastore the way it does on an NFSv3 one?
