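As a VSAN beta tester, I decided to upgrade to the GA version, as VMware recommends for production sites. Since they say that upgrading from beta to GA is not possible/not supported, I did not upgrade but completely wiped and reinstalled the ESX hosts. During installation, however, I found the system to be very slow: the installer took several hours to start, and then each system scan operation took about 30-40 minutes. The installed system always hangs at the
usbarbitrator start
message.
I enabled logging to the serial console, and these are the messages I see: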
2014-03-31T20:00:54.517Z cpu2:33262)LSOMCommon: LSOM_RegisterDiskAttrHandle:99: t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710 is a SATA disk
2014-03-31T20:00:54.532Z cpu2:33262)LSOMCommon: LSOM_RegisterDiskAttrHandle:103: DiskAttrHandle:0x4111c977b928 is added to disk:t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710 by module:plog
2014-03-31T20:00:54.551Z cpu2:33262)PLOG: PLOG_InitMDDevice:830: Registered diskAttrHandle:0x4111c977b928 on disk t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710
2014-03-31T20:00:54.568Z cpu2:33262)PLOG: PLOG_AllocOneRDT:539: You're wasting 524288 bytes by not requesting a length that is not a multiple of the allocation granularity 1048576
2014-03-31T20:00:54.583Z cpu2:33262)PLOG: PLOG_InitElevator:1782: Initializing PLOG Elevator UUID 5287745f-e1c5-269f-ce67-c8d8d4c03967
2014-03-31T20:00:54.595Z cpu2:33262)LSOMCommon: LSOMSetWCEnableSATA:1071: SATA disk t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710 disabling cache...
2014-03-31T20:00:54.611Z cpu2:33262)PLOG: PLOG_InitElevator:1845: Initializing PLOG Elevator UUID on device t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710:2 5287745f-e1c5-269f-ce67-c8d8d4c03967
2014-03-31T20:00:54.630Z cpu2:33262)PLOG: PLOG_InitMDDevice:843: PLOG device t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710:2 is initialized with device handles
2014-03-31T20:01:24.648Z cpu1:32798)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x4136804461c0, 0) to dev "t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710" on path "vmhba37:C0:T0:L0" Failed: H:0x5 D:0x0 P:0x0 Possible sense $
2014-03-31T20:01:24.670Z cpu1:32798)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710" state in doubt; requested fast path state update...
2014-03-31T20:01:24.691Z cpu1:32798)ScsiDeviceIO: 2337: Cmd(0x4136804461c0) 0x28, CmdSN 0x1 from world 0 to dev "t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0.
2014-03-31T20:01:24.713Z cpu1:32798)LSOMCommon: IORETRYCompleteIO:389: Throttled: 0x4136c8c6af00 IO type 264 (READ) isOdered:NO since 30065 msec status Maximum kernel-level retries exceeded
2014-03-31T20:01:24.729Z cpu9:33541)WARNING: LSOM: LSOMEventNotify:4570: VSAN device 5287745f-e1c5-269f-ce67-c8d8d4c03967 is under permanent error.
2014-03-31T20:01:24.743Z cpu9:33541)WARNING: LSOM: LSOMPostDiskEvent:2114: Unable to post disk event for 5287745f-e1c5-269f-ce67-c8d8d4c03967: Not ready
2014-03-31T20:01:24.757Z cpu9:33541)LSOM: LSOMPublishDisk:1959: Throttled: Unable to post disk event for 5287745f-e1c5-269f-ce67-c8d8d4c03967: Not ready
2014-03-31T20:01:54.774Z cpu1:32798)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x413680441bc0, 0) to dev "t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710" on path "vmhba37:C0:T0:L0" Failed: H:0x5 D:0x0 P:0x0 Possible sense $
2014-03-31T20:01:54.797Z cpu1:32798)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710" state in doubt; requested fast path state update...
2014-03-31T20:01:54.817Z cpu1:32798)ScsiDeviceIO: 2337: Cmd(0x413680441bc0) 0x28, CmdSN 0x2 from world 0 to dev "t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2014-03-31T20:01:54.839Z cpu1:32798)LSOMCommon: IORETRYCompleteIO:389: Throttled: 0x4136c8c6ae40 IO type 264 (READ) isOdered:NO since 30063 msec status Maximum kernel-level retries exceeded
2014-03-31T20:02:05.014Z cpu15:32958)VMW_SATP_LOCAL: satp_local_updatePathStates:458: Failed to update path "vmhba37:C0:T0:L0" state. Status=Transient storage condition, suggest retry
2014-03-31T20:02:19.017Z cpu1:32798)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710" state in doubt; requested fast path state update...
2014-03-31T20:02:19.038Z cpu1:32798)ScsiDeviceIO: 2337: Cmd(0x413680444b40) 0x12, CmdSN 0x318 from world 0 to dev "t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x2 0x3a 0x0.
2014-03-31T20:02:24.857Z cpu1:32798)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x28 (0x4136804411c0, 0) to dev "t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710" on path "vmhba37:C0:T0:L0" Failed: H:0x5 D:0x0 P:0x0 Possible sense $
2014-03-31T20:02:24.879Z cpu1:32798)ScsiDeviceIO: 2337: Cmd(0x4136804411c0) 0x28, CmdSN 0x3 from world 0 to dev "t10.ATA_____WDC_WD2000FYYZ2D01UL1B0_______________________WD2DWCC1P0395710" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
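For anyone reading the log above, here is a minimal Python sketch that pulls the SCSI opcode, host status, and sense bytes out of the `ScsiDeviceIO` failure lines. The opcode and sense-key names come from the SCSI standard (0x28 is READ(10), 0x12 is INQUIRY). The host-status name for 0x5 is an assumption based on VMware's published status table, and `decode_scsi_failure` is a hypothetical helper name. Note that the log itself calls the sense bytes "Possible", so they may not be valid when the device status is 0x0.

```python
import re

# Names from the SCSI standard; only the values seen in the log are listed.
OPCODES = {0x12: "INQUIRY", 0x28: "READ(10)"}
SENSE_KEYS = {0x0: "NO SENSE", 0x2: "NOT READY", 0x5: "ILLEGAL REQUEST"}
# Assumption: host status 0x5 means the command was aborted by the host side.
HOST_STATUS = {0x0: "OK", 0x5: "ABORT"}

_HEX = r"(0x[0-9a-fA-F]+)"
FAILURE = re.compile(
    rf"Cmd\(0x[0-9a-fA-F]+\) {_HEX}, CmdSN {_HEX} .*? failed "
    rf"H:{_HEX} D:{_HEX} P:{_HEX} "
    rf"Possible sense data: {_HEX} {_HEX} {_HEX}"
)


def decode_scsi_failure(line):
    """Decode one ScsiDeviceIO failure line; return None if it does not match."""
    match = FAILURE.search(line)
    if match is None:
        return None
    opcode, cmd_sn, host, device, plugin, key, asc, ascq = (
        int(group, 16) for group in match.groups()
    )
    return {
        "command": OPCODES.get(opcode, hex(opcode)),
        "cmd_sn": cmd_sn,
        "host_status": HOST_STATUS.get(host, hex(host)),
        "device_status": device,
        "plugin_status": plugin,
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "asc": asc,
        "ascq": ascq,
    }
```

Fed the lines above, this reports failed READ(10) and INQUIRY commands, all with host status 0x5, which points at the host/driver side rather than at a status returned by the disk itself.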
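If I pull out all the disks, I no longer see these messages. I verified that all disks are readable, with no errors, no bad blocks, etc. I know my server may not be on the HCL, but the beta version worked fine; only GA has this problem.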
Answer 1
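To mark this as answered, I am repeating my comment from above as an answer:
I found the problem and its solution. It is strange, but the following helped: before installing ESXi, I booted from a Linux Live CD and checked all the disks. If a disk had gone through a full read/write test, no errors appeared during installation. So I went ahead and wiped all the drives, and the installation went smoothly. It looks to me as though VSAN started using some different mechanism or data labels, and old information was left over on the drives. I could not find any information about this error anywhere, so I am leaving this here for anyone with the same problem.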
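If you want to confirm that a drive really has nothing left on it before reinstalling, a rough Python sketch like the one below can be run from the Live CD. It assumes the wipe was a zero-fill (the exact tool used is not stated here), and `find_nonzero_blocks` is a hypothetical helper name. The device path in the usage note is a placeholder, and reading a raw device normally requires root.

```python
def find_nonzero_blocks(path, block_size=1024 * 1024, limit=10):
    """Return the byte offsets of up to `limit` blocks that contain non-zero data.

    An empty list means every block read from `path` was all zeros.
    """
    offsets = []
    with open(path, "rb") as dev:
        offset = 0
        while len(offsets) < limit:
            chunk = dev.read(block_size)
            if not chunk:
                break  # end of device or file
            # A block is clean only if every byte in it is zero.
            if chunk.count(0) != len(chunk):
                offsets.append(offset)
            offset += len(chunk)
    return offsets
```

For example, `find_nonzero_blocks("/dev/sdX")` returning `[]` would mean the whole device read back as zeros. Any offsets it does return show where leftover data, such as old partition tables or VSAN metadata, may still sit.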