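If I understand correctly, the core dump below means that cpu4 crashed the host.
Reading the next line, it looks like CPU 4 was assigned to the NexentaStor VM at the time... so if I'm not mistaken, I can say that the NexentaStor VM crashed my ESXi host.
Am I right?
Can this core dump give me any more information?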
2012-11-14T03:48:01.046Z cpu4:6089)0x41221f25ba08:[0x41803007abff]PanicvPanicInt@vmkernel#nover+0x56 stack: 0x3000000008, 0x41221f25ba
2012-11-14T03:48:01.046Z cpu4:6089)0x41221f25bae8:[0x41803007b4a7]Panic@vmkernel#nover+0xae stack: 0x2e067c00000010, 0x0, 0x1f25bb38,
2012-11-14T03:48:01.047Z cpu4:6089)0x41221f25bc18:[0x4180300a7823]TLBDoInvalidate@vmkernel#nover+0x45a stack: 0xca, 0x0, 0x0, 0x0, 0x0
2012-11-14T03:48:01.047Z cpu4:6089)0x41221f25bc68:[0x418030489e17]UserMem_CartelFlush@<None>#<None>+0xce stack: 0xcaa0b, 0x0, 0x0, 0x4
2012-11-14T03:48:01.047Z cpu4:6089)0x41221f25bd78:[0x41803048ab91]UserMemUnmapStateCleanup@<None>#<None>+0x58 stack: 0x0, 0x41221f25bd
2012-11-14T03:48:01.047Z cpu4:6089)0x41221f25be58:[0x41803048b97d]UserMemUnmap@<None>#<None>+0x104 stack: 0x41221f267000, 0x41221f25bf
2012-11-14T03:48:01.048Z cpu4:6089)0x41221f25be98:[0x41803048bf20]UserMem_Unmap@<None>#<None>+0xe3 stack: 0x426, 0x0, 0x41221f25bef8,
2012-11-14T03:48:01.048Z cpu4:6089)0x41221f25beb8:[0x4180304a5985]UW64VMKSyscallUnpackReleasePhysMemMap@<None>#<None>+0x18 stack: 0x10
2012-11-14T03:48:01.048Z cpu4:6089)0x41221f25bef8:[0x418030476791]User_LinuxSyscallHandler@<None>#<None>+0x17c stack: 0x41803004cc70,
2012-11-14T03:48:01.048Z cpu4:6089)0x41221f25bf18:[0x4180300a82be]User_LinuxSyscallHandler@vmkernel#nover+0x19 stack: 0x3ffe63bed80, 0
2012-11-14T03:48:01.049Z cpu4:6089)0x41221f25bf28:[0x418030110064]gate_entry@vmkernel#nover+0x63 stack: 0x10b, 0x0, 0x0, 0x426, 0xcf76
2012-11-14T03:48:01.049Z cpu4:6089)VMware ESXi 5.1.0 [Releasebuild-799733 x86_64]
PCPU 1 locked up. Failed to ack TLB invalidate (total of 1 locked up, PCPU(s): 1).
2012-11-14T03:48:01.050Z cpu4:6089)cr0=0x80010031 cr2=0xcaa0b750 cr3=0x197d7b000 cr4=0x42768
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:0 world:6111 name:"vmm0:Windows_2012_-_SQL" (V)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:1 world:6032 name:"vmm0:Windows_2012_-_AD" (V)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:2 world:6098 name:"vmm0:Windows_2012_-_App" (V)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:3 world:4099 name:"idle3" (IS)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:4 world:6089 name:"vmx-vcpu-0:NexentaStor" (U)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:5 world:6134 name:"vmm0:Ubuntu_-_NGINX" (V)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:6 world:4102 name:"idle6" (IS)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:7 world:4103 name:"idle7" (IS)
2012-11-14T03:48:01.050Z cpu4:6089)@BlueScreen: PCPU 1 locked up. Failed to ack TLB invalidate (total of 1 locked up, PCPU(s): 1).
Edit
The host ran for 2.5 days, but then crashed again with the following error:
2012-11-16T16:15:35.233Z cpu6:4102)World: 8381: PRDA 0x418041800000 ss 0x0 ds 0x4018 es 0x4018 fs 0x4018 gs 0x4018
2012-11-16T16:15:35.233Z cpu6:4102)World: 8383: TR 0x4020 GDT 0x4122001a1000 (0x402f) IDT 0x41800b112000 (0xfff)
2012-11-16T16:15:35.233Z cpu6:4102)World: 8384: CR0 0x80010031 CR3 0x125f24000 CR4 0x42768
2012-11-16T16:15:35.238Z cpu6:4102)Backtrace for current CPU #6, worldID=4102, ebp=0x41220019bc10
2012-11-16T16:15:35.239Z cpu6:4102)0x41220019bc10:[0x41800b052105]IRQ_DoInterrupt@vmkernel#nover+0x5c stack: 0x0, 0x418041800180, 0x0,
2012-11-16T16:15:35.239Z cpu6:4102)0x41220019bc50:[0x41800b04bd92]IDT_IntrHandler@vmkernel#nover+0x139 stack: 0x41220019bd68, 0x41800b
2012-11-16T16:15:35.239Z cpu6:4102)0x41220019bc60:[0x41800b110064]gate_entry@vmkernel#nover+0x63 stack: 0x4018, 0x4018, 0x0, 0x0, 0x0
2012-11-16T16:15:35.240Z cpu6:4102)0x41220019bd68:[0x41800b2dbd6f]Power_HaltPCPU@vmkernel#nover+0x276 stack: 0x41220019be68, 0x4122001
2012-11-16T16:15:35.240Z cpu6:4102)0x41220019be68:[0x41800b1bd114]CpuSchedIdleLoopInt@vmkernel#nover+0x873 stack: 0x41220019be98, 0x41
2012-11-16T16:15:35.240Z cpu6:4102)0x41220019be78:[0x41800b1c66ae]CpuSched_IdleLoop@vmkernel#nover+0x15 stack: 0x6, 0x6, 0x41220019bfe
2012-11-16T16:15:35.241Z cpu6:4102)0x41220019be98:[0x41800b04f6ce]Init_SlaveIdle@vmkernel#nover+0x49 stack: 0x0, 0x0, 0x0, 0x0, 0x0
2012-11-16T16:15:35.241Z cpu6:4102)0x41220019bfe8:[0x41800b2e1f86]SMPSlaveIdle@vmkernel#nover+0x31d stack: 0x0, 0x0, 0x0, 0x0, 0x0
2012-11-16T16:15:35.241Z cpu6:4102)VMware ESXi 5.1.0 [Releasebuild-799733 x86_64]
#PF Exception 14 in world 4102:idle6 IP 0x41800b052105 addr 0x417fd1837b01
2012-11-16T16:15:35.242Z cpu6:4102)cr0=0x8001003d cr2=0x417fd1837b01 cr3=0xcdff6000 cr4=0x216c
2012-11-16T16:15:35.242Z cpu6:4102)frame=0x41220019bae0 ip=0x41800b052105 err=0 rflags=0x10006
2012-11-16T16:15:35.242Z cpu6:4102)rax=0x66f1400 rbx=0x41220019bc50 rcx=0x41800b2dbd6f
2012-11-16T16:15:35.242Z cpu6:4102)rdx=0x417fcb146700 rbp=0x41220019bc10 rsi=0x41220019bc70
2012-11-16T16:15:35.242Z cpu6:4102)rdi=0x19bc50 r8=0x4100018d29b0 r9=0x4ca88b
2012-11-16T16:15:35.242Z cpu6:4102)r10=0xdf r11=0x1 r12=0x4122001a7000
2012-11-16T16:15:35.242Z cpu6:4102)r13=0x19bc50 r14=0x41220019bc70 r15=0x1
2012-11-16T16:15:35.242Z cpu6:4102)pcpu:0 world:6211 name:"vmm1:Windows_2012_-_SQL" (V)
2012-11-16T16:15:35.242Z cpu6:4102)pcpu:1 world:4109 name:"directMapUnmap" (S)
2012-11-16T16:15:35.242Z cpu6:4102)pcpu:2 world:6255 name:"vmm0:NexentaStor" (V)
2012-11-16T16:15:35.242Z cpu6:4102)pcpu:3 world:6194 name:"vmm0:Windows_2012_-_App" (V)
2012-11-16T16:15:35.242Z cpu6:4102)pcpu:4 world:6207 name:"vmm0:Windows_2012_-_SQL" (V)
2012-11-16T16:15:35.242Z cpu6:4102)pcpu:5 world:5855 name:"vmm0:Windows_2012_-_AD" (V)
2012-11-16T16:15:35.242Z cpu6:4102)pcpu:6 world:4102 name:"idle6" (IS)
2012-11-16T16:15:35.242Z cpu6:4102)pcpu:7 world:4103 name:"idle7" (IS)
2012-11-16T16:15:35.242Z cpu6:4102)@BlueScreen: #PF Exception 14 in world 4102:idle6 IP 0x41800b052105 addr 0x417fd1837b01
2012-11-16T16:15:35.242Z cpu6:4102)Code start: 0x41800b000000 VMK uptime: 2:11:18:25.729
2012-11-16T16:15:35.242Z cpu6:4102)0x41220019bc10:[0x41800b052105]IRQ_DoInterrupt@vmkernel#nover+0x5c stack: 0x0
2012-11-16T16:15:35.243Z cpu6:4102)0x41220019bc50:[0x41800b04bd92]IDT_IntrHandler@vmkernel#nover+0x139 stack: 0x41220019bd68
2012-11-16T16:15:35.243Z cpu6:4102)0x41220019bc60:[0x41800b110064]gate_entry@vmkernel#nover+0x63 stack: 0x4018
2012-11-16T16:15:35.244Z cpu6:4102)0x41220019bd68:[0x41800b2dbd6f]Power_HaltPCPU@vmkernel#nover+0x276 stack: 0x41220019be68
2012-11-16T16:15:35.244Z cpu6:4102)0x41220019be68:[0x41800b1bd114]CpuSchedIdleLoopInt@vmkernel#nover+0x873 stack: 0x41220019be98
2012-11-16T16:15:35.244Z cpu6:4102)0x41220019be78:[0x41800b1c66ae]CpuSched_IdleLoop@vmkernel#nover+0x15 stack: 0x6
2012-11-16T16:15:35.245Z cpu6:4102)0x41220019be98:[0x41800b04f6ce]Init_SlaveIdle@vmkernel#nover+0x49 stack: 0x0
2012-11-16T16:15:35.245Z cpu6:4102)0x41220019bfe8:[0x41800b2e1f86]SMPSlaveIdle@vmkernel#nover+0x31d stack: 0x0
2012-11-16T16:15:35.247Z cpu6:4102)base fs=0x0 gs=0x418041800000 Kgs=0x0
2012-11-16T16:15:35.247Z cpu6:4102)vmkernel 0x0 .data 0x0 .bss 0x0
Is this related to the previous crash, or is it just another sign that the host is unstable?
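Answer 1
VMware's article on debugging the "Failed to ack TLB invalidate" purple screen: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1020214
This sounds like a hardware problem; you will probably need a VMware technician to investigate.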
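When reading a dump like the first one, it can help to separate the CPU that reported the panic (the `cpu4:6089)` prefix) from the PCPU that the message says locked up, and then look up what each was running in the `pcpu:N world:...` table. The following is only a rough sketch of that lookup in Python. The regular expressions are assumptions based solely on the line formats shown in the question, and the function names `parse_pcpu_table` and `locked_pcpus` are made up for illustration; this is not a VMware tool.

```python
import re

# Patterns inferred from the log lines in the question (an assumption,
# not an official format description).
PCPU_LINE = re.compile(r'pcpu:(\d+) world:(\d+) name:"([^"]+)" \((\w+)\)')
LOCKED = re.compile(r'PCPU\(s\): ([\d, ]+)\)')

def parse_pcpu_table(log_text):
    """Map PCPU number -> (world id, world name, state flag)."""
    table = {}
    for m in PCPU_LINE.finditer(log_text):
        table[int(m.group(1))] = (int(m.group(2)), m.group(3), m.group(4))
    return table

def locked_pcpus(log_text):
    """Return the sorted PCPU numbers reported as locked up."""
    found = set()
    for m in LOCKED.finditer(log_text):
        found.update(int(x) for x in m.group(1).split(",") if x.strip())
    return sorted(found)

# A few lines copied from the first dump in the question.
SAMPLE = '''
PCPU 1 locked up. Failed to ack TLB invalidate (total of 1 locked up, PCPU(s): 1).
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:1 world:6032 name:"vmm0:Windows_2012_-_AD" (V)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:4 world:6089 name:"vmx-vcpu-0:NexentaStor" (U)
'''

if __name__ == "__main__":
    table = parse_pcpu_table(SAMPLE)
    for pcpu in locked_pcpus(SAMPLE):
        print(pcpu, table.get(pcpu))
```

On the sample lines, the PCPU reported as locked up is 1, which the table lists as running `vmm0:Windows_2012_-_AD`, while PCPU 4 (running the NexentaStor world) is the one that detected and reported the problem. Knowing which world was scheduled on a PCPU does not by itself show that the VM caused the lockup.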