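I have an interesting problem that I hope you can help me with. I have an IBM 5014 (equivalent to an LSI 9260-8i) running two RAID10 virtual drives. The first is 4 WD RE4 drives, 2TB each, for a total capacity of 4TB - let's call it VD1. The other is 4 WD RE4-GP drives, 2TB each, for a total capacity of 4TB - let's call it VD0. In case it matters, the card runs in a Norco case with 3 fans (1 on each group of 4 drives, plus 1 on the Gigabyte MB, 16GB RAM and the IBM cards; there is also an IBM5015 running 4 256GB SSDs, also in RAID10). I virtualize with ESXi5.5 and a number of VMs. The 5014 card is passed through to a WHS2011 host, while the 5015 holds the VMs themselves.
VD0 runs fine with no problems at all. It is my main document store.
However, VD1, which holds all my videos, periodically drops one drive, which puts it into a degraded state, and then almost immediately (usually with exactly the same timestamp, but sometimes with a 1 second delay) drops the remaining drives, which takes it offline.
The controller itself has been running fine for almost 6 months, so while the problem could be controller related, I would expect a controller fault to affect both virtual drives, not just one of them.
My challenge is that the drives do not consistently drop in the same order (at least according to the log), so I cannot tell which drive is causing the problem. I have attached an excerpt of the log below. As you can see, it removes the drives and then adds them back again.
Any advice on how to work out which drive is at fault would be very welcome - I can't believe they all failed together, and I can't believe the MSM log itself contains so little information.
Thanks all!
Doug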
ID = 248
SEQUENCE NUMBER = 382617
TIME = 07-07-2015 08:14:46
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 8
ID = 112
SEQUENCE NUMBER = 382616
TIME = 07-07-2015 08:14:46
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:1
ID = 248
SEQUENCE NUMBER = 382615
TIME = 07-07-2015 08:14:45
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 13
ID = 112
SEQUENCE NUMBER = 382614
TIME = 07-07-2015 08:14:45
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:3
ID = 248
SEQUENCE NUMBER = 382613
TIME = 07-07-2015 08:14:44
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 9
ID = 112
SEQUENCE NUMBER = 382612
TIME = 07-07-2015 08:14:44
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:0
ID = 248
SEQUENCE NUMBER = 382611
TIME = 07-07-2015 08:14:44
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 14
ID = 112
SEQUENCE NUMBER = 382610
TIME = 07-07-2015 08:14:44
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:2
ID = 247
SEQUENCE NUMBER = 382609
TIME = 07-07-2015 07:53:09
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 14
ID = 91
SEQUENCE NUMBER = 382608
TIME = 07-07-2015 07:53:09
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: -:-:2
ID = 247
SEQUENCE NUMBER = 382607
TIME = 07-07-2015 07:53:09
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 9
ID = 91
SEQUENCE NUMBER = 382606
TIME = 07-07-2015 07:53:09
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: -:-:0
ID = 247
SEQUENCE NUMBER = 382605
TIME = 07-07-2015 07:53:09
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 8
ID = 91
SEQUENCE NUMBER = 382604
TIME = 07-07-2015 07:53:09
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: -:-:1
ID = 247
SEQUENCE NUMBER = 382603
TIME = 07-07-2015 07:53:04
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 13
ID = 91
SEQUENCE NUMBER = 382602
TIME = 07-07-2015 07:53:04
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: -:-:3
ID = 248
SEQUENCE NUMBER = 382601
TIME = 07-07-2015 07:52:44
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 9
ID = 112
SEQUENCE NUMBER = 382600
TIME = 07-07-2015 07:52:44
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:0
ID = 248
SEQUENCE NUMBER = 382599
TIME = 07-07-2015 07:52:42
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 13
ID = 112
SEQUENCE NUMBER = 382598
TIME = 07-07-2015 07:52:42
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:3
ID = 248
SEQUENCE NUMBER = 382597
TIME = 07-07-2015 07:52:41
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 8
ID = 112
SEQUENCE NUMBER = 382596
TIME = 07-07-2015 07:52:41
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:1
ID = 248
SEQUENCE NUMBER = 382595
TIME = 07-07-2015 07:52:40
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 14
ID = 112
SEQUENCE NUMBER = 382594
TIME = 07-07-2015 07:52:40
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:2
ID = 145
SEQUENCE NUMBER = 382593
TIME = 07-07-2015 07:10:59
LOCALIZED MESSAGE = Controller ID: 0 Battery temperature is high
ID = 149
SEQUENCE NUMBER = 382592
TIME = 07-07-2015 06:56:54
LOCALIZED MESSAGE = Controller ID: 0 Battery temperature is normal
ID = 247
SEQUENCE NUMBER = 382591
TIME = 07-07-2015 04:08:56
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 14
ID = 91
SEQUENCE NUMBER = 382590
TIME = 07-07-2015 04:08:56
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: -:-:2
ID = 247
SEQUENCE NUMBER = 382589
TIME = 07-07-2015 04:08:56
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 9
ID = 91
SEQUENCE NUMBER = 382588
TIME = 07-07-2015 04:08:56
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: -:-:0
ID = 247
SEQUENCE NUMBER = 382587
TIME = 07-07-2015 04:08:55
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 8
ID = 91
SEQUENCE NUMBER = 382586
TIME = 07-07-2015 04:08:55
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: -:-:1
ID = 248
SEQUENCE NUMBER = 382585
TIME = 07-07-2015 04:08:49
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 8
ID = 112
SEQUENCE NUMBER = 382584
TIME = 07-07-2015 04:08:49
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:1
ID = 248
SEQUENCE NUMBER = 382583
TIME = 07-07-2015 04:08:47
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 9
ID = 112
SEQUENCE NUMBER = 382582
TIME = 07-07-2015 04:08:47
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:0
ID = 248
SEQUENCE NUMBER = 382581
TIME = 07-07-2015 04:08:47
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 14
ID = 112
SEQUENCE NUMBER = 382580
TIME = 07-07-2015 04:08:47
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:2
ID = 247
SEQUENCE NUMBER = 382579
TIME = 07-07-2015 03:24:32
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 14
ID = 91
SEQUENCE NUMBER = 382578
TIME = 07-07-2015 03:24:32
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: -:-:2
ID = 247
SEQUENCE NUMBER = 382577
TIME = 07-07-2015 03:24:32
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 13
ID = 91
SEQUENCE NUMBER = 382576
TIME = 07-07-2015 03:24:32
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: -:-:3
ID = 247
SEQUENCE NUMBER = 382575
TIME = 07-07-2015 03:24:32
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 8
ID = 91
SEQUENCE NUMBER = 382574
TIME = 07-07-2015 03:24:32
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: -:-:1
ID = 247
SEQUENCE NUMBER = 382573
TIME = 07-07-2015 03:24:27
LOCALIZED MESSAGE = Controller ID: 0 Device inserted Device Type: Disk Device Id: 9
ID = 91
SEQUENCE NUMBER = 382572
TIME = 07-07-2015 03:24:27
LOCALIZED MESSAGE = Controller ID: 0 PD inserted: -:-:0
ID = 248
SEQUENCE NUMBER = 382571
TIME = 07-07-2015 03:23:36
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 9
ID = 112
SEQUENCE NUMBER = 382570
TIME = 07-07-2015 03:23:36
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:0
ID = 248
SEQUENCE NUMBER = 382569
TIME = 07-07-2015 03:23:36
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 14
ID = 112
SEQUENCE NUMBER = 382568
TIME = 07-07-2015 03:23:36
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:2
ID = 248
SEQUENCE NUMBER = 382567
TIME = 07-07-2015 03:23:36
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 8
ID = 112
SEQUENCE NUMBER = 382566
TIME = 07-07-2015 03:23:36
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:1
ID = 248
SEQUENCE NUMBER = 382565
TIME = 07-07-2015 03:23:36
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 13
ID = 112
SEQUENCE NUMBER = 382564
TIME = 07-07-2015 03:23:36
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:3
ID = 139
SEQUENCE NUMBER = 382435
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 Deleted VD: 1
ID = 114
SEQUENCE NUMBER = 382434
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = -:-:0 Previous = Failed Current = Unconfigured Bad
ID = 114
SEQUENCE NUMBER = 382433
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = -:-:2 Previous = Failed Current = Unconfigured Bad
ID = 114
SEQUENCE NUMBER = 382432
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = -:-:1 Previous = Failed Current = Unconfigured Bad
ID = 114
SEQUENCE NUMBER = 382431
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = -:-:3 Previous = Failed Current = Unconfigured Bad
ID = 114
SEQUENCE NUMBER = 382430
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = -:-:0 Previous = Online Current = Failed
ID = 248
SEQUENCE NUMBER = 382429
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 9
ID = 112
SEQUENCE NUMBER = 382428
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:0
ID = 252
SEQUENCE NUMBER = 382427
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 VD is now OFFLINE VD 1
ID = 81
SEQUENCE NUMBER = 382426
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 State change on VD: 1 Previous = Degraded Current = Offline
ID = 114
SEQUENCE NUMBER = 382425
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = -:-:2 Previous = Online Current = Failed
ID = 248
SEQUENCE NUMBER = 382424
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 14
ID = 112
SEQUENCE NUMBER = 382423
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:2
ID = 114
SEQUENCE NUMBER = 382422
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = -:-:1 Previous = Online Current = Failed
ID = 248
SEQUENCE NUMBER = 382421
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 8
ID = 112
SEQUENCE NUMBER = 382420
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:1
ID = 251
SEQUENCE NUMBER = 382419
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 VD is now DEGRADED VD 1
ID = 81
SEQUENCE NUMBER = 382418
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 State change on VD: 1 Previous = Optimal Current = Degraded
ID = 114
SEQUENCE NUMBER = 382417
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 State change: PD = -:-:3 Previous = Online Current = Failed
ID = 248
SEQUENCE NUMBER = 382416
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 13
ID = 112
SEQUENCE NUMBER = 382415
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:3
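Because the excerpt is listed newest-first and mixes two messages per drive, the drop order is hard to compare by eye. Below is a minimal sketch that pulls out only the `PD removed` records, sorts them by sequence number, and groups them into incidents. Several things here are assumptions rather than anything MSM provides: the function names, the 10 second gap used to separate incidents, and the day-month-year reading of the timestamps (the excerpt alone does not show which way round the date is). `SAMPLE` is just a handful of records copied from the excerpt above, not the whole log.

```python
import re
from datetime import datetime, timedelta

# One record = SEQUENCE NUMBER, TIME and LOCALIZED MESSAGE on consecutive lines.
RECORD_RE = re.compile(
    r"SEQUENCE NUMBER = (\d+)\s+"
    r"TIME = (\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\s+"
    r"LOCALIZED MESSAGE = (.*)"
)
PD_REMOVED_RE = re.compile(r"PD removed: (\S+)")


def parse_removals(log_text, time_format="%d-%m-%Y %H:%M:%S"):
    """Return (sequence, time, slot) for every 'PD removed' record, oldest first.

    time_format is an assumption; switch to "%m-%d-%Y %H:%M:%S" if needed.
    """
    events = []
    for seq, stamp, message in RECORD_RE.findall(log_text):
        match = PD_REMOVED_RE.search(message)
        if match:
            when = datetime.strptime(stamp, time_format)
            events.append((int(seq), when, match.group(1)))
    return sorted(events)


def group_bursts(events, max_gap=timedelta(seconds=10)):
    """Group removals into incidents; a gap above max_gap starts a new one."""
    bursts = []
    for event in events:
        if bursts and event[1] - bursts[-1][-1][1] <= max_gap:
            bursts[-1].append(event)
        else:
            bursts.append([event])
    return bursts


def first_drop_counts(bursts):
    """Count how often each slot was the first one removed in an incident."""
    counts = {}
    for burst in bursts:
        slot = burst[0][2]
        counts[slot] = counts.get(slot, 0) + 1
    return counts


SAMPLE = """\
ID = 112
SEQUENCE NUMBER = 382612
TIME = 07-07-2015 08:14:44
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:0
ID = 248
SEQUENCE NUMBER = 382611
TIME = 07-07-2015 08:14:44
LOCALIZED MESSAGE = Controller ID: 0 Device removed Device Type: Disk Device Id: 14
ID = 112
SEQUENCE NUMBER = 382610
TIME = 07-07-2015 08:14:44
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:2
ID = 112
SEQUENCE NUMBER = 382596
TIME = 07-07-2015 07:52:41
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:1
ID = 112
SEQUENCE NUMBER = 382594
TIME = 07-07-2015 07:52:40
LOCALIZED MESSAGE = Controller ID: 0 PD removed: -:-:2
"""

if __name__ == "__main__":
    bursts = group_bursts(parse_removals(SAMPLE))
    for burst in bursts:
        order = ", ".join(slot for _, _, slot in burst)
        print(burst[0][1].isoformat(sep=" "), "drop order:", order)
    print("first to drop:", first_drop_counts(bursts))
```

Run against the full exported log instead of `SAMPLE`, this gives one line per incident, which makes it easier to see whether one slot tends to go first. A slot that leads most incidents is only a hint about where to start swapping cables or drives, not proof that the drive in that slot is bad.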
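Answer 1
Sorry, I have not run into exactly this situation, but we use LSI controllers and had similar problems that were resolved by a firmware update. Please check that your devices are running the latest firmware.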