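The Nagios RAID check (check_raid) is broken on a new Cisco C240 M5 with a Cisco 12G Modular Raid Controller (SAS3516).
For now I have modified check_raid to skip the volume check and only check the physical drives, but ideally we would like check_raid to check both. Has anyone found a way around this? Or should I drop check_raid and use a wrapper script around megaclisas-status --nagios, accepting the (cosmetic?) problem with the virtual drive size figures?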
The box has 14 disks:
- 2 x 480 GB Micron_5300_MTFDDAK480TDS in RAID1
- 12 x 8 TB TOSHIBA MG06SCA800A in RAID10, 2 of which are hot spares
The error:
CRITICAL: megacli:[Volumes(2): DISK0.0:,DISK1.1:; Devices(14): 00,11=Hotspare 03,09,06,12,05,01,08,10,07,04,13,02=Online]
It turns out the megacli command output is missing data:
# /usr/sbin/megacli -LdInfo -Lall -aall
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Virtual Drive: 1 (Target Id: 1)
Exit Code: 0x00
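To make the missing data easier to spot in a script, here is a minimal sketch that lists the virtual drives for which the -LdInfo output contains no state. It assumes that a healthy controller prints a "State : ..." line under each "Virtual Drive:" header (as the ShowSummary output in this post does); the function name volumes_without_state is made up for illustration and is not part of check_raid or megacli.

```python
import re

# Output copied from the post: headers only, no per-volume fields.
SAMPLE = """\
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Virtual Drive: 1 (Target Id: 1)
Exit Code: 0x00
"""

VD_HEADER = re.compile(r"^Virtual Drive:\s*(\d+)")
# Assumption: complete output carries a "State : <value>" line per volume.
STATE_LINE = re.compile(r"^State\s*:\s*(\S+)")


def volumes_without_state(ldinfo_output):
    """Return the IDs of virtual drives that have no State line."""
    states = {}
    current = None
    for raw in ldinfo_output.splitlines():
        line = raw.strip()
        header = VD_HEADER.match(line)
        if header:
            current = int(header.group(1))
            states[current] = None
            continue
        state = STATE_LINE.match(line)
        if state and current is not None:
            states[current] = state.group(1)
    return [vd for vd, state in states.items() if state is None]


print(volumes_without_state(SAMPLE))
```

A check could treat a non-empty result as UNKNOWN rather than CRITICAL, since the controller is not reporting a failure, only withholding the details.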
A hint from another user:
# /usr/sbin/megacli ShowSummary -Aall
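Another problem is that the Toshiba MG06SCA800A 8 TB disks are reported with a capacity of 931.436 GB. As a result the virtual drive total is also way off: it is reported as 4 TB instead of 36 TB.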
[...]
Connector : Port 12 - 15<Internal>: Slot 6
Vendor Id : TOSHIBA
Product Id : MG06SCA800A
State : Online
Disk Type : SAS,Hard Disk Device
Capacity : 931.436 GB
Power State : Active
Virtual drive:
Virtual drive : Target Id 1 ,VD name RAID10_12345678
Size : 4.547 TB
State : Optimal
RAID Level : 10
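The numbers above are at least internally consistent, which the following arithmetic sketch tries to show. It assumes binary units (1 TB = 1024 GB, which matches an 8 TB disk showing up as 7.276 TB in megaclisas-status) and 10 data disks in the RAID10 (12 minus 2 hot spares). The per-disk figure is off by almost exactly a factor of 8, which happens to equal 4096/512; one possible explanation is that sectors are being counted as 512 bytes on a drive with 4096-byte sectors, but that is a guess and not something the output confirms.

```python
GB_PER_TB = 1024  # assumption: the tools report binary units

reported_disk_gb = 931.436  # per-disk capacity from ShowSummary
actual_disk_tb = 7.276      # per-disk size from megaclisas-status
data_disks = 12 - 2         # RAID10 members, excluding the 2 hot spares
mirrored_pairs = data_disks // 2

# RAID10 usable capacity is one disk per mirrored pair.
reported_vd_tb = mirrored_pairs * reported_disk_gb / GB_PER_TB
expected_vd_tb = mirrored_pairs * actual_disk_tb

# How far off is the per-disk capacity?
ratio = actual_disk_tb * GB_PER_TB / reported_disk_gb

print(f"VD size from reported disks: {reported_vd_tb:.2f} TB")
print(f"VD size from actual disks:   {expected_vd_tb:.2f} TB")
print(f"actual / reported per disk:  {ratio:.2f}")
```

So the 4.547 TB virtual drive size is what you get from five of the wrongly sized disks, and the expected size is roughly 36 TB.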
The same goes for megaclisas-status, since it uses megacli. The disk information is correct, but the array information is missing:
# megaclisas-status --nagios
RAID OK - Arrays: OK:2 Bad:0 - Disks: OK:14 Bad:0
# megaclisas-status
-- Controller information --
-- ID | H/W Model | RAM | Temp | BBU | Firmware
c0 | Cisco 12G Modular Raid Controller with 2GB cache (max 16 drives) | 2048MB | 67C | Good | FW: 51.10.0-3612
-- Array information --
-- ID | Type | Size | Strpsz | Flags | DskCache | Status | OS Path | CacheCade |InProgress
c0u0 | N/A | | | | N/A | N/A | /dev/sda | None |None
c0u1 | N/A | | | | N/A | N/A | /dev/sdb | None |None
-- Disk information --
-- ID | Type | Drive Model | Size | Status | Speed | Temp | Slot ID | LSI ID
c0u0p0 | SSD | 202529BD123BMicron_5300_MTFDDAK480TDS D3MC000 | 446.1 Gb | Online, Spun Up | 6.0Gb/s | 31C | [134:13] | 13
c0u0p1 | SSD | 202529BCC1C0Micron_5300_MTFDDAK480TDS D3MC000 | 446.1 Gb | Online, Spun Up | 6.0Gb/s | 32C | [134:14] | 2
c0u1p0 | HDD | TOSHIBA MG06SCA800A 5701Z020A0FSFRJG | 7.276 TB | Online, Spun Up | 12.0Gb/s | 26C | [134:1] | 3
c0u1p1 | HDD | TOSHIBA MG06SCA800A 5701Z020A05RFRJG | 7.276 TB | Online, Spun Up | 12.0Gb/s | 26C | [134:2] | 9
c0u1p0 | HDD | TOSHIBA MG06SCA800A 5701Z020A0FRFRJG | 7.276 TB | Online, Spun Up | 12.0Gb/s | 27C | [134:3] | 6
c0u1p1 | HDD | TOSHIBA MG06SCA800A 5701Z020A0E7FRJG | 7.276 TB | Online, Spun Up | 12.0Gb/s | 26C | [134:4] | 12
c0u1p0 | HDD | TOSHIBA MG06SCA800A 5701Z020A05UFRJG | 7.276 TB | Online, Spun Up | 12.0Gb/s | 26C | [134:5] | 5
c0u1p1 | HDD | TOSHIBA MG06SCA800A 5701Z020A052FRJG | 7.276 TB | Online, Spun Up | 12.0Gb/s | 26C | [134:6] | 1
c0u1p0 | HDD | TOSHIBA MG06SCA800A 5701Z020A0EGFRJG | 7.276 TB | Online, Spun Up | 12.0Gb/s | 26C | [134:7] | 8
c0u1p1 | HDD | TOSHIBA MG06SCA800A 5701Z020A0FEFRJG | 7.276 TB | Online, Spun Up | 12.0Gb/s | 26C | [134:8] | 10
c0u1p0 | HDD | TOSHIBA MG06SCA800A 5701Z020A04AFRJG | 7.276 TB | Online, Spun Up | 12.0Gb/s | 26C | [134:9] | 7
c0u1p1 | HDD | TOSHIBA MG06SCA800A 5701Z020A0KGFRJG | 7.276 TB | Online, Spun Up | 12.0Gb/s | 26C | [134:10] | 4
-- Unconfigured Disk information --
-- ID | Type | Drive Model | Size | Status | Speed | Temp | Slot ID | LSI ID | Path
c0uXpY | HDD | TOSHIBA MG06SCA800A 5701Z020A0EFFRJG | 7.276 TB | Hotspare, Spun down | 12.0Gb/s | 26C | [134:11] | 0 | N/A
c0uXpY | HDD | TOSHIBA MG06SCA800A 5701Z020A0DRFRJG | 7.276 TB | Hotspare, Spun down | 12.0Gb/s | 26C | [134:12] | 11 | N/A
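For the wrapper option mentioned at the top, a rough sketch could look like the following. It only relies on the one-line summary format seen in this post ("Arrays: OK:2 Bad:0 - Disks: OK:14 Bad:0"); other versions of megaclisas-status may print something different. The expected_arrays and expected_disks parameters are hypothetical additions so that a silently missing array or disk is flagged, and run_check is not exercised here because it needs the real tool.

```python
import re
import subprocess

# Format taken from the single sample line in this post.
SUMMARY_RE = re.compile(
    r"Arrays: OK:(\d+) Bad:(\d+) - Disks: OK:(\d+) Bad:(\d+)"
)


def parse_summary(line):
    """Return the four counters as a dict, or None if the line is unparseable."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    keys = ("arrays_ok", "arrays_bad", "disks_ok", "disks_bad")
    return dict(zip(keys, map(int, match.groups())))


def nagios_status(line, expected_arrays, expected_disks):
    """Map a summary line to a (Nagios exit code, message) pair."""
    summary = parse_summary(line)
    if summary is None:
        return 3, "UNKNOWN: could not parse: " + line
    if summary["arrays_bad"] or summary["disks_bad"]:
        return 2, "CRITICAL: " + line
    if (summary["arrays_ok"] != expected_arrays
            or summary["disks_ok"] != expected_disks):
        return 1, "WARNING: unexpected counts: " + line
    return 0, "OK: " + line


def run_check(expected_arrays=2, expected_disks=14):
    """Run megaclisas-status --nagios and classify its output (needs Python 3.7+)."""
    result = subprocess.run(
        ["megaclisas-status", "--nagios"],
        capture_output=True, text=True,
    )
    return nagios_status(result.stdout.strip(), expected_arrays, expected_disks)
```

Note that this inherits the limitation shown above: the array status columns are N/A on this controller, so the "Arrays: OK:2" count may not reflect real volume health.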
storcli actually shows the correct numbers, but it is not in the hwraid.le-vert.net repo, so it is off limits for us.