We have the following setup in a Supermicro server:
- LSI 9400 -> expander -> 10 x HDD
- LSI 9500 -> expander -> 2 x NVMe
|------------|      |--------------|       |------------|
|  LSI 9400  | ---->|              | ----->|  HDD x 10  |
|------------|      |              |       |------------|
                    |   Expander   |
|------------|      |              |       |------------|
|  LSI 9500  | ---->|              | ----->| NVMe Intel |
|------------|      |--------------|   |   |------------|
                                       |
                                       |   |------------|
                                       |-->| NVMe Intel |
                                           |------------|
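We can blink any bay that holds an HDD without problems, but blinking an NVMe bay has no effect.
I would like to achieve either of the following two solutions:
- Best solution: blink the bay that holds an NVMe drive attached to the 9500 tri-mode controller.
- Alternative solution: find a link, value, or piece of information that lets me correlate an NVMe drive with a physical port on the LSI 9500 controller. I am thinking of something like "look at the file /<some_path>/<some_file> and you will find the port ID there". More complex correlations are welcome too. If we have to correlate several values, that is fine.
OS: Rocky Linux, fully under our control; we can do anything on it without restriction. Server configuration: the host runs ESXi, and both controllers are passed through to the Rocky Linux VM.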
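Basically, what I want to know is: if a drive fails, which one should be pulled? The answer could be a blinking LED, or a command whose output says "top drive" or something similar.
So far I have done the following investigation and experiments.
- Tried blinking with ledctl -> no error, no blinking.
- Tried blinking with sg_ses -> no error, no blinking. Here are some of the commands, with the output trimmed to leave out the remaining disks.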
[root@echo-development ~]# lsscsi -g
[1:0:0:0] enclosu BROADCOM VirtualSES 03 - /dev/sg2
[1:2:0:0] disk NVMe INTEL SSDPE2KX01 01B1 /dev/sdb /dev/sg3
[1:2:1:0] disk NVMe INTEL SSDPE2KX02 0131 /dev/sdc /dev/sg4
[root@echo-development ~]# sg_ses -vvv --dsn=0 --set=ident /dev/sg2
open /dev/sg2 with flags=0x802
request sense cmd: 03 00 00 00 fc 00
duration=0 ms
request sense: pass-through requested 252 bytes (data-in) but got 18 bytes
Request Sense near startup detected something:
Sense key: No Sense, additional: Additional sense: No additional sense information
... continue
Receive diagnostic results command for Configuration (SES) dpage
Receive diagnostic results cdb: 1c 01 01 ff fc 00
duration=0 ms
Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 60 bytes
Receive diagnostic results: response:
01 00 00 38 00 00 00 00 11 00 02 24 30 01 62 b2
07 eb 55 80 42 52 4f 41 44 43 4f 4d 56 69 72 74
75 61 6c 53 45 53 00 00 00 00 00 00 30 33 00 00
17 28 00 00 19 08 00 00 00 00 00 00
Receive diagnostic results command for Enclosure Status (SES) dpage
Receive diagnostic results cdb: 1c 01 02 ff fc 00
duration=0 ms
Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 208 bytes
Receive diagnostic results: response:
02 00 00 cc 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00
Receive diagnostic results command for Element Descriptor (SES) dpage
Receive diagnostic results cdb: 1c 01 07 ff fc 00
duration=0 ms
Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 432 bytes
Receive diagnostic results: response, first 256 bytes:
07 00 01 ac 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 1c 43 30 2e 30 00 00 00 00 00 00 00 00
00 00 00 00 4e 4f 42 50 4d 47 4d 54 00 00 00 00
00 00 00 1c 43 30 2e 30 00 00 00 00 00 00 00 00
00 00 00 00 4e 4f 42 50 4d 47 4d 54 00 00 00 00
00 00 00 1c 43 30 2e 30 00 00 00 00 00 00 00 00
Receive diagnostic results command for Additional Element Status (SES-2) dpage
Receive diagnostic results cdb: 1c 01 0a ff fc 00
duration=0 ms
Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 1448 bytes
Receive diagnostic results: response, first 256 bytes:
0a 00 05 a4 00 00 00 00 16 22 00 00 01 00 00 04
10 00 00 08 50 00 62 b2 07 eb 55 80 3c d2 e4 a6
23 29 01 00 00 00 00 00 00 00 00 00 96 22 00 01
01 00 00 ff 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
96 22 00 02 01 00 00 ff 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 96 22 00 03 01 00 00 ff 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 16 22 00 04 01 00 00 06
10 00 00 08 50 00 62 b2 07 eb 55 84 3c d2 e4 99
70 1d 01 00 00 00 00 00 00 00 00 00 96 22 00 05
01 00 00 ff 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
96 22 00 06 01 00 00 ff 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
s_byte=2, s_bit=1, n_bits=1
Applying mask to element status [etc=23] prior to modify then write
Send diagnostic command page name: Enclosure Control (SES)
Send diagnostic cdb: 1d 10 00 00 d0 00
Send diagnostic parameter list:
02 00 00 cc 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 80 00 02 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00
Send diagnostic timeout: 60 seconds
duration=0 ms
[root@echo-development ~]# sg_ses -vvv --dsn=6 --set=ident /dev/sg2
open /dev/sg2 with flags=0x802
request sense cmd: 03 00 00 00 fc 00
duration=0 ms
request sense: pass-through requested 252 bytes (data-in) but got 18 bytes
Request Sense near startup detected something:
Sense key: No Sense, additional: Additional sense: No additional sense information
... continue
Receive diagnostic results command for Configuration (SES) dpage
Receive diagnostic results cdb: 1c 01 01 ff fc 00
duration=0 ms
Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 60 bytes
Receive diagnostic results: response:
01 00 00 38 00 00 00 00 11 00 02 24 30 01 62 b2
07 eb 55 80 42 52 4f 41 44 43 4f 4d 56 69 72 74
75 61 6c 53 45 53 00 00 00 00 00 00 30 33 00 00
17 28 00 00 19 08 00 00 00 00 00 00
Receive diagnostic results command for Enclosure Status (SES) dpage
Receive diagnostic results cdb: 1c 01 02 ff fc 00
duration=0 ms
Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 208 bytes
Receive diagnostic results: response:
02 00 00 cc 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00
Receive diagnostic results command for Element Descriptor (SES) dpage
Receive diagnostic results cdb: 1c 01 07 ff fc 00
duration=0 ms
Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 432 bytes
Receive diagnostic results: response, first 256 bytes:
07 00 01 ac 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 1c 43 30 2e 30 00 00 00 00 00 00 00 00
00 00 00 00 4e 4f 42 50 4d 47 4d 54 00 00 00 00
00 00 00 1c 43 30 2e 30 00 00 00 00 00 00 00 00
00 00 00 00 4e 4f 42 50 4d 47 4d 54 00 00 00 00
00 00 00 1c 43 30 2e 30 00 00 00 00 00 00 00 00
Receive diagnostic results command for Additional Element Status (SES-2) dpage
Receive diagnostic results cdb: 1c 01 0a ff fc 00
duration=0 ms
Receive diagnostic results: pass-through requested 65532 bytes (data-in) but got 1448 bytes
Receive diagnostic results: response, first 256 bytes:
0a 00 05 a4 00 00 00 00 16 22 00 00 01 00 00 04
10 00 00 08 50 00 62 b2 07 eb 55 80 3c d2 e4 a6
23 29 01 00 00 00 00 00 00 00 00 00 96 22 00 01
01 00 00 ff 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
96 22 00 02 01 00 00 ff 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 96 22 00 03 01 00 00 ff 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 16 22 00 04 01 00 00 06
10 00 00 08 50 00 62 b2 07 eb 55 84 3c d2 e4 99
70 1d 01 00 00 00 00 00 00 00 00 00 96 22 00 05
01 00 00 ff 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
96 22 00 06 01 00 00 ff 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
s_byte=2, s_bit=1, n_bits=1
Applying mask to element status [etc=23] prior to modify then write
Send diagnostic command page name: Enclosure Control (SES)
Send diagnostic cdb: 1d 10 00 00 d0 00
Send diagnostic parameter list:
02 00 00 cc 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 80 00 02 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00
01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00
Send diagnostic timeout: 60 seconds
duration=0 ms
We used dsn=0 and dsn=6 because it looks like the end devices are attached to those two ports (output trimmed to the relevant results):
[root@echo-development ~]# sg_ses -j /dev/sg2
BROADCOM VirtualSES 03
Primary enclosure logical identifier (hex): 300162b207eb5580
[0,-1] Element type: Array device slot
Enclosure Status:
Predicted failure=0, Disabled=0, Swap=0, status: Unsupported
OK=0, Reserved device=0, Hot spare=0, Cons check=0
In crit array=0, In failed array=0, Rebuild/remap=0, R/R abort=0
App client bypass A=0, Do not remove=0, Enc bypass A=0, Enc bypass B=0
Ready to insert=0, RMV=0, Ident=0, Report=0
App client bypass B=0, Fault sensed=0, Fault reqstd=0, Device off=0
Bypassed A=0, Bypassed B=0, Dev bypassed A=0, Dev bypassed B=0
[0,0] Element type: Array device slot
Enclosure Status:
Predicted failure=0, Disabled=0, Swap=1, status: Unsupported
OK=0, Reserved device=0, Hot spare=0, Cons check=0
In crit array=0, In failed array=0, Rebuild/remap=0, R/R abort=0
App client bypass A=0, Do not remove=0, Enc bypass A=0, Enc bypass B=0
Ready to insert=0, RMV=0, Ident=0, Report=0
App client bypass B=0, Fault sensed=0, Fault reqstd=0, Device off=0
Bypassed A=0, Bypassed B=0, Dev bypassed A=0, Dev bypassed B=0
Additional Element Status:
Transport protocol: SAS
number of phys: 1, not all phys: 0, device slot number: 4
phy index: 0
SAS device type: end device
initiator port for:
target port for: SSP
attached SAS address: 0x500062b207eb5580
SAS address: 0x3cd2e4dd23290100
phy identifier: 0x0
[0,4] Element type: Array device slot
Enclosure Status:
Predicted failure=0, Disabled=0, Swap=1, status: Unsupported
OK=0, Reserved device=0, Hot spare=0, Cons check=0
In crit array=0, In failed array=0, Rebuild/remap=0, R/R abort=0
App client bypass A=0, Do not remove=0, Enc bypass A=0, Enc bypass B=0
Ready to insert=0, RMV=0, Ident=0, Report=0
App client bypass B=0, Fault sensed=0, Fault reqstd=0, Device off=0
Bypassed A=0, Bypassed B=0, Dev bypassed A=0, Dev bypassed B=0
Additional Element Status:
Transport protocol: SAS
number of phys: 1, not all phys: 0, device slot number: 6
phy index: 0
SAS device type: end device
initiator port for:
target port for: SSP
attached SAS address: 0x500062b207eb5584
SAS address: 0x3cd2e4a623290100
phy identifier: 0x0
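The SAS address listed for each slot in the sg_ses output above (for example, SAS address: 0x3cd2e4a623290100) should also be found on the drive itself (NVMe, SSD, HDD, etc.). At least that is what I understand from the sg_ses documentation and from blog and forum posts on the internet. But the SAS address on the NVMe drive is different, and the SAS address reported by the controller cannot be found on any device.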
[root@echo-development ~]# cat "/sys/bus/pci/devices/0000:04:00.0/host1/target1:2:1/1:2:1:0/sas_address"
0x00012923a6e4d25c
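- Relying on HCTL -> does not work, because the HCTL changes when I remove a drive and reinsert it into its bay. It also resets to 1:2:0:0 and 1:2:1:0 on reboot.
- Correlating with the port on the controller via /sys/bus/pci/devices/0000:04:00.0/host1/target1:2:1/1:2:1:0/sas_device_handle -> does not work; the value is incremented every time a device is removed and reinserted.
- Tried to find any other correlation between the NVMe drives and the controller ports -> I could not find one.
If you need any other information, or if there is something else I can try, please let me know.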
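One arithmetic observation about the two values quoted in the question, offered as a sketch rather than a documented rule: reversing the byte order of the sysfs `sas_address` (0x00012923a6e4d25c) gives a value that matches the sg_ses address 0x3cd2e4a623290100 in its low seven bytes and differs only in the top byte. Whether the controller really derives its virtual SAS address this way is an assumption that nothing in the post confirms; the helper names below are illustrative.

```python
def byteswap64(value):
    """Reverse the byte order of a 64-bit integer."""
    return int.from_bytes(value.to_bytes(8, "big"), "little")


def low_bytes_match(a, b, nbytes=7):
    """True if the lowest `nbytes` bytes of a and b are identical."""
    mask = (1 << (8 * nbytes)) - 1
    return (a & mask) == (b & mask)


ses_addr = 0x3cd2e4a623290100    # "SAS address" from the sg_ses -j output
sysfs_addr = 0x00012923a6e4d25c  # content of .../1:2:1:0/sas_address

swapped = byteswap64(sysfs_addr)
print(hex(swapped))                        # byte-reversed sysfs value
print(hex(swapped ^ ses_addr))             # bits that differ
print(low_bytes_match(swapped, ses_addr))  # low seven bytes compared
```

If the pattern holds across reinsertions and reboots, comparing the low seven bytes could serve as a stable key between a block device and a slot number; that would need to be verified on the actual hardware before relying on it.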
Answer 1
Broadcom's storcli utility now works with controllers in IT/HBA mode, and it can identify the relationship between devices and physical ports.
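As a rough sketch of how that might be scripted, the helper below only builds storcli command lines; it does not run them. The binary name `storcli64`, the controller index 0, and the enclosure/slot pair 0/4 are assumptions for illustration; the real enclosure ID and slot have to be read from the drive listing on the actual system. The answer does not say whether `start locate` lights the LED on the NVMe bays, so treat that part as something to test.

```python
import shlex

# Assumed binary name; it may be installed as "storcli" or under a full path.
STORCLI = "storcli64"


def list_drives_cmd(controller=0):
    """Command that lists every drive with its enclosure ID and slot."""
    return [STORCLI, f"/c{controller}/eall/sall", "show"]


def locate_cmd(controller, enclosure, slot, on=True):
    """Command that starts or stops the locate LED for one slot."""
    action = "start" if on else "stop"
    return [STORCLI, f"/c{controller}/e{enclosure}/s{slot}", action, "locate"]


# Hypothetical values: controller 0, enclosure 0, slot 4.
print(shlex.join(list_drives_cmd(0)))
print(shlex.join(locate_cmd(0, 0, 4)))
print(shlex.join(locate_cmd(0, 0, 4, on=False)))
```

Each returned list can be passed to `subprocess.run(..., check=True)` as root once the enclosure and slot numbers have been confirmed against the listing.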