我使用的是 SuperMicro X9SRi-3F 主板,该主板采用 Intel C606 芯片组,并具有带 SAS 功能的双 4 端口 SCU。我使用的硬盘是 4 个 WD Re (WD6001F9YZ) 和 2 个 WD Gold (WD6002FRYZ)。全部容量均为 6TB。
WD6001F9YZ 连接到端口 0-3,WD6002FRYZ 连接到端口 4 和 5(应注意,我最终将在端口 6 和 7 上添加两个 WD6001F9YZ)。
操作系统安装在连接到主板上普通 SATA 2.0 端口之一的 1TB 驱动器上。
CentOS 7 可以识别所有驱动器,lsblk
输出如下:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 5.5T 0 disk
sdb 8:16 0 5.5T 0 disk
sdc 8:32 0 5.5T 0 disk
sdd 8:48 0 5.5T 0 disk
sde 8:64 0 5.5T 0 disk
sdf 8:80 0 5.5T 0 disk
它们也都出现在/dev/disk/by-id
:
lrwxrwxrwx. 1 root root 9 Oct 17 14:06 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA58PYRS -> ../../sda
lrwxrwxrwx. 1 root root 9 Oct 17 14:06 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LV7KR -> ../../sdb
lrwxrwxrwx. 1 root root 9 Oct 17 14:06 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LV87E -> ../../sdc
lrwxrwxrwx. 1 root root 9 Oct 17 14:06 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LVL4J -> ../../sdd
lrwxrwxrwx. 1 root root 9 Oct 17 14:06 ata-WDC_WD6002FRYZ-01WD5B0_K1HNL2KD -> ../../sde
lrwxrwxrwx. 1 root root 9 Oct 17 14:06 ata-WDC_WD6002FRYZ-01WD5B0_K1JMY3ND -> ../../sdf
但是,只有驱动器 cf 出现在/dev/disk/by-path
:
lrwxrwxrwx. 1 root root 9 Oct 17 13:42 pci-0000:03:00.0-sas-0x500304801349fe00-lun-0 -> ../../sde
lrwxrwxrwx. 1 root root 9 Oct 17 13:42 pci-0000:03:00.0-sas-0x500304801349fe01-lun-0 -> ../../sdf
lrwxrwxrwx. 1 root root 9 Oct 17 13:42 pci-0000:03:00.0-sas-0x500304801349fe02-lun-0 -> ../../sdc
lrwxrwxrwx. 1 root root 9 Oct 17 13:42 pci-0000:03:00.0-sas-0x500304801349fe03-lun-0 -> ../../sdd
lrwxrwxrwx. 1 root root 9 Oct 17 13:42 pci-0000:03:00.0-sas-phy0-lun-0 -> ../../sde
lrwxrwxrwx. 1 root root 9 Oct 17 13:42 pci-0000:03:00.0-sas-phy1-lun-0 -> ../../sdf
lrwxrwxrwx. 1 root root 9 Oct 17 13:42 pci-0000:03:00.0-sas-phy2-lun-0 -> ../../sdc
lrwxrwxrwx. 1 root root 9 Oct 17 13:42 pci-0000:03:00.0-sas-phy3-lun-0 -> ../../sdd
在MB的BIOS中,当启用SCU单元并启用“SCU RAID Option ROM/UEFI驱动程序”时,所有驱动器都会被识别:
当启用 SCU 单元但禁用选项 ROM 时,驱动器不会显示在 BIOS 中:
两种设置(选项 ROM 禁用/启用)都会导致相同的问题。当然,禁用 SCU 意味着它根本不会出现在操作系统中(从 lspci 的输出中消失)
的输出lshw -c storage
看起来像这样:
*-sas
description: Serial Attached SCSI controller
product: C606 chipset Dual 4-Port SATA/SAS Storage Control Unit
vendor: Intel Corporation
physical id: 0
bus info: pci@0000:03:00.0
logical name: scsi6
logical name: scsi7
version: 06
width: 64 bits
clock: 33MHz
capabilities: sas pm pciexpress msix bus_master cap_list
configuration: driver=isci latency=0
resources: irq:49 memory:fa8f8000-fa8fffff memory:fa000000-fa7fffff ioport:e100(size=256) ioport:e000(size=256) memory:fa800000-fa8f7fff
这对我来说是一个问题,因为我打算设置一个 8 磁盘 ZFS 阵列,并希望引用系统中驱动器的物理位置(插槽 1、插槽 2 等)。
我的直觉告诉我这是 C606 的内核驱动程序的问题,但老实说我不知道。
[编辑]
如果我热插拔打算在端口 6 和 7 上使用的其他 SATA 驱动器,/dev/disk/by-path
则如下所示:
lrwxrwxrwx. 1 root root 9 Oct 17 14:20 pci-0000:03:00.0-sas-0x5fcfffff00000001-lun-0 -> ../../sda
lrwxrwxrwx. 1 root root 9 Oct 17 14:20 pci-0000:03:00.0-sas-0x5fcfffff00000002-lun-0 -> ../../sde
lrwxrwxrwx. 1 root root 9 Oct 17 14:20 pci-0000:03:00.0-sas-0x5fcfffff00000003-lun-0 -> ../../sdc
lrwxrwxrwx. 1 root root 9 Oct 17 14:24 pci-0000:03:00.0-sas-0x5fcfffff00000004-lun-0 -> ../../sdi
lrwxrwxrwx. 1 root root 10 Oct 17 14:24 pci-0000:03:00.0-sas-0x5fcfffff00000004-lun-0-part1 -> ../../sdi1
lrwxrwxrwx. 1 root root 10 Oct 17 14:24 pci-0000:03:00.0-sas-0x5fcfffff00000004-lun-0-part2 -> ../../sdi2
lrwxrwxrwx. 1 root root 9 Oct 17 14:24 pci-0000:03:00.0-sas-0x5fcfffff00000005-lun-0 -> ../../sdj
lrwxrwxrwx. 1 root root 10 Oct 17 14:24 pci-0000:03:00.0-sas-0x5fcfffff00000005-lun-0-part1 -> ../../sdj1
lrwxrwxrwx. 1 root root 10 Oct 17 14:24 pci-0000:03:00.0-sas-0x5fcfffff00000005-lun-0-part2 -> ../../sdj2
lrwxrwxrwx. 1 root root 9 Oct 17 14:20 pci-0000:03:00.0-sas-phy0-lun-0 -> ../../sde
lrwxrwxrwx. 1 root root 9 Oct 17 14:20 pci-0000:03:00.0-sas-phy1-lun-0 -> ../../sdb
lrwxrwxrwx. 1 root root 9 Oct 17 14:24 pci-0000:03:00.0-sas-phy2-lun-0 -> ../../sdi
lrwxrwxrwx. 1 root root 10 Oct 17 14:24 pci-0000:03:00.0-sas-phy2-lun-0-part1 -> ../../sdi1
lrwxrwxrwx. 1 root root 10 Oct 17 14:24 pci-0000:03:00.0-sas-phy2-lun-0-part2 -> ../../sdi2
lrwxrwxrwx. 1 root root 9 Oct 17 14:24 pci-0000:03:00.0-sas-phy3-lun-0 -> ../../sdj
lrwxrwxrwx. 1 root root 10 Oct 17 14:24 pci-0000:03:00.0-sas-phy3-lun-0-part1 -> ../../sdj1
lrwxrwxrwx. 1 root root 10 Oct 17 14:24 pci-0000:03:00.0-sas-phy3-lun-0-part2 -> ../../sdj2
之前以 开头的路径部分sas-0x500304801349fe##
已更改为,sas-0x5fcfffff000000##
因为我已在 BIOS 中禁用了选项 ROM。但有趣的是,sdd
和sdf
已经从目录中消失了by-path
,但是sda
、sdi
、 和sdj
却出现了。此外,顺序也发生了变化。
当然,所有八个驱动器都显示在by-id
目录中:
lrwxrwxrwx. 1 root root 9 Oct 17 14:20 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA58PYRS -> ../../sda
lrwxrwxrwx. 1 root root 9 Oct 17 14:20 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LV7KR -> ../../sdb
lrwxrwxrwx. 1 root root 9 Oct 17 14:20 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LV87E -> ../../sdc
lrwxrwxrwx. 1 root root 9 Oct 17 14:24 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LVJK2 -> ../../sdj
lrwxrwxrwx. 1 root root 10 Oct 17 14:24 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LVJK2-part1 -> ../../sdj1
lrwxrwxrwx. 1 root root 10 Oct 17 14:24 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LVJK2-part2 -> ../../sdj2
lrwxrwxrwx. 1 root root 9 Oct 17 14:20 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LVL4J -> ../../sdd
lrwxrwxrwx. 1 root root 9 Oct 17 14:24 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LVNXX -> ../../sdi
lrwxrwxrwx. 1 root root 10 Oct 17 14:24 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LVNXX-part1 -> ../../sdi1
lrwxrwxrwx. 1 root root 10 Oct 17 14:24 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LVNXX-part2 -> ../../sdi2
lrwxrwxrwx. 1 root root 9 Oct 17 14:20 ata-WDC_WD6002FRYZ-01WD5B0_K1HNL2KD -> ../../sde
lrwxrwxrwx. 1 root root 9 Oct 17 14:20 ata-WDC_WD6002FRYZ-01WD5B0_K1JMY3ND -> ../../sdf
[编辑2]
现在目录by-path
似乎随着每次启动而改变。目前看起来是这样的:
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 pci-0000:03:00.0-sas-0x5fcfffff00000001-lun-0 -> ../../sda
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 pci-0000:03:00.0-sas-0x5fcfffff00000002-lun-0 -> ../../sdb
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 pci-0000:03:00.0-sas-0x5fcfffff00000003-lun-0 -> ../../sdf
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 pci-0000:03:00.0-sas-0x5fcfffff00000004-lun-0 -> ../../sdd
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 pci-0000:03:00.0-sas-phy0-lun-0 -> ../../sda
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 pci-0000:03:00.0-sas-phy1-lun-0 -> ../../sdf
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 pci-0000:03:00.0-sas-phy2-lun-0 -> ../../sdc
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 pci-0000:03:00.0-sas-phy3-lun-0 -> ../../sdd
虽然by-id
路径看起来像这样:
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA58PYRS -> ../../sda
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LV7KR -> ../../sdb
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LV87E -> ../../sdc
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 ata-WDC_WD6001F9YZ-09YUWL1_WD-WX41DA5LVL4J -> ../../sdd
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 ata-WDC_WD6002FRYZ-01WD5B0_K1HNL2KD -> ../../sde
lrwxrwxrwx. 1 root root 9 Oct 17 15:03 ata-WDC_WD6002FRYZ-01WD5B0_K1JMY3ND -> ../../sdf
磁盘始终分配相同的设备 ID(sda/sdb/等),这令人放心,但路径会发生不可预测的变化(使其完全无法用于我的目的)。
[编辑3]
输出du -a /sys/devices/pci0000\:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0 | grep -E 'sd.' | grep -vE 'sd./'
显示了驱动器的正确映射:
0 /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host6/port-6:0/end_device-6:0/target6:0:0/6:0:0:0/block/sda
0 /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host6/port-6:1/end_device-6:1/target6:0:1/6:0:1:0/block/sdb
0 /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host6/port-6:2/end_device-6:2/target6:0:2/6:0:2:0/block/sdc
0 /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host6/port-6:3/end_device-6:3/target6:0:3/6:0:3:0/block/sdd
0 /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host7/port-7:0/end_device-7:0/target7:0:0/7:0:0:0/block/sde
0 /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host7/port-7:1/end_device-7:1/target7:0:1/7:0:1:0/block/sdf
所以我认为这个问题与一个错误有关udev
答案1
我还有一台 Dell T420 服务器,输出udevadm info
如下:
E: DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:08:00.0/host0/target0:0:0/0:0:0:0/block/sda
E: DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:08:00.0/host0/target0:0:1/0:0:1:0/block/sdb
E: DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:08:00.0/host0/target0:0:2/0:0:2:0/block/sdc
E: DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:08:00.0/host0/target0:0:3/0:0:3:0/block/sdd
E: DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:08:00.0/host0/target0:0:4/0:0:4:0/block/sde
E: DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:08:00.0/host0/target0:0:5/0:0:5:0/block/sdf
E: DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:08:00.0/host0/target0:0:6/0:0:6:0/block/sdg
E: DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:08:00.0/host0/target0:0:7/0:0:7:0/block/sdh
host#
如您所见,设备路径中只有一个实例( host0
),而 intel SCU 有两个 (host6
和host7
)。显然 CentOS 7 上的 udev 不知道如何正确处理这个问题,而只是覆盖设备链接(因此该节点下的任何设备都会被该节点下的相应设备覆盖host6
其符号链接)。/dev/disks/by-path
host7
看来我现在需要学习如何编写 udev 规则......
[编辑]
最初尝试使用 udev 规则来解决该问题:https://gist.github.com/dghodgson/49da6175371cdde317e662fb8a7d078a
它非常丑陋且有缺陷。根本无法正确处理热插拔,并且不会对分区执行任何操作。我需要找到一种方法从现有信息创建更新的属性,而不是就地编辑它们,否则每次重新加载 udev 规则时路径都有可能发生变化。
[编辑2]
Gist 已更新为类似于handle_scsi_default
内置函数生成的输出路径path_id
。它现在更加可靠,并且还可以处理分区。希望其他人也会发现它有用。但它仍然是一个黑客,所以 YMMV。
目前正在努力将正确的修复补丁修补到 udev 中。