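OK, here is my setup:
FC switches: IBM/Brocade, Switch1 and Switch2, independent fabrics.
Server: IBM x3650 M2 with 2x QLogic QLE2460, one connected to each FC switch.
Storage: IBM DS3524 with 2 controllers; each controller has 4 FC ports, but only 2 per controller are connected.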
+-----------------------------------------------------------------------+
|             HBA1               Server                HBA2             |
+-----------------------------------------------------------------------+
               |                                        |
               |                                        |
               |                                        |
+-----------------------------+          +------------------------------+
|           Switch1           |          |           Switch2            |
+-----------------------------+          +------------------------------+
         |                 |                 |                 |
         |                 |                 |                 |
         |                 |                 |                 |
         |                 |                 |                 |
         |                 |                 |                 |
+-----------------------------------+-----------------------------------+
| Contr A, port 3 | Contr A, port 4 | Contr B, port 3 | Contr B, port 4 |
+-----------------------------------+-----------------------------------+
|                                Storage                                |
+-----------------------------------------------------------------------+
My /etc/multipath.conf comes from the IBM DS3500 Redbook, except that I use a different prio_callout setting. IBM uses /sbin/mpath_prio_tpc, but according to http://changelogs.ubuntu.com/changelogs/pool/main/m/multipath-tools/multipath-tools_0.4.8-7ubuntu2/changelog it has since been renamed to /sbin/mpath_prio_rdac, which is what I am using.
devices {
    device {
        #ds3500
        vendor "IBM"
        product "1746 FAStT"
        hardware_handler "1 rdac"
        path_checker rdac
        failback 0
        path_grouping_policy multibus
        prio_callout "/sbin/mpath_prio_rdac /dev/%n"
    }
}
multipaths {
    multipath {
        wwid xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        alias array07
        path_grouping_policy multibus
        path_checker readsector0
        path_selector "round-robin 0"
        failback "5"
        rr_weight priorities
        no_path_retry "5"
    }
}
Output of multipath -ll with controller A as the preferred path:
root@db06:~# multipath -ll
sdg: checker msg is "directio checker reports path is down"
sdh: checker msg is "directio checker reports path is down"
array07 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) dm-2 IBM ,1746 FASt
[size=4.9T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
\_ 5:0:1:0 sdd 8:48 [active][ready]
\_ 5:0:2:0 sde 8:64 [active][ready]
\_ 6:0:1:0 sdg 8:96 [failed][faulty]
\_ 6:0:2:0 sdh 8:112 [failed][faulty]
If I change the preferred path to controller B using IBM DS Storage Manager, the output swaps accordingly:
root@db06:~# multipath -ll
sdd: checker msg is "directio checker reports path is down"
sde: checker msg is "directio checker reports path is down"
array07 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) dm-2 IBM ,1746 FASt
[size=4.9T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
\_ 5:0:1:0 sdd 8:48 [failed][faulty]
\_ 5:0:2:0 sde 8:64 [failed][faulty]
\_ 6:0:1:0 sdg 8:96 [active][ready]
\_ 6:0:2:0 sdh 8:112 [active][ready]
According to IBM, the inactive paths should be "[active][ghost]", not "[failed][faulty]".
Even so, I don't seem to have any I/O problems, but my syslog gets spammed with this every 5 seconds:
Jun 1 15:30:09 db06 multipathd: sdg: directio checker reports path is down
Jun 1 15:30:09 db06 kernel: [ 2350.282065] sd 6:0:2:0: [sdh] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jun 1 15:30:09 db06 kernel: [ 2350.282071] sd 6:0:2:0: [sdh] Sense Key : Illegal Request [current]
Jun 1 15:30:09 db06 kernel: [ 2350.282076] sd 6:0:2:0: [sdh] <<vendor>> ASC=0x94 ASCQ=0x1ASC=0x94 ASCQ=0x1
Jun 1 15:30:09 db06 kernel: [ 2350.282083] sd 6:0:2:0: [sdh] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Jun 1 15:30:09 db06 kernel: [ 2350.282092] end_request: I/O error, dev sdh, sector 0
Jun 1 15:30:10 db06 multipathd: sdh: directio checker reports path is down
Jun 1 15:30:14 db06 kernel: [ 2355.312270] sd 6:0:1:0: [sdg] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jun 1 15:30:14 db06 kernel: [ 2355.312277] sd 6:0:1:0: [sdg] Sense Key : Illegal Request [current]
Jun 1 15:30:14 db06 kernel: [ 2355.312282] sd 6:0:1:0: [sdg] <<vendor>> ASC=0x94 ASCQ=0x1ASC=0x94 ASCQ=0x1
Jun 1 15:30:14 db06 kernel: [ 2355.312290] sd 6:0:1:0: [sdg] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Jun 1 15:30:14 db06 kernel: [ 2355.312299] end_request: I/O error, dev sdg, sector 0
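To see how often each path is being flagged, the multipathd lines in a log like the one above can be tallied per device. This is only a minimal sketch: it assumes syslog lines in exactly the format shown here, and the function name count_path_down is hypothetical, not part of multipath-tools.

```python
import re
from collections import Counter

# Matches multipathd lines such as:
#   "... multipathd: sdg: directio checker reports path is down"
# The checker name (directio, rdac, ...) is allowed to vary.
DOWN_RE = re.compile(r"multipathd: (\w+): \w+ checker reports path is down")


def count_path_down(lines):
    """Return a Counter mapping device name -> number of 'path is down' reports."""
    counts = Counter()
    for line in lines:
        match = DOWN_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the syslog excerpt in the question.
sample = [
    "Jun 1 15:30:09 db06 multipathd: sdg: directio checker reports path is down",
    "Jun 1 15:30:09 db06 kernel: [ 2350.282092] end_request: I/O error, dev sdh, sector 0",
    "Jun 1 15:30:10 db06 multipathd: sdh: directio checker reports path is down",
]
print(count_path_down(sample))
```

On a live system the same function could be fed an open log file, for example count_path_down(open("/var/log/syslog")), since it only needs an iterable of lines.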
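Does anyone know how I can get the inactive paths to show "[active][ghost]" instead of "[failed][faulty]"? I assume that once I get that working, the spam in my syslog will stop as well.
One last thing worth mentioning: the IBM Redbook is written for SLES 11, so I assume something is different under Ubuntu that I just haven't figured out yet.
Update: As suggested by Mitch, I tried removing /etc/multipath.conf, and the output of multipath -ll now looks like this: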
root@db06:~# multipath -ll
sdg: checker msg is "directio checker reports path is down"
sdh: checker msg is "directio checker reports path is down"
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx dm-1 IBM ,1746 FASt
[size=4.9T][features=0][hwhandler=0]
\_ round-robin 0 [prio=1][active]
\_ 5:0:2:0 sde 8:64 [active][ready]
\_ round-robin 0 [prio=1][enabled]
\_ 5:0:1:0 sdd 8:48 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 6:0:1:0 sdg 8:96 [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
\_ 6:0:2:0 sdh 8:112 [failed][faulty]
So it is more or less the same, with the same messages in syslog every 5 seconds as before, but the grouping has changed.
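Answer 1
I don't really know the IBM DS3524. But if it is anything like the EMC Clariion CX-4, then:
Try moving multipath.conf out of the etc folder and restarting multipath-tools.
Answer 2
I finally found a working configuration here: http://pig.made-it.com/multipath.html. I am summarizing it here in case someone else runs into this in the future.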
device {
    vendor "IBM"
    product "1745|1746"
    path_grouping_policy group_by_prio
    # getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
    path_selector "round-robin 0"
    path_checker rdac
    features "2 pg_init_retries 50"
    hardware_handler "1 rdac"
    prio_callout "/sbin/mpath_prio_rdac /dev/%n"
    failback immediate
    rr_weight uniform
    no_path_retry 15
    rr_min_io 1000
}
I commented out getuid_callout because Ubuntu puts the scsi_id program in /lib/udev rather than /sbin, and Ubuntu's default getuid_callout is /lib/udev/scsi_id -g -u -s, so I did not set it explicitly in my configuration.
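If you are not sure where scsi_id lives on your own system, something like the following can check the two locations mentioned above. It is a minimal sketch: the function name find_scsi_id is hypothetical, and the candidate list only covers the /lib/udev and /sbin paths discussed here, so other distributions may need more entries.

```python
import os


def find_scsi_id(candidates=("/lib/udev/scsi_id", "/sbin/scsi_id")):
    """Return the first candidate path that is an executable file, else None."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


# Prints the detected location, or None if neither candidate exists.
print(find_scsi_id())
```

If this returns a path other than the distribution default, that is the case where an explicit getuid_callout line would be worth adding to the device block.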
No special settings are needed in the multipath { ... } block, but you can of course still use it to create a friendly alias if you want one:
multipaths {
    multipath {
        wwid xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        alias array07
    }
}
The multipath -ll output now looks like this:
root@db06:~# multipath -ll
array07 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) dm-1 IBM ,1746 FASt
[size=4.9T][features=3 queue_if_no_path pg_init_retries 50][hwhandler=1 rdac]
\_ round-robin 0 [prio=0][enabled]
\_ 5:0:1:0 sdd 8:48 [active][ghost]
\_ 5:0:2:0 sde 8:64 [active][ghost]
\_ round-robin 0 [prio=6][active]
\_ 6:0:1:0 sdg 8:96 [active][ready]
\_ 6:0:2:0 sdh 8:112 [active][ready]
I no longer get spam in /var/log/syslog, and failover/failback works correctly.
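For anyone who wants to check the path states from a script rather than by eye, here is a rough sketch that parses the path lines of multipath -ll. It assumes the older multipath-tools 0.4.x output format shown in this thread (newer versions print a different layout), and the names path_states and faulty_paths are hypothetical helpers, not part of any tool.

```python
import re

# Matches path lines in the legacy output format, e.g.
#   "\_ 5:0:1:0 sdd 8:48 [active][ghost]"
# Path-group lines ("\_ round-robin 0 ...") do not match because they
# have no host:channel:target:lun field.
PATH_RE = re.compile(
    r"\\_\s+(\d+:\d+:\d+:\d+)\s+(\w+)\s+\d+:\d+\s+\[(\w+)\]\[(\w+)\]"
)


def path_states(output):
    """Map device name -> (dm_state, checker_state) for each path line."""
    states = {}
    for line in output.splitlines():
        match = PATH_RE.search(line)
        if match:
            _hctl, dev, dm_state, checker_state = match.groups()
            states[dev] = (dm_state, checker_state)
    return states


def faulty_paths(output):
    """Return the sorted device names whose checker state is 'faulty'."""
    return sorted(
        dev for dev, (_, checker) in path_states(output).items()
        if checker == "faulty"
    )


# Path lines copied from the working output above.
sample = r"""
\_ round-robin 0 [prio=0][enabled]
\_ 5:0:1:0 sdd 8:48 [active][ghost]
\_ 5:0:2:0 sde 8:64 [active][ghost]
\_ round-robin 0 [prio=6][active]
\_ 6:0:1:0 sdg 8:96 [active][ready]
\_ 6:0:2:0 sdh 8:112 [active][ready]
"""
print(path_states(sample))
print(faulty_paths(sample))
```

With the working configuration the faulty list is empty; fed the output from the question, it would list the two paths on the non-preferred controller.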