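I have an HBA installed in an x8 PCIe slot, but it shows up as x4 (downgraded). It also appears to be disabled. I don't think either of these is normal, and I suspect they are why my setup isn't working. I'm trying to troubleshoot this and get the HBA card working with my JBOD enclosure. At the moment the SAS cables are showing a fault, and I believe the HBA is the cause.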
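Additional background
- 256 cores
- 1TB memory
- System block diagram - https://www.gigabyte.com/Enterprise/Rack-Server/R181-Z92-rev-B00#Overview
Note: I have 8 memory sticks, one in each channel of CPU_0. The other 24 slots are empty. I mention this because I'm not sure whether it makes a difference.
Finding my HBA card
SAS 9305-16e Host Bus Adapter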
root@EPY00:~# lspci | grep -i broad
c1:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 (rev 01)
Searching dmesg for my HBA card. It shows the card's PCIe bandwidth is limited. I don't know why.
root@EPY00:~# dmesg | grep c1:00.0
[ 2.337229] pci 0000:c1:00.0: [1000:00c9] type 00 class 0x010700
[ 2.337241] pci 0000:c1:00.0: reg 0x10: [io 0xd000-0xd0ff]
[ 2.337252] pci 0000:c1:00.0: reg 0x14: [mem 0x9c100000-0x9c10ffff 64bit]
[ 2.337274] pci 0000:c1:00.0: reg 0x30: [mem 0x9c000000-0x9c0fffff pref]
[ 2.337361] pci 0000:c1:00.0: supports D1 D2
[ 2.337410] pci 0000:c1:00.0: 31.504 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x4 link at 0000:c0:01.1 (capable of 63.008 Gb/s with 8.0 GT/s PCIe x8 link)
[ 2.780056] pci 0000:c1:00.0: Adding to iommu group 87
[ 4.159479] mpt3sas 0000:c1:00.0: enabling device (0000 -> 0002)
Looking at the link. I see that the link is downgraded. I'm not sure what that means. My guess is that this could be the root problem?
root@EPY00:~# lspci -vv -s 0000:c0:01.1
c0:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin ? routed to IRQ 70
NUMA node: 1
IOMMU group: 87
Bus: primary=c0, secondary=c1, subordinate=c1, sec-latency=0
I/O behind bridge: 0000d000-0000dfff [size=4K]
Memory behind bridge: 9c000000-9c1fffff [size=2M]
Prefetchable memory behind bridge: [disabled]
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16+ MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0
ExtTag+ RBE+
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 512 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM L1, Exit Latency L1 <64us
ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (downgraded), Width x4 (downgraded)
TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
Slot #3, PowerLimit 75.000W; Interlock- NoCompl+
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
Changed: MRL- PresDet- LinkState-
RootCap: CRSVisible+
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix+, MaxEETLPPrefixes 1
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- LN System CLS Not Supported, TPHComp- ExtTPHComp- ARIFwd+
AtomicOpsCap: Routing- 32bit+ 64bit+ 128bitCAS-
DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis- LTR- OBFF Disabled, ARIFwd+
AtomicOpsCtl: ReqEn- EgressBlck-
LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee00000 Data: 0000
Capabilities: [c0] Subsystem: Gigabyte Technology Co., Ltd Starship/Matisse GPP Bridge
Capabilities: [c8] HyperTransport: MSI Mapping Enable+ Fixed+
Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [270 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Capabilities: [370 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2- PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1+ L1_PM_Substates+
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
L1SubCtl2:
Capabilities: [380 v1] Downstream Port Containment
DpcCap: INT Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 6, DL_ActiveErr+
DpcCtl: Trigger:0 Cmpl- INT- ErrCor- PoisonedTLP- SwTrigger- DL_ActiveErr-
DpcSta: Trigger- Reason:00 INT- RPBusy- TriggerExt:00 RP PIO ErrPtr:1f
Source: 0000
Capabilities: [400 v1] Data Link Feature <?>
Capabilities: [410 v1] Physical Layer 16.0 GT/s <?>
Capabilities: [440 v1] Lane Margining at the Receiver <?>
Kernel driver in use: pcieport
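The same LnkCap versus LnkSta comparison can also be read from sysfs without parsing lspci output. This is a rough sketch, assuming the kernel exposes current_link_width, max_link_width, current_link_speed and max_link_speed in the device's sysfs directory. link_report is a hypothetical helper name. Note that the bridge advertises 16GT/s while the dmesg line above says the card is capable of 8.0 GT/s x8, so checking the card itself (c1:00.0) rather than the bridge may show that only the width is below its maximum.

```shell
#!/bin/sh
# Hypothetical helper: compare negotiated and maximum link parameters.
#   $1 = sysfs directory of the device,
#        e.g. /sys/bus/pci/devices/0000:c1:00.0
link_report() {
    dev=$1
    cur_w=$(cat "$dev/current_link_width")
    max_w=$(cat "$dev/max_link_width")
    cur_s=$(cat "$dev/current_link_speed")
    max_s=$(cat "$dev/max_link_speed")
    echo "width: x$cur_w of x$max_w, speed: $cur_s of $max_s"
    if [ "$cur_w" != "$max_w" ] || [ "$cur_s" != "$max_s" ]; then
        echo "link is running below its maximum"
    fi
}

# Only runs on a machine that actually has this device.
hba=/sys/bus/pci/devices/0000:c1:00.0
if [ -d "$hba" ]; then
    link_report "$hba"
fi
```

Running it again after reseating the card or moving it to another slot would show whether the width changes.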
Possibly unrelated, but I also noticed the following.
root@EPY00:~# dmesg | grep -i pci | grep -i bar
[ 2.314469] pci 0000:63:00.0: VF(n) BAR0 space: [mem 0x18090f60000-0x18090f7ffff 64bit pref] (contains BAR0 for 8 VFs)
[ 2.314469] pci 0000:63:00.0: VF(n) BAR3 space: [mem 0x18090f40000-0x18090f5ffff 64bit pref] (contains BAR3 for 8 VFs)
[ 2.314648] pci 0000:63:00.1: VF(n) BAR0 space: [mem 0x18090f20000-0x18090f3ffff 64bit pref] (contains BAR0 for 8 VFs)
[ 2.314668] pci 0000:63:00.1: VF(n) BAR3 space: [mem 0x18090f00000-0x18090f1ffff 64bit pref] (contains BAR3 for 8 VFs)
[ 2.326651] pci 0000:66:00.0: BAR 0: assigned to efifb
[ 2.381614] pci 0000:00:01.1: BAR 14: assigned [mem 0xf6000000-0xf61fffff]
[ 2.381616] pci 0000:00:01.1: BAR 15: assigned [mem 0x300f1000000-0x300f11fffff 64bit pref]
[ 2.381617] pci 0000:00:01.2: BAR 14: assigned [mem 0xf6200000-0xf63fffff]
[ 2.381619] pci 0000:00:01.2: BAR 15: assigned [mem 0x300f1200000-0x300f13fffff 64bit pref]
[ 2.381622] pci 0000:00:01.1: BAR 13: assigned [io 0x1000-0x1fff]
[ 2.381623] pci 0000:00:01.2: BAR 13: assigned [io 0x2000-0x2fff]
[ 2.381826] pci 0000:60:03.1: BAR 15: assigned [mem 0x10091000000-0x100911fffff 64bit pref]
[ 2.381828] pci 0000:60:03.2: BAR 15: assigned [mem 0x10091200000-0x100913fffff 64bit pref]
[ 2.381829] pci 0000:60:03.1: BAR 13: no space for [io size 0x1000]
[ 2.381830] pci 0000:60:03.1: BAR 13: failed to assign [io size 0x1000]
[ 2.381831] pci 0000:60:03.2: BAR 13: no space for [io size 0x1000]
[ 2.381832] pci 0000:60:03.2: BAR 13: failed to assign [io size 0x1000]
[ 2.381833] pci 0000:60:03.2: BAR 13: no space for [io size 0x1000]
[ 2.381834] pci 0000:60:03.2: BAR 13: failed to assign [io size 0x1000]
[ 2.381835] pci 0000:60:03.1: BAR 13: no space for [io size 0x1000]
[ 2.381835] pci 0000:60:03.1: BAR 13: failed to assign [io size 0x1000]
[ 2.381947] pci 0000:80:01.1: BAR 14: assigned [mem 0x90000000-0x901fffff]
[ 2.381949] pci 0000:80:01.1: BAR 15: assigned [mem 0x581b1000000-0x581b11fffff 64bit pref]
[ 2.381949] pci 0000:80:01.2: BAR 14: assigned [mem 0x90200000-0x903fffff]
[ 2.381951] pci 0000:80:01.2: BAR 15: assigned [mem 0x581b1200000-0x581b13fffff 64bit pref]
[ 2.381952] pci 0000:80:01.1: BAR 13: assigned [io 0x9000-0x9fff]
[ 2.381952] pci 0000:80:01.2: BAR 13: assigned [io 0xa000-0xafff]
[ 2.382039] pci 0000:a0:03.1: BAR 15: assigned [mem 0x501b1000000-0x501b11fffff 64bit pref]
[ 2.382040] pci 0000:a0:03.2: BAR 15: assigned [mem 0x501b1200000-0x501b13fffff 64bit pref]
[ 2.382041] pci 0000:a0:03.3: BAR 14: assigned [mem 0x96000000-0x961fffff]
[ 2.382042] pci 0000:a0:03.3: BAR 15: assigned [mem 0x501b1400000-0x501b15fffff 64bit pref]
[ 2.382043] pci 0000:a0:03.4: BAR 14: assigned [mem 0x96200000-0x963fffff]
[ 2.382044] pci 0000:a0:03.4: BAR 15: assigned [mem 0x501b1600000-0x501b17fffff 64bit pref]
[ 2.382045] pci 0000:a0:03.1: BAR 13: assigned [io 0xc000-0xcfff]
[ 2.382046] pci 0000:a0:03.2: BAR 13: no space for [io size 0x1000]
[ 2.382046] pci 0000:a0:03.2: BAR 13: failed to assign [io size 0x1000]
[ 2.382047] pci 0000:a0:03.3: BAR 13: no space for [io size 0x1000]
[ 2.382048] pci 0000:a0:03.3: BAR 13: failed to assign [io size 0x1000]
[ 2.382049] pci 0000:a0:03.4: BAR 13: no space for [io size 0x1000]
[ 2.382049] pci 0000:a0:03.4: BAR 13: failed to assign [io size 0x1000]
[ 2.382051] pci 0000:a0:03.4: BAR 13: assigned [io 0xc000-0xcfff]
[ 2.382052] pci 0000:a0:03.3: BAR 13: no space for [io size 0x1000]
[ 2.382053] pci 0000:a0:03.3: BAR 13: failed to assign [io size 0x1000]
[ 2.382054] pci 0000:a0:03.2: BAR 13: no space for [io size 0x1000]
[ 2.382054] pci 0000:a0:03.2: BAR 13: failed to assign [io size 0x1000]
[ 2.382055] pci 0000:a0:03.1: BAR 13: no space for [io size 0x1000]
[ 2.382056] pci 0000:a0:03.1: BAR 13: failed to assign [io size 0x1000]
[ 2.382218] pci 0000:e0:03.1: BAR 14: assigned [mem 0xa0000000-0xa01fffff]
[ 2.382219] pci 0000:e0:03.1: BAR 15: assigned [mem 0x40151000000-0x401511fffff 64bit pref]
[ 2.382220] pci 0000:e0:03.2: BAR 14: assigned [mem 0xa0200000-0xa03fffff]
[ 2.382222] pci 0000:e0:03.2: BAR 15: assigned [mem 0x40151200000-0x401513fffff 64bit pref]
[ 2.382222] pci 0000:e0:03.1: BAR 13: assigned [io 0xe000-0xefff]
[ 2.382223] pci 0000:e0:03.2: BAR 13: no space for [io size 0x1000]
[ 2.382224] pci 0000:e0:03.2: BAR 13: failed to assign [io size 0x1000]
[ 2.382225] pci 0000:e0:03.2: BAR 13: assigned [io 0xe000-0xefff]
[ 2.382226] pci 0000:e0:03.1: BAR 13: no space for [io size 0x1000]
[ 2.382227] pci 0000:e0:03.1: BAR 13: failed to assign [io size 0x1000]
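To see at a glance which devices are affected by these failures, the log can be reduced to one line per device. This is a small sketch that reads kernel log text on stdin; failed_bar_devices is a hypothetical helper name. In the output pasted above, neither the HBA (c1:00.0) nor its bridge (c0:01.1) shows up in a "failed to assign" line, which would be consistent with this being a separate issue, though that is only an inference from the lines shown.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print each PCI
# device that logged a "failed to assign" BAR message, preceded by the
# number of such lines for that device.
failed_bar_devices() {
    grep 'BAR [0-9]*: failed to assign' |
        sed -n 's/.*pci \([0-9a-f:.]*\): BAR.*/\1/p' |
        sort | uniq -c
}

# Example use on a live system:
#   dmesg | failed_bar_devices
```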
Bonus: modinfo output for the driver
root@EPY00:~# modinfo mpt3sas
filename: /lib/modules/5.10.0-9-amd64/kernel/drivers/scsi/mpt3sas/mpt3sas.ko
alias: mpt2sas
version: 35.100.00.00
license: GPL
description: LSI MPT Fusion SAS 3.0 Device Driver
author: Avago Technologies <[email protected]>
srcversion: 2D6BBDB9CE0F1B2FA0B159D
alias: pci:v00001000d000000E7sv*sd*bc*sc*i*
alias: pci:v00001000d000000E4sv*sd*bc*sc*i*
alias: pci:v00001000d000000E6sv*sd*bc*sc*i*
alias: pci:v00001000d000000E5sv*sd*bc*sc*i*
alias: pci:v00001000d000000B2sv*sd*bc*sc*i*
alias: pci:v00001000d000000E3sv*sd*bc*sc*i*
alias: pci:v00001000d000000E0sv*sd*bc*sc*i*
alias: pci:v00001000d000000E2sv*sd*bc*sc*i*
alias: pci:v00001000d000000E1sv*sd*bc*sc*i*
alias: pci:v00001000d000000D1sv*sd*bc*sc*i*
alias: pci:v00001000d000000ACsv*sd*bc*sc*i*
alias: pci:v00001000d000000ABsv*sd*bc*sc*i*
alias: pci:v00001000d000000AAsv*sd*bc*sc*i*
alias: pci:v00001000d000000AFsv*sd*bc*sc*i*
alias: pci:v00001000d000000AEsv*sd*bc*sc*i*
alias: pci:v00001000d000000ADsv*sd*bc*sc*i*
alias: pci:v00001000d000000C3sv*sd*bc*sc*i*
alias: pci:v00001000d000000C2sv*sd*bc*sc*i*
alias: pci:v00001000d000000C1sv*sd*bc*sc*i*
alias: pci:v00001000d000000C0sv*sd*bc*sc*i*
alias: pci:v00001000d000000C8sv*sd*bc*sc*i*
alias: pci:v00001000d000000C7sv*sd*bc*sc*i*
alias: pci:v00001000d000000C6sv*sd*bc*sc*i*
alias: pci:v00001000d000000C5sv*sd*bc*sc*i*
alias: pci:v00001000d000000C4sv*sd*bc*sc*i*
alias: pci:v00001000d000000C9sv*sd*bc*sc*i*
alias: pci:v00001000d00000095sv*sd*bc*sc*i*
alias: pci:v00001000d00000094sv*sd*bc*sc*i*
alias: pci:v00001000d00000091sv*sd*bc*sc*i*
alias: pci:v00001000d00000090sv*sd*bc*sc*i*
alias: pci:v00001000d00000097sv*sd*bc*sc*i*
alias: pci:v00001000d00000096sv*sd*bc*sc*i*
alias: pci:v00001000d0000007Esv*sd*bc*sc*i*
alias: pci:v00001000d000002B1sv*sd*bc*sc*i*
alias: pci:v00001000d000002B0sv*sd*bc*sc*i*
alias: pci:v00001000d0000006Esv*sd*bc*sc*i*
alias: pci:v00001000d00000087sv*sd*bc*sc*i*
alias: pci:v00001000d00000086sv*sd*bc*sc*i*
alias: pci:v00001000d00000085sv*sd*bc*sc*i*
alias: pci:v00001000d00000084sv*sd*bc*sc*i*
alias: pci:v00001000d00000083sv*sd*bc*sc*i*
alias: pci:v00001000d00000082sv*sd*bc*sc*i*
alias: pci:v00001000d00000081sv*sd*bc*sc*i*
alias: pci:v00001000d00000080sv*sd*bc*sc*i*
alias: pci:v00001000d00000065sv*sd*bc*sc*i*
alias: pci:v00001000d00000064sv*sd*bc*sc*i*
alias: pci:v00001000d00000077sv*sd*bc*sc*i*
alias: pci:v00001000d00000076sv*sd*bc*sc*i*
alias: pci:v00001000d00000074sv*sd*bc*sc*i*
alias: pci:v00001000d00000072sv*sd*bc*sc*i*
alias: pci:v00001000d00000070sv*sd*bc*sc*i*
depends: scsi_mod,scsi_transport_sas,raid_class
retpoline: Y
intree: Y
name: mpt3sas
vermagic: 5.10.0-9-amd64 SMP mod_unload modversions
sig_id: PKCS#7
signer: Debian Secure Boot CA
sig_key: 4B:6E:F5:AB:CA:66:98:25:17:8E:05:2C:84:66:7C:CB:C0:53:1F:8C
sig_hashalgo: sha256
signature: 96:D9:EB:25:37:10:96:E1:BD:55:F1:66:9C:87:2A:C1:E8:B1:9A:A1:
28:42:A8:DD:EF:25:B8:DF:BA:1D:B2:FC:E5:45:42:6D:DC:2B:77:02:
6A:55:29:F0:08:04:3E:A2:42:53:1E:F8:F0:EF:07:4F:D0:F4:74:93:
35:3E:E3:1E:AC:01:25:0F:87:4D:94:71:B1:6D:1C:4B:10:EF:C3:6E:
BA:B5:58:37:19:CC:35:99:CB:1C:00:35:60:4A:39:CA:8E:53:99:40:
3C:03:FE:4A:FE:44:2E:72:F6:F3:62:FC:89:CA:4A:88:C3:83:A6:D2:
66:56:47:FA:FC:47:1D:F7:E1:FB:2D:A9:DD:E2:E2:B8:BC:19:A7:64:
51:99:36:FD:53:6A:40:5B:75:A3:03:57:4E:6C:03:62:D1:BC:68:31:
E2:52:71:75:69:92:E4:72:BB:21:7E:F5:D3:E4:27:1C:95:25:36:00:
8E:63:02:CB:D3:4E:9B:03:D2:A7:A0:BD:43:93:3C:32:E0:F1:8D:E9:
EA:D0:6B:56:1B:C6:61:43:97:4B:EB:57:B7:1D:FB:EA:4B:5F:DA:1E:
A1:9F:9E:E3:C8:7A:6F:4A:A5:82:7C:51:05:78:4E:25:BF:74:4E:A6:
FC:86:1C:CD:52:37:D5:9E:83:41:C9:0F:1A:5D:1C:EB
parm: logging_level: bits for enabling additional logging info (default=0)
parm: max_sectors:max sectors, range 64 to 32767 default=32767 (ushort)
parm: missing_delay: device missing delay , io missing delay (array of int)
parm: max_lun: max lun, default=16895 (ullong)
parm: hbas_to_enumerate: 0 - enumerates both SAS 2.0 & SAS 3.0 generation HBAs
1 - enumerates only SAS 2.0 generation HBAs
2 - enumerates only SAS 3.0 generation HBAs (default=0) (ushort)
parm: diag_buffer_enable: post diag buffers (TRACE=1/SNAPSHOT=2/EXTENDED=4/default=0) (int)
parm: disable_discovery: disable discovery (int)
parm: prot_mask: host protection capabilities mask, def=7 (int)
parm: enable_sdev_max_qd:Enable sdev max qd as can_queue, def=disabled(0) (bool)
parm: max_queue_depth: max controller queue depth (int)
parm: max_sgl_entries: max sg entries (int)
parm: msix_disable: disable msix routed interrupts (default=0) (int)
parm: smp_affinity_enable:SMP affinity feature enable/disable Default: enable(1) (int)
parm: max_msix_vectors: max msix vectors (int)
parm: irqpoll_weight:irq poll weight (default= one fourth of HBA queue depth) (int)
parm: mpt3sas_fwfault_debug: enable detection of firmware fault and halt firmware - (default=0)
parm: perf_mode:Performance mode (only for Aero/Sea Generation), options:
0 - balanced: high iops mode is enabled &
interrupt coalescing is enabled only on high iops queues,
1 - iops: high iops mode is disabled &
interrupt coalescing is enabled on all queues,
2 - latency: high iops mode is disabled &
interrupt coalescing is enabled on all queues with timeout value 0xA,
default - default perf_mode is 'balanced' (int)
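One thing the alias list can confirm is that the driver claims this card. The dmesg output earlier reports the HBA as [1000:00c9], and modinfo prints aliases in the form pci:v<vendor>d<device> with each ID zero-padded to eight uppercase hex digits. The sketch below builds that prefix from a vendor:device pair; pci_alias_prefix is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: turn a "vendor:device" PCI ID (hex, as shown in
# dmesg or lspci -nn) into the prefix used by modinfo alias lines.
pci_alias_prefix() {
    vendor=${1%%:*}
    device=${1##*:}
    printf 'pci:v%08Xd%08X\n' "0x$vendor" "0x$device"
}

pci_alias_prefix 1000:00c9    # prints pci:v00001000d000000C9

# Example use on a live system:
#   modinfo mpt3sas | grep "$(pci_alias_prefix 1000:00c9)"
```

The resulting prefix appears in the alias list above, so the in-tree mpt3sas module does list this device ID.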