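I'm having trouble with 16x 900GB 15K SAS drives running as a RAID 1+0 array on a PERC H710P. Under normal load the array handles ~450 IOPS (almost all writes) with a backlog of about 80ms. When I test the speed with dd if=/dev/zero of=/root/testfile bs=4 count=10000 oflag=dsync
it maxes out at around 800 IOPS with a 200ms backlog.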
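For anyone who wants to reproduce the measurement without dd, here is a minimal Python sketch of the same kind of test: it opens a file with O_DSYNC and times a series of sequential writes. The function name sync_write_iops and the 4096-byte default block size are assumptions made for illustration, not part of the original test (the command as posted uses bs=4), and the numbers will not match dd exactly.

```python
import os
import time


def sync_write_iops(path, block_size=4096, count=1000):
    """Write `count` zero-filled blocks synchronously.

    Returns (iops, average_latency_ms). block_size and count are
    illustrative defaults, not values taken from the original test.
    """
    # O_DSYNC makes each write() return only once the data is on stable
    # storage, similar to dd's oflag=dsync. Fall back to O_SYNC where
    # the platform does not define O_DSYNC.
    sync_flag = getattr(os, "O_DSYNC", os.O_SYNC)
    block = bytes(block_size)  # zero-filled, like /dev/zero
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | sync_flag, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / elapsed, elapsed / count * 1000.0
```

Calling sync_write_iops("/root/testfile", count=10000) on the affected array should land in the same ballpark as the dd run, since both issue one synchronous write at a time.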
I tried disabling ReadAhead on the RAID card and changing the disk cache policy from disk default to enabled, but write IOPS did not improve.
Running the same test on our other database servers gives much better results:
12x 4TB 7.2K SAS, RAID 1+0, PERC H710P: ~15,000 IOPS
24x 300GB 10K SAS, RAID 1+0, PERC H710P: ~21,000 IOPS
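Since dd with oflag=dsync waits for each write to finish before issuing the next one, these IOPS figures can also be read as per-write latencies. A rough conversion, assuming exactly one write in flight at a time (the function name and the server labels are mine, for illustration only):

```python
def latency_ms_at_queue_depth_one(iops):
    """Average time per write in milliseconds with a single write in flight."""
    return 1000.0 / iops


# IOPS figures quoted in this post for the dd test.
results = {
    "16x 900GB 15K (problem server)": 800,
    "12x 4TB 7.2K": 15000,
    "24x 300GB 10K": 21000,
}

for name, iops in results.items():
    print(f"{name}: {latency_ms_at_queue_depth_one(iops):.3f} ms per write")
# 16x 900GB 15K (problem server): 1.250 ms per write
# 12x 4TB 7.2K: 0.067 ms per write
# 24x 300GB 10K: 0.048 ms per write
```

Put this way, each synchronous write on the problem server takes roughly 20 to 25 times longer to be acknowledged than on the other two machines.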
As far as I can tell, all the drives are healthy and the RAID card is configured correctly, so what else could be causing this?
Server details
OS - Debian 9.8
CPU - 4x Intel E5-4620
Memory - 256GB
RAID card - PERC H710P
HDD - 16x DL900MP0136 (Dell's version of the ST900MP0146)
Relevant section from lspci -vv
02:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] (rev 01)
Subsystem: Dell PERC H710P Adapter
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 38
NUMA node: 0
Region 0: I/O ports at fc00 [size=256]
Region 1: Memory at dd7fc000 (64-bit, non-prefetchable) [size=16K]
Region 3: Memory at dd780000 (64-bit, non-prefetchable) [size=256K]
Expansion ROM at dc800000 [disabled] [size=128K]
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [68] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <64ns, L1 <1us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [d0] Vital Product Data
RAID card info from megacli -AdpAllInfo -aAll
Adapter #0
==============================================================================
Versions
================
Product Name : PERC H710P Adapter
Serial No : 29P00AB
FW Package Build: 21.3.2-0005
Mfg. Data
================
Mfg. Date : 09/27/12
Rework Date : 09/27/12
Revision No : A00
Battery FRU : N/A
Image Versions in Flash:
================
BIOS Version : 5.42.00.1_4.12.05.00_0x05290003
Ctrl-R Version : 4.04-0003
Preboot CLI Version: 05.00-03:#%00008
FW Version : 3.131.05-4520
NVDATA Version : 2.1108.03-0096
Boot Block Version : 2.03.00.00-0004
BOOT Version : 06.253.57.219
Pending Images in Flash
================
None
PCI Info
================
Controller Id : 0000
Vendor Id : 1000
Device Id : 005b
SubVendorId : 1028
SubDeviceId : 1f31
Host Interface : PCIE
ChipRevision : B0
Link Speed : 2
Number of Frontend Port: 0
Device Interface : PCIE
Number of Backend Port: 8
Port : Address
0 500056b37789abff
1 0000000000000000
2 0000000000000000
3 0000000000000000
4 0000000000000000
5 0000000000000000
6 0000000000000000
7 0000000000000000
HW Configuration
================
SAS Address : 590b11c015d8b800
BBU : Present
Alarm : Absent
NVRAM : Present
Serial Debugger : Present
Memory : Present
Flash : Present
Memory Size : 1024MB
TPM : Absent
On board Expander: Absent
Upgrade Key : Absent
Temperature sensor for ROC : Present
Temperature sensor for controller : Present
ROC temperature : 79 degree Celsius
Controller temperature : 79 degree Celcius
Settings
================
Current Time : 4:1:13 4/14, 2020
Predictive Fail Poll Interval : 300sec
Interrupt Throttle Active Count : 16
Interrupt Throttle Completion : 50us
Rebuild Rate : 30%
PR Rate : 30%
BGI Rate : 30%
Check Consistency Rate : 30%
Reconstruction Rate : 30%
Cache Flush Interval : 4s
Max Drives to Spinup at One Time : 4
Delay Among Spinup Groups : 12s
Physical Drive Coercion Mode : 128MB
Cluster Mode : Disabled
Alarm : Disabled
Auto Rebuild : Enabled
Battery Warning : Enabled
Ecc Bucket Size : 255
Ecc Bucket Leak Rate : 240 Minutes
Restore HotSpare on Insertion : Disabled
Expose Enclosure Devices : Disabled
Maintain PD Fail History : Disabled
Host Request Reordering : Enabled
Auto Detect BackPlane Enabled : SGPIO/i2c SEP
Load Balance Mode : Auto
Use FDE Only : Yes
Security Key Assigned : No
Security Key Failed : No
Security Key Not Backedup : No
Default LD PowerSave Policy : Controller Defined
Maximum number of direct attached drives to spin up in 1 min : 20
Auto Enhanced Import : No
Any Offline VD Cache Preserved : No
Allow Boot with Preserved Cache : No
Disable Online Controller Reset : No
PFK in NVRAM : No
Use disk activity for locate : No
POST delay : 90 seconds
BIOS Error Handling : Stop On Errors
Current Boot Mode :Normal
Capabilities
================
RAID Level Supported : RAID0, RAID1, RAID5, RAID6, RAID00, RAID10, RAID50, RAID60, PRL 11, PRL 11 with spanning, PRL11-RLQ0 DDF layout with no span, PRL11-RLQ0 DDF layout with span
Supported Drives : SAS, SATA
Allowed Mixing:
Mix in Enclosure Allowed
Status
================
ECC Bucket Count : 0
Limitations
================
Max Arms Per VD : 32
Max Spans Per VD : 8
Max Arrays : 128
Max Number of VDs : 64
Max Parallel Commands : 1008
Max SGE Count : 60
Max Data Transfer Size : 8192 sectors
Max Strips PerIO : 42
Max LD per array : 16
Min Strip Size : 64 KB
Max Strip Size : 1.0 MB
Max Configurable CacheCade Size: 512 GB
Current Size of CacheCade : 0 GB
Current Size of FW Cache : 883 MB
Device Present
================
Virtual Drives : 1
Degraded : 0
Offline : 0
Physical Devices : 18
Disks : 16
Critical Disks : 0
Failed Disks : 0
Supported Adapter Operations
================
Rebuild Rate : Yes
CC Rate : Yes
BGI Rate : Yes
Reconstruct Rate : Yes
Patrol Read Rate : Yes
Alarm Control : Yes
Cluster Support : No
BBU : No
Spanning : Yes
Dedicated Hot Spare : Yes
Revertible Hot Spares : Yes
Foreign Config Import : Yes
Self Diagnostic : Yes
Allow Mixed Redundancy on Array : No
Global Hot Spares : Yes
Deny SCSI Passthrough : No
Deny SMP Passthrough : No
Deny STP Passthrough : No
Support Security : Yes
Snapshot Enabled : No
Support the OCE without adding drives : Yes
Support PFK : No
Support PI : No
Support Boot Time PFK Change : No
Disable Online PFK Change : No
Support Shield State : No
Block SSD Write Disk Cache Change: No
Supported VD Operations
================
Read Policy : Yes
Write Policy : Yes
IO Policy : Yes
Access Policy : Yes
Disk Cache Policy : Yes
Reconstruction : Yes
Deny Locate : No
Deny CC : No
Allow Ctrl Encryption: No
Enable LDBBM : Yes
Support Breakmirror : Yes
Power Savings : Yes
Supported PD Operations
================
Force Online : Yes
Force Offline : Yes
Force Rebuild : Yes
Deny Force Failed : No
Deny Force Good/Bad : No
Deny Missing Replace : No
Deny Clear : No
Deny Locate : No
Support Temperature : Yes
NCQ : No
Disable Copyback : Yes
Enable JBOD : No
Enable Copyback on SMART : No
Enable Copyback to SSD on SMART Error : No
Enable SSD Patrol Read : No
PR Correct Unconfigured Areas : Yes
Enable Spin Down of UnConfigured Drives : No
Disable Spin Down of hot spares : Yes
Spin Down time : 30
T10 Power State : Yes
Error Counters
================
Memory Correctable Errors : 0
Memory Uncorrectable Errors : 0
Cluster Information
================
Cluster Permitted : No
Cluster Active : No
Default Settings
================
Phy Polarity : 0
Phy PolaritySplit : 0
Background Rate : 30
Strip Size : 64kB
Flush Time : 4 seconds
Write Policy : WB
Read Policy : Adaptive
Cache When BBU Bad : Disabled
Cached IO : No
SMART Mode : Mode 6
Alarm Disable : No
Coercion Mode : 128MB
ZCR Config : Unknown
Dirty LED Shows Drive Activity : No
BIOS Continue on Error : 0
Spin Down Mode : None
Allowed Device Type : SAS/SATA Mix
Allow Mix in Enclosure : Yes
Allow HDD SAS/SATA Mix in VD : No
Allow SSD SAS/SATA Mix in VD : No
Allow HDD/SSD Mix in VD : No
Allow SATA in Cluster : No
Max Chained Enclosures : 4
Disable Ctrl-R : No
Enable Web BIOS : No
Direct PD Mapping : Yes
BIOS Enumerate VDs : Yes
Restore Hot Spare on Insertion : No
Expose Enclosure Devices : No
Maintain PD Fail History : No
Disable Puncturing : No
Zero Based Enclosure Enumeration : Yes
PreBoot CLI Enabled : No
LED Show Drive Activity : Yes
Cluster Disable : Yes
SAS Disable : No
Auto Detect BackPlane Enable : SGPIO/i2c SEP
Use FDE Only : Yes
Enable Led Header : No
Delay during POST : 0
EnableCrashDump : No
Disable Online Controller Reset : No
EnableLDBBM : Yes
Un-Certified Hard Disk Drives : Allow
Treat Single span R1E as R10 : Yes
Max LD per array : 16
Power Saving option : Don't spin down unconfigured drives
Don't spin down Hot spares
Don't Auto spin down Configured Drives
Power settings apply to all drives - individual PD/LD power settings cannot be set
Max power savings option is not allowed for LDs. Only T10 power conditions are to be used.
Cached writes are not used for spun down VDs
Can schedule disable power savings at controller level
Default spin down time in minutes: 30
Enable JBOD : No
TTY Log In Flash : Yes
Auto Enhanced Import : No
BreakMirror RAID Support : Yes
Disable Join Mirror : Yes
Enable Shield State : No
Time taken to detect CME : 60s
Exit Code: 0x00
Battery status from megacli -AdpBbuCmd -GetBbuSTatus -aAll
BBU status for Adapter: 0
BatteryType: BBU
Voltage: 3900 mV
Current: 0 mA
Temperature: 45 C
Battery State: Optimal
BBU Firmware Status:
Charging Status : None
Voltage : OK
Temperature : OK
Learn Cycle Requested : No
Learn Cycle Active : No
Learn Cycle Status : OK
Learn Cycle Timeout : No
I2c Errors Detected : No
Battery Pack Missing : No
Battery Replacement required : No
Remaining Capacity Low : No
Periodic Learn Required : No
Transparent Learn : No
No space to cache offload : No
Pack is about to fail & should be replaced : No
Cache Offload premium feature required : No
Module microcode update required : No
BBU GasGauge Status: 0x0228
Relative State of Charge: 99 %
Charger Status: Complete
Remaining Capacity: 387 mAh
Full Charge Capacity: 391 mAh
isSOHGood: Yes
Exit Code: 0x00
Virtual drive info from megacli -LDInfo -Lall -aAll
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :PrimaryStorage
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 6.544 TB
Sector Size : 512
Mirror Data : 6.544 TB
State : Optimal
Strip Size : 64 KB
Number Of Drives per span:8
Span Depth : 2
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Enabled
Encryption Type : None
Default Power Savings Policy: Controller Defined
Current Power Savings Policy: None
Can spin up in 1 minute: Yes
LD has drives that support T10 power conditions: Yes
LD's IO profile supports MAX power savings with cached writes: No
Bad Blocks Exist: No
Is VD Cached: Yes
Cache Cade Type : Read Only
Exit Code: 0x00
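To compare this virtual drive's settings against the other servers without reading the dumps by eye, something like the following could be used. It is only a sketch: parse_megacli_fields and cache_policy_report are hypothetical helper names, and the parser simply assumes the "Key : Value" layout seen in the output above.

```python
def parse_megacli_fields(text):
    """Turn 'Key : Value' lines into a dict; the first occurrence of a key wins."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields.setdefault(key.strip(), value.strip())
    return fields


def cache_policy_report(text):
    """Summarise the cache-related settings of one virtual drive."""
    fields = parse_megacli_fields(text)
    default = fields.get("Default Cache Policy", "")
    current = fields.get("Current Cache Policy", "")
    return {
        # The current policy differs from the default when the controller
        # has fallen back, e.g. to WriteThrough because of a bad BBU.
        "write_back_active": current.startswith("WriteBack"),
        "matches_default": current == default,
        "disk_cache": fields.get("Disk Cache Policy", ""),
    }


sample = """\
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Disk Cache Policy : Enabled
"""
print(cache_policy_report(sample))
```

For the output above this reports write-back as active and matching the default, which is consistent with the battery status shown earlier.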
SMART data from drive 0, smartctl /dev/sda -a -d megaraid,0
=== START OF INFORMATION SECTION ===
Vendor: SEAGATE
Product: DL900MP0136
Revision: KT55
Compliance: SPC-4
User Capacity: 900,185,481,216 bytes [900 GB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate: 15000 rpm
Form Factor: 2.5 inches
Logical Unit id: 0x5000c500b8c86e4f
Serial number: WAG0BWG2
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Tue Apr 14 14:08:12 2020 NZST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Disabled or Not Supported
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 31 C
Drive Trip Temperature: 60 C
Manufactured in week 25 of year 2018
Specified cycle count over device lifetime: 10000
Accumulated start-stop cycles: 13
Specified load-unload count over device lifetime: 300000
Accumulated load-unload cycles: 547
Elements in grown defect list: 0
Vendor (Seagate) cache information
Blocks sent to initiator = 1794649432
Blocks received from initiator = 3529373256
Blocks read from cache and sent to initiator = 120960058
Number of read and write commands whose size <= segment size = 13007345
Number of read and write commands whose size > segment size = 10656
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 12864.37
number of minutes until next internal SMART test = 9
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   224258814        0         0  224258814          0        933.218           0
write:          0        0         0          0          0       4097.456           0
verify:       471        0         0        471          0          0.000           0
Non-medium error count: 371