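We have a storage machine (file server) with an LSI MegaRAID 9260-16i RAID controller. Fifteen drives are configured as RAID5, plus one additional drive kept as a replacement (listed as a dedicated hot spare in the -LDInfo output below). Each drive is a 4 TB SATA3 disk. We recently found the same "Patrol Read" error in the event log for three of these drives: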
Code: 0x0000005f
Class: 3
Locale: 0x02
Event Description: Patrol Read found an uncorrectable medium error on PD 1b(e0xf5/s15) at e0377d38
Event Data:
Device ID: 27
Enclosure Index: 245
Slot Number: 15
LBA: 3761732920
Code: 0x0000005f
Class: 3
Locale: 0x02
Event Description: Patrol Read found an uncorrectable medium error on PD 11(e0xf5/s5) at e0377d38
Event Data:
Device ID: 17
Enclosure Index: 245
Slot Number: 5
LBA: 3761732920
Code: 0x0000005f
Class: 3
Locale: 0x02
Event Description: Patrol Read found an uncorrectable medium error on PD 15(e0xf5/s12) at e0377d38
Event Data:
Device ID: 21
Enclosure Index: 245
Slot Number: 12
LBA: 3761732920
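Strangely, the uncorrectable medium error appears at the same address, e0377d38, on all three drives: (e0xf5/s15), (e0xf5/s5), and (e0xf5/s12). As a result, the RAID controller automatically restarts background initialization again and again. Judging from the event log, the controller seems to try to repair the error but always fails, so I have aborted background initialization for now.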
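To double-check that the three events really point at the same location, the event descriptions can be grouped by address. The following is only a minimal sketch: the regular expression and the `group_by_lba` helper are my own illustration (not part of MegaCli), and it assumes the "PD", enclosure, and address fields in the description are hexadecimal, which matches the decimal "Device ID", "Enclosure Index", and "LBA" values in the event data above.

```python
import re
from collections import defaultdict

# Event descriptions copied verbatim from the log above.
EVENT_LOG = """\
Patrol Read found an uncorrectable medium error on PD 1b(e0xf5/s15) at e0377d38
Patrol Read found an uncorrectable medium error on PD 11(e0xf5/s5) at e0377d38
Patrol Read found an uncorrectable medium error on PD 15(e0xf5/s12) at e0377d38
"""

# Hypothetical pattern for this one message type; other event types are ignored.
PATTERN = re.compile(
    r"on PD (?P<pd>[0-9a-f]+)"
    r"\(e0x(?P<encl>[0-9a-f]+)/s(?P<slot>\d+)\)"
    r" at (?P<lba>[0-9a-f]+)"
)


def group_by_lba(log_text):
    """Map each decimal LBA to the (device_id, enclosure, slot) tuples reporting it."""
    groups = defaultdict(list)
    for match in PATTERN.finditer(log_text):
        lba = int(match.group("lba"), 16)          # address is printed in hex
        device_id = int(match.group("pd"), 16)     # PD id is printed in hex
        enclosure = int(match.group("encl"), 16)   # enclosure index is printed in hex
        slot = int(match.group("slot"))            # slot is printed in decimal
        groups[lba].append((device_id, enclosure, slot))
    return dict(groups)


if __name__ == "__main__":
    for lba, drives in group_by_lba(EVENT_LOG).items():
        print(f"LBA {lba} (0x{lba:x}) reported by: {drives}")
```

For the three events above this yields a single group, which confirms that the hexadecimal address e0377d38 and the decimal LBA 3761732920 are the same location on each drive.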
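We are not sure why this is happening. About three weeks ago we replaced two of the three drives, (e0xf5/s5) and (e0xf5/s12), because the original drives had failed. The problem seems to have appeared only after that replacement.
Can anyone suggest how to resolve this? What are the consequences if it cannot be repaired?
Finally, the -AdpAllInfo, -LDInfo, and -PDList output for these three drives is attached below for reference. Thank you very much for your help.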
泰和长荣
-AdpAllInfo
Adapter #0
==============================================================================
Versions
================
Product Name : LSI MegaRAID SAS 9260-16i
Serial No : SV21116679
FW Package Build: 12.9.0-0038
Mfg. Data
================
Mfg. Date : 03/17/12
Rework Date : 00/00/00
Revision No : 20C
Battery FRU : N/A
Image Versions in Flash:
================
BIOS Version : 3.18.00_4.09.05.00_0x0416A000
FW Version : 2.90.03-0933
Preboot CLI Version: 04.04-010:#%00008
WebBIOS Version : 6.0-18-e_13-Rel
NVDATA Version : 2.06.03-0010
Boot Block Version : 2.02.00.00-0000
BOOT Version : 01.250.04.219
Pending Images in Flash
================
None
PCI Info
================
Controller Id : 0000
Vendor Id : 1000
Device Id : 0079
SubVendorId : 1000
SubDeviceId : 9276
Host Interface : PCIE
Number of Frontend Port: 0
Device Interface : PCIE
Number of Backend Port: 8
Port : Address
0 500062b20037c5ff
1 0000000000000000
2 0000000000000000
3 0000000000000000
4 0000000000000000
5 0000000000000000
6 0000000000000000
7 0000000000000000
HW Configuration
================
SAS Address : 500062b20037c5c0
BBU : Absent
Alarm : Present
NVRAM : Present
Serial Debugger : Present
Memory : Present
Flash : Present
Memory Size : 512MB
TPM : Absent
On board Expander: Present
Upgrade Key : Absent
Temperature sensor for ROC : Absent
Temperature sensor for controller : Absent
On board Expander FW version : 25.05.04.00
Settings
================
Current Time : 17:15:57 3/3, 2021
Predictive Fail Poll Interval : 300sec
Interrupt Throttle Active Count : 16
Interrupt Throttle Completion : 50us
Rebuild Rate : 30%
PR Rate : 30%
BGI Rate : 30%
Check Consistency Rate : 30%
Reconstruction Rate : 30%
Cache Flush Interval : 4s
Max Drives to Spinup at One Time : 24
Delay Among Spinup Groups : 2s
Physical Drive Coercion Mode : Disabled
Cluster Mode : Disabled
Alarm : Disabled
Auto Rebuild : Enabled
Battery Warning : Disabled
Ecc Bucket Size : 15
Ecc Bucket Leak Rate : 1440 Minutes
Restore HotSpare on Insertion : Disabled
Expose Enclosure Devices : Enabled
Maintain PD Fail History : Enabled
Host Request Reordering : Enabled
Auto Detect BackPlane Enabled : SGPIO/i2c SEP
Load Balance Mode : Auto
Use FDE Only : No
Security Key Assigned : No
Security Key Failed : No
Security Key Not Backedup : No
Default LD PowerSave Policy : Controller Defined
Maximum number of direct attached drives to spin up in 1 min : 0
Auto Enhanced Import : No
Any Offline VD Cache Preserved : No
Allow Boot with Preserved Cache : No
Disable Online Controller Reset : No
PFK in NVRAM : No
Use disk activity for locate : No
POST delay : 90 seconds
Capabilities
================
RAID Level Supported : RAID0, RAID1, RAID5, RAID6, RAID00, RAID10, RAID50, RAID60, PRL 11, PRL 11 with spanning, SRL 3 supported, PRL11-RLQ0 DDF layout with no span, PRL11-RLQ0 DDF layout with span
Supported Drives : SAS, SATA
Allowed Mixing:
Mix in Enclosure Allowed
Mix of SAS/SATA of HDD type in VD Allowed
Status
================
ECC Bucket Count : 0
Limitations
================
Max Arms Per VD : 32
Max Spans Per VD : 8
Max Arrays : 128
Max Number of VDs : 64
Max Parallel Commands : 1008
Max SGE Count : 60
Max Data Transfer Size : 8192 sectors
Max Strips PerIO : 42
Max LD per array : 16
Min Strip Size : 8 KB
Max Strip Size : 1.0 MB
Max Configurable CacheCade Size: 0 GB
Current Size of CacheCade : 0 GB
Current Size of FW Cache : 431 MB
Device Present
================
Virtual Drives : 1
Degraded : 0
Offline : 0
Physical Devices : 17
Disks : 16
Critical Disks : 0
Failed Disks : 0
Supported Adapter Operations
================
Rebuild Rate : Yes
CC Rate : Yes
BGI Rate : Yes
Reconstruct Rate : Yes
Patrol Read Rate : Yes
Alarm Control : Yes
Cluster Support : No
BBU : Yes
Spanning : Yes
Dedicated Hot Spare : Yes
Revertible Hot Spares : Yes
Foreign Config Import : Yes
Self Diagnostic : Yes
Allow Mixed Redundancy on Array : No
Global Hot Spares : Yes
Deny SCSI Passthrough : No
Deny SMP Passthrough : No
Deny STP Passthrough : No
Support Security : No
Snapshot Enabled : No
Support the OCE without adding drives : Yes
Support PFK : No
Support PI : No
Support Boot Time PFK Change : No
Disable Online PFK Change : No
Support Shield State : No
Block SSD Write Disk Cache Change: No
Supported VD Operations
================
Read Policy : Yes
Write Policy : Yes
IO Policy : Yes
Access Policy : Yes
Disk Cache Policy : Yes
Reconstruction : Yes
Deny Locate : No
Deny CC : No
Allow Ctrl Encryption: No
Enable LDBBM : No
Support Breakmirror : No
Power Savings : No
Supported PD Operations
================
Force Online : Yes
Force Offline : Yes
Force Rebuild : Yes
Deny Force Failed : No
Deny Force Good/Bad : No
Deny Missing Replace : No
Deny Clear : No
Deny Locate : No
Support Temperature : No
Disable Copyback : No
Enable JBOD : No
Enable Copyback on SMART : No
Enable Copyback to SSD on SMART Error : Yes
Enable SSD Patrol Read : No
PR Correct Unconfigured Areas : Yes
Enable Spin Down of UnConfigured Drives : Yes
Disable Spin Down of hot spares : Yes
Spin Down time : 30
T10 Power State : No
Error Counters
================
Memory Correctable Errors : 0
Memory Uncorrectable Errors : 0
Cluster Information
================
Cluster Permitted : No
Cluster Active : No
Default Settings
================
Phy Polarity : 0
Phy PolaritySplit : 0
Background Rate : 30
Strip Size : 64kB
Flush Time : 4 seconds
Write Policy : WB
Read Policy : Adaptive
Cache When BBU Bad : Disabled
Cached IO : No
SMART Mode : Mode 6
Alarm Disable : Yes
Coercion Mode : None
ZCR Config : Unknown
Dirty LED Shows Drive Activity : No
BIOS Continue on Error : No
Spin Down Mode : None
Allowed Device Type : SAS/SATA Mix
Allow Mix in Enclosure : Yes
Allow HDD SAS/SATA Mix in VD : Yes
Allow SSD SAS/SATA Mix in VD : No
Allow HDD/SSD Mix in VD : No
Allow SATA in Cluster : No
Max Chained Enclosures : 16
Disable Ctrl-R : Yes
Enable Web BIOS : Yes
Direct PD Mapping : No
BIOS Enumerate VDs : Yes
Restore Hot Spare on Insertion : No
Expose Enclosure Devices : Yes
Maintain PD Fail History : Yes
Disable Puncturing : No
Zero Based Enclosure Enumeration : No
PreBoot CLI Enabled : Yes
LED Show Drive Activity : Yes
Cluster Disable : Yes
SAS Disable : No
Auto Detect BackPlane Enable : SGPIO/i2c SEP
Use FDE Only : No
Enable Led Header : No
Delay during POST : 0
EnableCrashDump : No
Disable Online Controller Reset : No
EnableLDBBM : No
Un-Certified Hard Disk Drives : Allow
Treat Single span R1E as R10 : No
Max LD per array : 16
Power Saving option : All power saving options are enabled
Default spin down time in minutes: 30
Enable JBOD : No
TTY Log In Flash : No
Auto Enhanced Import : No
BreakMirror RAID Support : No
Disable Join Mirror : No
Enable Shield State : No
Time taken to detect CME : 60s
Exit Code: 0x00
-LDInfo
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-5, Secondary-0, RAID Level Qualifier-3
Size : 50.934 TB
Parity Size : 3.637 TB
State : Optimal
Strip Size : 64 KB
Number Of Drives : 15
Span Depth : 1
Default Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy : Disk's Default
Encryption Type : None
Is VD Cached: No
Number of Dedicated Hot Spares: 1
0 : EnclId - 245 SlotId - 15
Exit Code: 0x00
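As a rough aid for reading the reported address against the "Strip Size : 64 KB" value above, the arithmetic below converts the LBA into a strip index and an offset within that strip. This is a sketch under stated assumptions, not a description of the controller's actual on-disk layout: it assumes 512-byte sectors and that the reported LBA counts from the start of each drive with the same data offset on every member. The `locate` helper is hypothetical.

```python
SECTOR_BYTES = 512        # assumption: 512-byte logical sectors
STRIP_BYTES = 64 * 1024   # "Strip Size : 64 KB" from the -LDInfo output
LBA = 0xE0377D38          # address reported in the Patrol Read events


def locate(lba, strip_bytes=STRIP_BYTES, sector_bytes=SECTOR_BYTES):
    """Return (strip_index, sector_offset_within_strip) for a per-drive LBA."""
    sectors_per_strip = strip_bytes // sector_bytes
    return lba // sectors_per_strip, lba % sectors_per_strip


if __name__ == "__main__":
    strip_index, offset = locate(LBA)
    print(f"strip index on the drive: {strip_index}")
    print(f"sector offset inside the strip: {offset}")
```

Under these assumptions, an identical per-drive LBA gives an identical strip index on each drive, so the reported sectors on the array members would sit in the same stripe row.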
-PDList
Enclosure Device ID: 245
Slot Number: 5
Drive's postion: DiskGroup: 0, Span: 0, Arm: 5
Enclosure position: 0
Device Id: 17
WWN: 5000CCA25DCF7521
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 3.638 TB [0x1d1c0beb0 Sectors]
Non Coerced Size: 3.637 TB [0x1d1b0beb0 Sectors]
Coerced Size: 3.637 TB [0x1d1b00000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: 1M02
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x500062b20037c5cd
Connected Port Number: 0(path0)
Inquiry Data: K4H3047B WDC WD4002FYYZ-01B7CB0 01.01M02
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Drive's NCQ setting : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Enclosure Device ID: 245
Slot Number: 12
Drive's postion: DiskGroup: 0, Span: 0, Arm: 12
Enclosure position: 0
Device Id: 21
WWN: 5000CCA244CAFCD1
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 3.638 TB [0x1d1c0beb0 Sectors]
Non Coerced Size: 3.637 TB [0x1d1b0beb0 Sectors]
Coerced Size: 3.637 TB [0x1d1b00000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: T907
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x500062b20037c5d8
Connected Port Number: 0(path0)
Inquiry Data: N8GT59EY HGST HUS726040ALE610 APGNT907
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Drive's NCQ setting : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
Enclosure Device ID: 245
Slot Number: 15
Enclosure position: 0
Device Id: 27
WWN: 5000CCA25DCE7105
Sequence Number: 5
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Hotspare Information:
Type: Dedicated, is revertible
Array #: 0
Raw Size: 3.638 TB [0x1d1c0beb0 Sectors]
Non Coerced Size: 3.637 TB [0x1d1b0beb0 Sectors]
Coerced Size: 3.637 TB [0x1d1b00000 Sectors]
Firmware state: Hotspare, Spun Up
Device Firmware Level: 1M02
Shield Counter: 0
Successful diagnostics completion on : N/A
SAS Address(0): 0x500062b20037c5db
Connected Port Number: 0(path0)
Inquiry Data: K4H0SV7B WDC WD4002FYYZ-01B7CB0 01.01M02
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature : N/A
PI Eligibility: No
Drive is formatted for PI information: No
PI: No PI
Drive's write cache : Disabled
Drive's NCQ setting : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No
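For comparing the three drives side by side, the -PDList text can be reduced to a few fields per drive. The sketch below is a hypothetical helper, not a MegaCli feature: it assumes every record starts with an "Enclosure Device ID" line and that fields follow the "Key: Value" form seen above. The sample is an excerpt of the listing, with the other lines omitted for brevity.

```python
# Excerpt of the -PDList output above (most fields omitted).
SAMPLE = """\
Enclosure Device ID: 245
Slot Number: 5
Device Id: 17
Media Error Count: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 245
Slot Number: 12
Device Id: 21
Media Error Count: 0
Firmware state: Online, Spun Up
Enclosure Device ID: 245
Slot Number: 15
Device Id: 27
Media Error Count: 0
Firmware state: Hotspare, Spun Up
"""


def parse_pdlist(text):
    """Split PDList text into one dict of string fields per physical drive."""
    records = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: Value"
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID":
            records.append({})  # assumed to mark the start of a new drive record
        if records:
            records[-1][key] = value
    return records


if __name__ == "__main__":
    for drive in parse_pdlist(SAMPLE):
        print(
            f"slot {drive['Slot Number']:>2}  "
            f"device {drive['Device Id']}  "
            f"media errors {drive['Media Error Count']}  "
            f"state {drive['Firmware state']}"
        )
```

Run on the listing above, this shows that all three drives report a Media Error Count of 0 despite the Patrol Read events, and that slot 15 is in the Hotspare state while slots 5 and 12 are Online.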