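I had a 4-disk RAID 10 (set up as 2 logical arrays) configured on one of my servers, and then found that the server had gone down. When I rebooted it, one of the disks was missing.
After I talked to the data center, they replaced the failed disk, since it was completely dead. I expected the RAID card to accept the new disk and rebuild the array automatically, but that did not happen. I checked that automatic failover was turned on (it was) and tried initializing the disk, but still no luck.
I am not sure whether I did something wrong, so I need some advice. What is the procedure, and how can I check whether the problem lies with the RAID array or with the disk? Right now I cannot rebuild the array with either the Adaptec Card Utility or the arcconf tool.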
Before
root@rescue ~ # arcconf GETCONFIG 1 LD
Controllers found: 1
----------------------------------------------------------------------
Logical device information
----------------------------------------------------------------------
Logical device number 0
Logical device name : ESXi
RAID level : 10
Status of logical device : Failed
Size : 1494016 MB
Stripe-unit size : 256 KB
Read-cache mode : Enabled
MaxCache preferred read cache setting : Disabled
MaxCache read cache setting : Disabled
Write-cache mode : Disabled (write-through)
Write-cache setting : Disabled (write-through)
Partitioned : Unknown
Protected by Hot-Spare : No
Bootable : Yes
Failed stripes : No
Power settings : Disabled
--------------------------------------------------------
Logical device segment information
--------------------------------------------------------
Group 0, Segment 0 : Missing
Group 0, Segment 1 : Present (Controller:1,Connector:0,Device:0) 9VS4DAWW
Group 1, Segment 0 : Present (Controller:1,Connector:0,Device:2) 9VS4C646
Group 1, Segment 1 : Present (Controller:1,Connector:0,Device:3) 9VS4C6Z6
Logical device number 1
Logical device name :
RAID level : 10
Status of logical device : Failed
Size : 1362942 MB
Stripe-unit size : 256 KB
Read-cache mode : Enabled
MaxCache preferred read cache setting : Disabled
MaxCache read cache setting : Disabled
Write-cache mode : Disabled (write-through)
Write-cache setting : Disabled (write-through)
Partitioned : Unknown
Protected by Hot-Spare : No
Bootable : No
Failed stripes : No
Power settings : Disabled
--------------------------------------------------------
Logical device segment information
--------------------------------------------------------
Group 0, Segment 0 : Missing
Group 0, Segment 1 : Present (Controller:1,Connector:0,Device:0) 9VS4DAWW
Group 1, Segment 0 : Present (Controller:1,Connector:0,Device:2) 9VS4C646
Group 1, Segment 1 : Present (Controller:1,Connector:0,Device:3) 9VS4C6Z6
root@rescue ~ # arcconf GETCONFIG 1 PD
Controllers found: 1
----------------------------------------------------------------------
Physical Device information
----------------------------------------------------------------------
Device #0
Device is a Hard drive
State : Online
Supported : Yes
Transfer Speed : SATA 3.0 Gb/s
Reported Channel,Device(T:L) : 0,0(0:0)
Reported Location : Connector 0, Device 0
Vendor :
Model : ST31500341AS
Firmware : CC1H
Serial number : 9VS4DAWW
Size : 1430799 MB
Write Cache : Enabled (write-back)
FRU : None
S.M.A.R.T. : No
S.M.A.R.T. warnings : 0
Power State : Full rpm
Supported Power States : Full rpm,Powered off
SSD : No
MaxCache Capable : No
MaxCache Assigned : No
NCQ status : Enabled
Device #1
Device is a Hard drive
State : Online
Supported : Yes
Transfer Speed : SATA 3.0 Gb/s
Reported Channel,Device(T:L) : 0,2(2:0)
Reported Location : Connector 0, Device 2
Vendor :
Model : ST31500341AS
Firmware : CC1H
Serial number : 9VS4C646
Size : 1430799 MB
Write Cache : Enabled (write-back)
FRU : None
S.M.A.R.T. : No
S.M.A.R.T. warnings : 0
Power State : Full rpm
Supported Power States : Full rpm,Powered off
SSD : No
MaxCache Capable : No
MaxCache Assigned : No
NCQ status : Enabled
Device #2
Device is a Hard drive
State : Online
Supported : Yes
Transfer Speed : SATA 3.0 Gb/s
Reported Channel,Device(T:L) : 0,3(3:0)
Reported Location : Connector 0, Device 3
Vendor :
Model : ST31500341AS
Firmware : CC1H
Serial number : 9VS4C6Z6
Size : 1430799 MB
Write Cache : Enabled (write-back)
FRU : None
S.M.A.R.T. : No
S.M.A.R.T. warnings : 0
Power State : Full rpm
Supported Power States : Full rpm,Powered off
SSD : No
MaxCache Capable : No
MaxCache Assigned : No
NCQ status : Enabled
After (fixed; I pasted the wrong output earlier)
root@rescue ~ # arcconf GETCONFIG 1 LD
Controllers found: 1
----------------------------------------------------------------------
Logical device information
----------------------------------------------------------------------
Logical device number 0
Logical device name : ESXi
RAID level : 10
Status of logical device : Failed
Size : 1494016 MB
Stripe-unit size : 256 KB
Read-cache mode : Enabled
MaxCache preferred read cache setting : Disabled
MaxCache read cache setting : Disabled
Write-cache mode : Disabled (write-through)
Write-cache setting : Disabled (write-through)
Partitioned : Unknown
Protected by Hot-Spare : No
Bootable : Yes
Failed stripes : No
Power settings : Disabled
--------------------------------------------------------
Logical device segment information
--------------------------------------------------------
Group 0, Segment 0 : Missing
Group 0, Segment 1 : Present (Controller:1,Connector:0,Device:0) 9VS4DAWW
Group 1, Segment 0 : Present (Controller:1,Connector:0,Device:2) 9VS4C646
Group 1, Segment 1 : Present (Controller:1,Connector:0,Device:3) 9VS4C6Z6
Logical device number 1
Logical device name :
RAID level : 10
Status of logical device : Failed
Size : 1362942 MB
Stripe-unit size : 256 KB
Read-cache mode : Enabled
MaxCache preferred read cache setting : Disabled
MaxCache read cache setting : Disabled
Write-cache mode : Disabled (write-through)
Write-cache setting : Disabled (write-through)
Partitioned : Unknown
Protected by Hot-Spare : No
Bootable : No
Failed stripes : No
Power settings : Disabled
--------------------------------------------------------
Logical device segment information
--------------------------------------------------------
Group 0, Segment 0 : Missing
Group 0, Segment 1 : Present (Controller:1,Connector:0,Device:0) 9VS4DAWW
Group 1, Segment 0 : Present (Controller:1,Connector:0,Device:2) 9VS4C646
Group 1, Segment 1 : Present (Controller:1,Connector:0,Device:3) 9VS4C6Z6
Command completed successfully.
root@rescue ~ # arcconf GETCONFIG 1 PD
Controllers found: 1
----------------------------------------------------------------------
Physical Device information
----------------------------------------------------------------------
Device #0
Device is a Hard drive
State : Online
Supported : Yes
Transfer Speed : SATA 3.0 Gb/s
Reported Channel,Device(T:L) : 0,0(0:0)
Reported Location : Connector 0, Device 0
Vendor :
Model : ST31500341AS
Firmware : CC1H
Serial number : 9VS4DAWW
Size : 1430799 MB
Write Cache : Enabled (write-back)
FRU : None
S.M.A.R.T. : Yes
S.M.A.R.T. warnings : 3
Power State : Full rpm
Supported Power States : Full rpm,Powered off
SSD : No
MaxCache Capable : No
MaxCache Assigned : No
NCQ status : Enabled
Device #1
Device is a Hard drive
State : Ready
Supported : Yes
Transfer Speed : SATA 3.0 Gb/s
Reported Channel,Device(T:L) : 0,1(1:0)
Reported Location : Connector 0, Device 1
Vendor :
Model : SAMSUNG HD154UI
Firmware : 1AG01118
Serial number : S1Y6J90B202833
Size : 1430799 MB
Write Cache : Enabled (write-back)
FRU : None
S.M.A.R.T. : No
S.M.A.R.T. warnings : 0
Power State : Full rpm
Supported Power States : Full rpm,Powered off,Reduced rpm
SSD : No
MaxCache Capable : No
MaxCache Assigned : No
NCQ status : Enabled
Device #2
Device is a Hard drive
State : Online
Supported : Yes
Transfer Speed : SATA 3.0 Gb/s
Reported Channel,Device(T:L) : 0,2(2:0)
Reported Location : Connector 0, Device 2
Vendor :
Model : ST31500341AS
Firmware : CC1H
Serial number : 9VS4C646
Size : 1430799 MB
Write Cache : Enabled (write-back)
FRU : None
S.M.A.R.T. : No
S.M.A.R.T. warnings : 0
Power State : Full rpm
Supported Power States : Full rpm,Powered off
SSD : No
MaxCache Capable : No
MaxCache Assigned : No
NCQ status : Enabled
Device #3
Device is a Hard drive
State : Online
Supported : Yes
Transfer Speed : SATA 3.0 Gb/s
Reported Channel,Device(T:L) : 0,3(3:0)
Reported Location : Connector 0, Device 3
Vendor :
Model : ST31500341AS
Firmware : CC1H
Serial number : 9VS4C6Z6
Size : 1430799 MB
Write Cache : Enabled (write-back)
FRU : None
S.M.A.R.T. : No
S.M.A.R.T. warnings : 0
Power State : Full rpm
Supported Power States : Full rpm,Powered off
SSD : No
MaxCache Capable : No
MaxCache Assigned : No
NCQ status : Enabled
Command completed successfully.
root@rescue ~ #
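Answer 1
I am not particularly familiar with Adaptec, but with most RAID controllers, only a disk designated as a hot spare is used automatically to rebuild the array after one of the active drives fails.
Replacing the failed disk with a new one usually does not trigger a rebuild on its own. That requires administrator input.
A quick look at the manual suggests you need to do something like: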
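To see at a glance which logical devices have a missing member and whether a hot spare covers them, the `arcconf GETCONFIG 1 LD` output can be parsed. The following is only a minimal sketch: the function name `summarize_logical_devices` and the shortened sample text are made up for illustration, and it assumes the output keeps the `key : value` layout shown in the question.

```python
import re

# Shortened sample, copied from the LD output in the question.
SAMPLE_LD_OUTPUT = """\
Logical device number 0
Logical device name : ESXi
RAID level : 10
Status of logical device : Failed
Protected by Hot-Spare : No
Group 0, Segment 0 : Missing
Group 0, Segment 1 : Present (Controller:1,Connector:0,Device:0) 9VS4DAWW
Group 1, Segment 0 : Present (Controller:1,Connector:0,Device:2) 9VS4C646
Group 1, Segment 1 : Present (Controller:1,Connector:0,Device:3) 9VS4C6Z6
"""


def summarize_logical_devices(text):
    """Hypothetical helper: one dict per logical device found in the text."""
    devices = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"Logical device number (\d+)", line)
        if header:
            current = {"number": int(header.group(1)), "status": None,
                       "hot_spare": None, "missing": []}
            devices.append(current)
            continue
        if current is None:
            continue
        # Split on " : " so colons inside values are left alone.
        key, sep, value = line.partition(" : ")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Status of logical device":
            current["status"] = value
        elif key == "Protected by Hot-Spare":
            current["hot_spare"] = (value == "Yes")
        elif key.startswith("Group") and value == "Missing":
            current["missing"].append(key)
    return devices


for dev in summarize_logical_devices(SAMPLE_LD_OUTPUT):
    print(dev)
```

A logical device that reports a missing segment and `Protected by Hot-Spare : No` is one where, per the reasoning above, the controller is unlikely to start rebuilding by itself.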
                ------- Controller #
                | ----- Channel # : from reported location
                | | --- Device # : from reported location
                | | | - set status : RBL for rebuild
                | | | |
HRCONF SETSTATE 1 0 1 RBL
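The channel and device numbers in that command come from the `Reported Channel,Device(T:L)` line of the replacement drive, which shows up as `Ready` instead of `Online` in the `GETCONFIG 1 PD` output. As a rough sketch, they can be pulled out of the output like this. The helper names `find_ready_drives` and `rebuild_command` are invented for this example, and the command string simply repeats the template above; it has not been checked against the arcconf or HRCONF manual, so the script only prints it and does not run it.

```python
import re

# Shortened sample, copied from the "After" PD output in the question.
SAMPLE_PD_OUTPUT = """\
Device #0
State : Online
Reported Channel,Device(T:L) : 0,0(0:0)
Serial number : 9VS4DAWW
Device #1
State : Ready
Reported Channel,Device(T:L) : 0,1(1:0)
Serial number : S1Y6J90B202833
"""


def find_ready_drives(text):
    """Hypothetical helper: drives whose state is Ready (not in any array)."""
    drives = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if re.match(r"Device #\d+$", line):
            current = {}
            drives.append(current)
            continue
        if current is None:
            continue
        # The key itself contains a colon in "(T:L)", so split on " : ".
        key, sep, value = line.partition(" : ")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "State":
            current["state"] = value
        elif key == "Reported Channel,Device(T:L)":
            loc = re.match(r"(\d+),(\d+)", value)
            if loc:
                current["channel"] = int(loc.group(1))
                current["device"] = int(loc.group(2))
        elif key == "Serial number":
            current["serial"] = value
    return [d for d in drives if d.get("state") == "Ready"]


def rebuild_command(controller, drive):
    """Fill in the template from the answer above (unverified syntax)."""
    return f"HRCONF SETSTATE {controller} {drive['channel']} {drive['device']} RBL"


for drive in find_ready_drives(SAMPLE_PD_OUTPUT):
    print(drive["serial"], "->", rebuild_command(1, drive))
```

Printing the command first leaves room to compare the serial number against the disk the data center actually installed before changing any drive state.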