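I'm trying to diagnose why my 20-drive RAID6 array stopped working on my Ubuntu install. I noticed that when I try to force it to start, several drives are no longer listed in the RAID at all, and no longer show up as drives on my computer either.
I assumed it was a hardware problem, so I opened the case to fiddle with the SATA cables. I stopped the RAID, unplugged and replugged some of the cables, and then wanted to force the computer to check for new SATA devices. I searched this site and found an answer that told me to run this command to rescan the SATA devices: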
for host in /sys/class/scsi_host/*; do echo "- - -" | sudo tee $host/scan; ls /dev/sd* ; done
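The output of the command shows drives disappearing. (To explain: drives sdb through sdu are part of the RAID6 array, and the command made 4 of them, sdo, sdp, sdq and sdr, disappear.)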
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sdo /dev/sdq /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdp /dev/sdr /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sdo /dev/sdq /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdp /dev/sdr /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sdp /dev/sdr /dev/sdt
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdq /dev/sds /dev/sdu
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
- - -
/dev/sda /dev/sda2 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk /dev/sdm /dev/sds /dev/sdu
/dev/sda1 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sdn /dev/sdt
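Spotting the missing names in the listings above by eye is error-prone, so here is a rough sketch of a helper that compares the device list before and after a rescan. The function name `missing_devices` is made up for illustration, and it assumes the two lists are plain whitespace-separated names such as the output of `ls /dev/sd*`.

```shell
# Print every name that is in the "before" list but not in the "after" list.
# $1 = device list before the rescan, $2 = device list after it
# (both whitespace-separated, e.g. captured from `ls /dev/sd*`).
missing_devices() {
    for dev in $1; do
        # `echo $2` flattens newlines so each name is surrounded by single spaces.
        case " $(echo $2) " in
            *" $dev "*) ;;                  # still present after the rescan
            *) printf '%s\n' "$dev" ;;      # gone after the rescan
        esac
    done
}

# Hypothetical usage around the rescan loop (not run here):
#   before=$(ls /dev/sd*)
#   ... run the rescan command ...
#   after=$(ls /dev/sd*)
#   missing_devices "$before" "$after"
```

With the first and last listings above as input, this should print only the four names that vanished.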
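I physically checked the connections of the drives that disappeared by unplugging SATA cables to see whether more drives would disappear. Through this process of elimination, I found that the 4 drives that disappeared are all connected to an 8-port PCIe-to-SATA card that I have used for at least 4 years without problems.
I rebooted the system several times and also noticed that several drives were missing at boot, but reappeared, with a beep, after I opened the Ubuntu Disks utility. Then, when I issued the command above again, the 4 drives disappeared again.
The RAID6 array is currently in a confused state, with some drives reporting different values for the number of connected drives and the time of the last update. I have forced it to assemble twice; each time it worked and began rebuilding, but then 2 or more drives disappeared from the OS entirely and the rebuild froze.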
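To show how far the members have drifted apart, the per-drive event counters can be pulled out of `mdadm --examine` output. This is only a sketch: the function name `summarize_events` is made up, and it assumes the usual `--examine` layout in which each device section starts with a `/dev/...:` line and contains an `Events : N` line. Whether the members are whole disks or partitions is also an assumption in the usage comment.

```shell
# Read `mdadm --examine` output on stdin and print "<device> <events>" per member.
summarize_events() {
    awk '
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }   # section header, e.g. "/dev/sdb:"
        $1 == "Events" { print dev, $3 }              # line of the form "Events : 12345"
    '
}

# Hypothetical usage, assuming the members are the whole disks sdb..sdu (not run here):
#   sudo mdadm --examine /dev/sd[b-u] | summarize_events
```

Members whose event count lags behind the others would be the ones that dropped out before the last writes to the array.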
Is it the card? Should I replace it?
Here is what shows up in dmesg when I stop the RAID array and then issue the command to rescan for SATA devices:
[ 65.849212] md: md0 stopped.
[ 76.462011] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 76.463189] ata1.00: configured for UDMA/133
[ 76.791618] ahci 0000:06:00.0: FBS is disabled
[ 76.951538] ahci 0000:06:00.0: FBS is enabled
[ 76.952111] ata10.00: SATA link up 3.0 Gbps (SStatus 123 SControl 330)
[ 77.267083] ata10.01: SATA link down (SStatus 610 SControl 330)
[ 77.581891] ata10.02: SATA link down (SStatus 610 SControl 330)
[ 77.894936] ata10.03: SATA link down (SStatus 610 SControl 330)
[ 78.211968] ata10.04: SATA link down (SStatus 610 SControl 330)
[ 78.215129] ata10.00: configured for UDMA/133
[ 82.816960] ata10.01: SATA link down (SStatus 610 SControl 330)
[ 83.128896] ata10.02: SATA link down (SStatus 610 SControl 330)
[ 83.440843] ata10.03: SATA link down (SStatus 610 SControl 330)
[ 83.752802] ata10.04: SATA link down (SStatus 610 SControl 330)
[ 88.192093] ata10.01: SATA link down (SStatus 610 SControl 330)
[ 88.504051] ata10.02: SATA link down (SStatus 610 SControl 330)
[ 88.815986] ata10.03: SATA link down (SStatus 610 SControl 330)
[ 89.127946] ata10.04: SATA link down (SStatus 610 SControl 330)
[ 89.127970] ata10.01: disabled
[ 89.127988] ata10.02: disabled
[ 89.128003] ata10.03: disabled
[ 89.128018] ata10.04: disabled
[ 89.128332] ata10.01: detaching (SCSI 10:1:0:0)
[ 89.129036] sd 10:1:0:0: [sdo] Synchronizing SCSI cache
[ 89.129059] sd 10:1:0:0: [sdo] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 89.129060] sd 10:1:0:0: [sdo] Stopping disk
[ 89.129066] sd 10:1:0:0: [sdo] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 89.185725] ata10.02: detaching (SCSI 10:2:0:0)
[ 89.186427] sd 10:2:0:0: [sdp] Synchronizing SCSI cache
[ 89.186454] sd 10:2:0:0: [sdp] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 89.186455] sd 10:2:0:0: [sdp] Stopping disk
[ 89.186461] sd 10:2:0:0: [sdp] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 89.253718] ata10.03: detaching (SCSI 10:3:0:0)
[ 89.254504] sd 10:3:0:0: [sdq] Synchronizing SCSI cache
[ 89.254528] sd 10:3:0:0: [sdq] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 89.254529] sd 10:3:0:0: [sdq] Stopping disk
[ 89.254535] sd 10:3:0:0: [sdq] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 89.317679] ata10.04: detaching (SCSI 10:4:0:0)
[ 89.318503] sd 10:4:0:0: [sdr] Synchronizing SCSI cache
[ 89.318527] sd 10:4:0:0: [sdr] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 89.318528] sd 10:4:0:0: [sdr] Stopping disk
[ 89.318534] sd 10:4:0:0: [sdr] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 89.460199] ata11: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 89.462818] ata11.00: configured for UDMA/133
[ 89.796131] ata12: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 89.798858] ata12.00: configured for UDMA/133
[ 90.132065] ata13: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 90.133201] ata13.00: NCQ Send/Recv Log not supported
[ 90.134221] ata13.00: NCQ Send/Recv Log not supported
[ 90.134227] ata13.00: configured for UDMA/133
[ 90.472068] ata14: SATA link down (SStatus 0 SControl 300)
[ 90.808247] ata15: SATA link down (SStatus 0 SControl 300)
[ 91.147895] ata16: SATA link down (SStatus 0 SControl 300)
[ 91.484079] ata17: SATA link down (SStatus 0 SControl 300)
[ 91.824122] ata18: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 91.824750] ata18.00: configured for UDMA/66
[ 92.155995] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 92.157411] ata2.00: configured for UDMA/133
[ 92.491879] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 92.494534] ata3.00: configured for UDMA/133
[ 92.827218] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 92.828045] ata4.00: supports DRM functions and may not be fully accessible
[ 92.830499] ata4.00: supports DRM functions and may not be fully accessible
[ 92.832149] ata4.00: configured for UDMA/133
[ 93.847192] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 93.849644] ata5.00: configured for UDMA/133
[ 94.175633] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 94.177221] ata6.00: configured for UDMA/133
[ 94.664704] ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 94.667911] ata7.00: configured for UDMA/133
[ 95.152651] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 96.732470] ata8.00: configured for UDMA/133
[ 97.236448] ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 97.237568] ata9.00: configured for UDMA/133
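To condense the log above, a filter like the following could count the "SATA link down" messages per ATA port or link. It is a sketch only: the function name `link_down_counts` is made up, and it assumes kernel messages in the same format as the excerpt above.

```shell
# Read dmesg text on stdin and print "<port> <count>" for each port or link
# that logged "SATA link down", sorted by port name.
link_down_counts() {
    awk '
        /SATA link down/ && match($0, /ata[0-9]+(\.[0-9]+)?:/) {
            port = substr($0, RSTART, RLENGTH - 1)   # drop the trailing colon
            count[port]++
        }
        END { for (p in count) print p, count[p] }
    ' | LC_ALL=C sort
}

# Hypothetical usage (not run here):
#   sudo dmesg | link_down_counts
```

On the excerpt above, the links ata10.01 through ata10.04 would each be counted three times and ata14 through ata17 once each.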