How can I determine which drive is failing?

How can I correlate this information with the physical drive that is failing? This is a Debian kernel.

Nov 21 18:06:00 IHPAC kernel: [594026.608042] ata5.00: status: { DRDY }
Nov 21 18:06:00 IHPAC kernel: [594026.787427] ata5.00: failed command: WRITE FPDMA QUEUED
Nov 21 18:06:00 IHPAC kernel: [594026.966505] ata5.00: cmd 61/00:e8:fb:b6:59/04:00:a2:00:00/40 tag 29 ncq 524288 out
Nov 21 18:06:00 IHPAC kernel: [594026.966508]          res 40/00:48:03:ef:59/00:00:a2:00:00/40 Emask 0x50 (ATA bus error)


IHPAC:~$ dmesg | grep ata5
[    6.291403] ata5: SATA max UDMA/133 abar m1024@0xfaffe400 port 0xfaffe600 irq 22
[    6.840145] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    6.840829] ata5.00: ATA-8: ST3000DM001-9YN166, CC9C, max UDMA/133
[    6.840832] ata5.00: 5860533168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[    6.841483] ata5.00: configured for UDMA/133
[59669.062886] ata5: exception Emask 0x10 SAct 0x0 SErr 0x90202 action 0xe frozen
[59669.066958] ata5: irq_stat 0x00400000, PHY RDY changed
[59669.069852] ata5: SError: { RecovComm Persist PHYRdyChg 10B8B }
[59669.073247] ata5: hard resetting link
[59675.560102] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[59675.561576] ata5.00: configured for UDMA/133
[59675.561589] ata5: EH complete
...
[421238.151794] ata5: exception Emask 0x10 SAct 0x0 SErr 0x90202 action 0xe frozen   \
[421238.155912] ata5: irq_stat 0x00400000, PHY RDY changed                            \
[421238.158854] ata5: SError: { RecovComm Persist PHYRdyChg 10B8B }                    |
[421238.162302] ata5: hard resetting link                                              | Repeats 5 times
[421244.650101] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)                 |
[421244.651513] ata5.00: configured for UDMA/133                                      /
[421244.651525] ata5: EH complete                                                    /
...
[593676.000793] ata5.00: exception Emask 0x50 SAct 0x7fffffff SErr 0x90a02 action 0xe frozen
[593676.130479] ata5.00: irq_stat 0x00400000, PHY RDY changed
[593676.259877] ata5: SError: { RecovComm Persist HostInt PHYRdyChg 10B8B }
[593676.388864] ata5.00: failed command: WRITE FPDMA QUEUED                             \
[593676.513825] ata5.00: cmd 61/e0:00:ab:ac:30/01:00:9d:00:00/40 tag 0 ncq 245760 out    | Repeats MANY times
[593676.750610] ata5.00: status: { DRDY }                                               /
...
[593697.436610] ata5: hard resetting link
[593698.380128] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[593698.382682] ata5.00: configured for UDMA/133
[593698.382883] ata5: EH complete
[594005.248408] ata5.00: exception Emask 0x50 SAct 0x7fffffff SErr 0x90a02 action 0xe frozen
[594005.429802] ata5.00: irq_stat 0x00400000, PHY RDY changed
[594005.610614] ata5: SError: { RecovComm Persist HostInt PHYRdyChg 10B8B }
[594005.791306] ata5.00: failed command: WRITE FPDMA QUEUED                            \
[594005.972202] ata5.00: cmd 61/00:00:fb:8a:59/04:00:a2:00:00/40 tag 0 ncq 524288 out   | Repeats MANY times
[594006.337349] ata5.00: status: { DRDY }                                              /
...
[594028.228309] ata5: hard resetting link
[594029.170095] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[594029.173073] ata5.00: configured for UDMA/133
[594029.173367] ata5: EH complete

Answer 1

I searched online for you and found this script (it was the second answer on that page):

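# Map each sdX block device to its ataN.MM name and drive serial number.
# Run as root: hdparm -I needs raw access to the device.
for x in /sys/block/sd*
do
    dev=$(basename "$x")
    # The sysfs entry is a symlink; its target path contains hostN and targetN:C:I.
    host=$(ls -l "$x" | grep -Eo "host[0-9]+")
    target=$(ls -l "$x" | grep -Eo "target[0-9:]*")
    # unique_id of the SCSI host is the ATA port number (the N in ataN).
    a=$(cat "/sys/class/scsi_host/$host/unique_id")
    a2=$(echo "$target" | grep -Eo "[0-9]:[0-9]$" | sed 's/://')
    serial=$(hdparm -I "/dev/$dev" | grep "Serial Number" | sed 's/^[ \t]*//')
    echo -e "$dev \t ata$a.$a2 \t $serial"
done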
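As a rough alternative sketch that avoids hdparm: on many kernels the resolved sysfs path of a block device contains an ataN component, so the port can be read straight from the path. This is an assumption about the kernel in use, not something the question confirms; the helper name `ata_port_from_path` and the sample path are hypothetical and only illustrate the shape of the data.

```shell
# Extract the first ataN component from a resolved sysfs device path.
# Assumption: the kernel exposes the ATA port as a directory named ataN
# in the device path; older kernels may not.
ata_port_from_path() {
    printf '%s\n' "$1" | grep -Eo '/ata[0-9]+/' | head -n 1 | tr -d '/'
}

# Hypothetical path, shaped like the output of: readlink -f /sys/block/sda
path='/sys/devices/pci0000:00/0000:00:1f.2/ata5/host4/target4:0:0/4:0:0:0/block/sda'
ata_port_from_path "$path"
```

On a live system the same helper could be fed real paths, for example `ata_port_from_path "$(readlink -f /sys/block/sda)"`. If it prints nothing, the path has no ataN component and the unique_id approach from the script above is the one to use.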

Answer 2

smartctl will show you the serial numbers of the existing drives. dmesg should also contain the serial numbers of the other disks.

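If you are using only a single SATA controller (as opposed to an add-in SATA card plus the onboard controller), then the ata* number will usually map to the port number on that controller.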

The complete dmesg output would be best, but I think it is connector 5 on the motherboard.
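To narrow this down from the logs, the identification line in the question's dmesg output (`ata5.00: ATA-8: ST3000DM001-9YN166, CC9C, max UDMA/133`) gives the model and firmware revision of the drive on ata5. A minimal sketch for pulling those lines out of dmesg follows; the function name `ata_models` is made up for illustration, and it assumes the kernel prints identification lines in exactly the format shown in the question.

```shell
# Print "ataN.MM <model>" for every ATA identification line read from stdin.
# Assumed input format (taken from the dmesg output in the question):
#   [    6.840829] ata5.00: ATA-8: ST3000DM001-9YN166, CC9C, max UDMA/133
ata_models() {
    sed -n -E 's/^.*(ata[0-9]+\.[0-9]+): ATA-[0-9]+: ([^,]+), .*$/\1 \2/p'
}

# Typical use on a live system:
#   dmesg | ata_models
```

The model printed for ata5.00 can then be compared with the "Device Model" and "Serial Number" fields from `smartctl -i /dev/sdX` for each disk. If several disks share the same model, the serial number printed on the drive label is what identifies the physical unit.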
