Diagnosing a failing hard disk and choosing a recovery strategy

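After more than 9 years of operation (according to SMART), the 2 TB WD disk in my NAS began failing to start. I took it out and connected it to a PC to see what was wrong.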

The Ubuntu Disks utility could not even run a basic SMART test.

I probably made a mistake by issuing a bare fsck /dev/sdc4 command, because it ended with an error message telling me it could not perform I/O operations.

A system reboot made things worse: the disk is now missing from /dev/. So I tried

echo "0 0 0" >/sys/class/scsi_host/host2/scan

and consulted dmesg. It reads as follows:

[ 3592.252845] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 3592.253154] ata3.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
[ 3602.797173] ata3: link is slow to respond, please be patient (ready=0)
[ 3607.476986] ata3: COMRESET failed (errno=-16)
[ 3612.828731] ata3: link is slow to respond, please be patient (ready=0)
[ 3613.008718] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 3614.919737] ata3.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[ 3614.919743] ata3.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[ 3614.919748] ata3.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[ 3614.964104] ata3.00: ATA-8: WDC WD20EARX-00PASB0, 51.0AB51, max UDMA/133
[ 3614.964107] ata3.00: 3907029168 sectors, multi 0: LBA48 NCQ (depth 32), AA
[ 3614.971180] ata3.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[ 3614.971182] ata3.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[ 3614.971183] ata3.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[ 3614.978286] ata3.00: configured for UDMA/133
[ 3614.978378] scsi 2:0:0:0: Direct-Access     ATA      WDC WD20EARX-00P AB51 PQ: 0 ANSI: 5
[ 3614.978564] sd 2:0:0:0: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[ 3614.978566] sd 2:0:0:0: [sdc] 4096-byte physical blocks
[ 3614.978573] sd 2:0:0:0: [sdc] Write Protect is off
[ 3614.978575] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[ 3614.978587] sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 3614.979412] sd 2:0:0:0: Attached scsi generic sg2 type 0
[ 3616.292596] ata3.00: exception Emask 0x50 SAct 0x400000 SErr 0x4090800 action 0xe frozen
[ 3616.292602] ata3.00: irq_stat 0x00400040, connection status changed
[ 3616.292606] ata3: SError: { HostInt PHYRdyChg 10B8B DevExch }
[ 3616.292610] ata3.00: failed command: READ FPDMA QUEUED
[ 3616.292618] ata3.00: cmd 60/08:b0:00:00:00/00:00:00:00:00/40 tag 22 ncq dma 4096 in
                       res 40/00:b4:00:00:00/00:00:00:00:00/40 Emask 0x50 (ATA bus error)
[ 3616.292621] ata3.00: status: { DRDY }
[ 3616.292626] ata3: hard resetting link
[ 3622.052321] ata3: link is slow to respond, please be patient (ready=0)
[ 3626.312084] ata3: COMRESET failed (errno=-16)
[ 3626.312094] ata3: hard resetting link
[ 3627.024061] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 3632.259814] ata3.00: qc timeout (cmd 0xec)
[ 3632.259825] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x5)
[ 3632.259828] ata3.00: revalidation failed (errno=-5)
[ 3632.259838] ata3: hard resetting link
[ 3634.911683] ata3: SATA link down (SStatus 0 SControl 300)
[ 3634.911690] ata3.00: link offline, clearing class 1 to NONE
[ 3634.913110] ata3: hard resetting link
[ 3635.703642] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 3635.704007] ata3.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
[ 3635.704011] ata3.00: revalidation failed (errno=-5)
[ 3635.704017] ata3.00: disabled
[ 3640.707392] ata3: hard resetting link
[ 3641.025705] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 3641.026010] ata3.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
[ 3646.083079] ata3: hard resetting link
[ 3646.397432] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 3646.397723] ata3.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
[ 3646.397728] ata3: limiting SATA link speed to 3.0 Gbps
[ 3651.458883] ata3: hard resetting link
[ 3651.774094] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[ 3651.774429] ata3.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
[ 3656.834574] ata3: hard resetting link
[ 3657.149507] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[ 3657.149530] sd 2:0:0:0: [sdc] tag#22 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 3657.149534] sd 2:0:0:0: [sdc] tag#22 Sense Key : Illegal Request [current]
[ 3657.149538] sd 2:0:0:0: [sdc] tag#22 Add. Sense: Unaligned write command
[ 3657.149542] sd 2:0:0:0: [sdc] tag#22 CDB: Read(10) 28 00 00 00 00 00 00 00 08 00
[ 3657.149544] print_req_error: 2 callbacks suppressed
[ 3657.149547] blk_update_request: I/O error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 3657.149553] buffer_io_error: 2 callbacks suppressed
[ 3657.149555] Buffer I/O error on dev sdc, logical block 0, async page read
[ 3657.149573] ata3: EH complete
[ 3657.149582] ata3.00: detaching (SCSI 2:0:0:0)
[ 3657.149629] blk_update_request: I/O error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 3657.149635] Buffer I/O error on dev sdc, logical block 0, async page read
[ 3657.149691] blk_update_request: I/O error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 3657.149696] Buffer I/O error on dev sdc, logical block 0, async page read
[ 3657.149711] ldm_validate_partition_table(): Disk read failed.
[ 3657.149737] blk_update_request: I/O error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 3657.149741] Buffer I/O error on dev sdc, logical block 0, async page read
[ 3657.149777] blk_update_request: I/O error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 3657.149780] Buffer I/O error on dev sdc, logical block 0, async page read
[ 3657.149812] blk_update_request: I/O error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 3657.149816] Buffer I/O error on dev sdc, logical block 0, async page read
[ 3657.149847] blk_update_request: I/O error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 3657.149851] Buffer I/O error on dev sdc, logical block 0, async page read
[ 3657.149865] Dev sdc: unable to read RDB block 0
[ 3657.149886] blk_update_request: I/O error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 3657.149889] Buffer I/O error on dev sdc, logical block 0, async page read
[ 3657.149920] blk_update_request: I/O error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 3657.149924] Buffer I/O error on dev sdc, logical block 0, async page read
[ 3657.149964] blk_update_request: I/O error, dev sdc, sector 24 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 3657.149968] Buffer I/O error on dev sdc, logical block 3, async page read
[ 3657.150033]  sdc: unable to read partition table
[ 3657.166693] sd 2:0:0:0: [sdc] Read Capacity(16) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 3657.166695] sd 2:0:0:0: [sdc] Sense not available.
[ 3657.166718] sd 2:0:0:0: [sdc] Read Capacity(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 3657.166719] sd 2:0:0:0: [sdc] Sense not available.
[ 3657.166736] sd 2:0:0:0: [sdc] 0 512-byte logical blocks: (0 B/0 B)
[ 3657.166737] sd 2:0:0:0: [sdc] 4096-byte physical blocks
[ 3657.166769] sd 2:0:0:0: [sdc] Attached SCSI disk
[ 3657.167107] sd 2:0:0:0: [sdc] Stopping disk
[ 3657.167124] sd 2:0:0:0: [sdc] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
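Since the log is long and repetitive, a tally of the recurring libata messages makes the pattern easier to see. The following is only a minimal sketch: the function name `summarize_ata_log`, the pattern labels, and the short `SAMPLE` excerpt are illustrative choices, and the regular expressions match only the message wording that appears in the log above.

```python
import re
from collections import Counter

# Patterns for message types that occur in the dmesg output above.
# The labels on the left are arbitrary names chosen for this sketch.
PATTERNS = {
    "hard_reset": re.compile(r"ata\d+: hard resetting link"),
    "comreset_failed": re.compile(r"ata\d+: COMRESET failed"),
    "identify_failed": re.compile(r"ata\d+\.\d+: failed to IDENTIFY"),
    "link_up": re.compile(r"ata\d+: SATA link up"),
    "link_down": re.compile(r"ata\d+: SATA link down"),
    "io_error": re.compile(r"blk_update_request: I/O error"),
}


def summarize_ata_log(text):
    """Count how many lines of a dmesg excerpt match each pattern."""
    counts = Counter()
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
[ 3616.292626] ata3: hard resetting link
[ 3626.312084] ata3: COMRESET failed (errno=-16)
[ 3626.312094] ata3: hard resetting link
[ 3627.024061] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 3632.259825] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x5)
[ 3634.911683] ata3: SATA link down (SStatus 0 SControl 300)
"""

if __name__ == "__main__":
    for name, count in sorted(summarize_ata_log(SAMPLE).items()):
        print(f"{name}: {count}")
```

Feeding it the full output of dmesg instead of `SAMPLE` would show how many link resets and IDENTIFY failures occur relative to actual block read errors.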

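From this I gather that the disk has bad sectors that prevent it from attaching properly, let alone recognizing the 3 existing partitions (ext3 or ext4, I forget which).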

So now I want to decide on a further strategy.

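First, I would like to see the disk in /dev/ again, but I don't know whether that is possible. Next, I would deal with the bad blocks. Or should I first use dd (or another tool) to copy the data somewhere safe?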
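To make the copy-first option concrete, here is a minimal sketch that only builds and prints the command lines for a two-pass imaging run. It assumes GNU ddrescue as the "other tool" rather than plain dd, and it assumes the disk is visible as /dev/sdc again. The helper name `ddrescue_commands` and the paths under /mnt/rescue are hypothetical, and nothing is executed.

```python
import shlex


def ddrescue_commands(source, image, mapfile, retries=3):
    """Build argument lists for a two-pass GNU ddrescue run (not executed)."""
    # Pass 1: direct disc access (-d) and no scraping phase (-n), so the
    # areas that read cleanly are copied before the bad ones are stressed.
    first_pass = ["ddrescue", "-d", "-n", source, image, mapfile]
    # Pass 2: same image and mapfile, now retrying the remaining bad areas
    # up to `retries` times (-rN).
    second_pass = ["ddrescue", "-d", f"-r{retries}", source, image, mapfile]
    return [first_pass, second_pass]


if __name__ == "__main__":
    # Hypothetical paths: the failing disk, plus an image and mapfile
    # stored on a different, healthy disk with enough free space.
    for cmd in ddrescue_commands(
        "/dev/sdc", "/mnt/rescue/sdc.img", "/mnt/rescue/sdc.map"
    ):
        print(shlex.join(cmd))
```

The mapfile is what lets the second pass resume where the first one stopped. Any fsck would then be run against a copy of the image rather than against the failing disk.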
