IRST (Intel Rapid Storage Technology, imsm) reports "ICRC, ABRT" errors: could the cause be software?

2024-7-10

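I have a system with two IRST RAID1 arrays: sda + sdb (2 TB) and sdc + sdd (1 TB), using the device names from Linux.

Each pair of disks was bought in a single order, so the two drives in a pair are the same model and the same age.

The 2 TB RAID holds the operating systems (Windows, Linux) and various data partitions, while the 1 TB RAID holds some non-essential software.

The 1 TB RAID is used only by Windows, while the partitions on the 2 TB RAID are used by both operating systems.

Now I have noticed (via smartd on Linux) that the error count for sdc is increasing: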

smartd[2008]: Device: /dev/sdc [SAT], ATA error count increased from 628 to 651

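This is the only drive whose error count is increasing. Specifically, the disk (HGST HTS541010A9E680) has no read errors, no pending sectors, and no reallocated sectors. The disk also passed an extended (long) self-test.

Looking at the errors more closely, they look like this: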

Device Error Count: 651 (device log contains only the most recent 4 errors)
...
Error 651 [2] occurred at disk power-on lifetime: 4947 hours (206 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  84 -- 51 00 11 00 00 19 0e 07 8f 09 00  Error: ICRC, ABRT at LBA = 0x190e078f = 420349839

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 20 00 28 00 00 19 0e 0e 40 40 00     00:00:57.526  READ FPDMA QUEUED
  60 00 20 00 20 00 00 19 0e 0e 80 40 00     00:00:57.526  READ FPDMA QUEUED
  60 00 20 00 18 00 00 19 0e 0c c0 40 00     00:00:57.526  READ FPDMA QUEUED
  60 00 20 00 10 00 00 19 0e 0d 00 40 00     00:00:57.526  READ FPDMA QUEUED
  60 00 20 00 08 00 00 19 0e 0d 40 40 00     00:00:57.526  READ FPDMA QUEUED

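One of the other logged errors also occurred at LBA 420349839, and the remaining two logged errors have different LBAs. In addition, the command that led to the error is always READ FPDMA QUEUED.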
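To double-check how the register line maps to the reported flags and LBA, here is a minimal Python sketch. It assumes the column layout shown in the smartctl output above (ER, a placeholder, ST, two COUNT bytes, six LBA bytes, DV, DC) and the usual ATA error register bit positions for read commands. The function name `decode_registers` is made up for this illustration.

```python
# Decode the ER and LBA fields of a register line from the smartctl
# extended error log, e.g. "84 -- 51 00 11 00 00 19 0e 07 8f 09 00".
# Bit positions are the ATA error register bits reported for read commands.
ERROR_BITS = {7: "ICRC", 6: "UNC", 4: "IDNF", 2: "ABRT"}


def decode_registers(line):
    """Return (list of error flag names, LBA as int) for one register line."""
    fields = line.split()
    error_reg = int(fields[0], 16)
    # Fields 5..10 are the six LBA bytes, most significant byte first.
    lba = int("".join(fields[5:11]), 16)
    flags = [
        name
        for bit, name in sorted(ERROR_BITS.items(), reverse=True)
        if error_reg & (1 << bit)
    ]
    return flags, lba


if __name__ == "__main__":
    flags, lba = decode_registers("84 -- 51 00 11 00 00 19 0e 07 8f 09 00")
    print(flags, hex(lba), lba)
```

For the sdc line this gives ICRC and ABRT at LBA 0x190e078f = 420349839, which matches what smartctl prints. The error register value 0x84 has bit 7 (interface CRC) and bit 2 (command aborted) set.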

In Linux, the transfer statistics also look fine (the link runs at udma6):

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0009  2            4  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            4  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS

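Even after reading blocks at maximum speed, these counters did not increase. Initially I had suspected a bad or loose cable, or radio interference.

So I wonder (since many files are read from the 1 TB RAID by Windows): could this error be caused by the disk being part of a RAID1, by the Intel chipset (8086:2822 (rev 05)), or by running Windows 10? Also, is there a way to map the LBA in the error message to a file on an NTFS partition on the RAID?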
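For the LBA-to-file question, the arithmetic involved would look roughly like the sketch below. It rests on several assumptions that I have not verified for imsm: that the RAID1 volume data starts at sector 0 of each member disk (`array_offset_lba=0`), that logical sectors are 512 bytes, and that the NTFS cluster size is 4096 bytes. The partition start and length in the example are hypothetical values, not taken from my system.

```python
SECTOR_SIZE = 512  # bytes per logical sector (assumed)


def lba_to_ntfs_cluster(disk_lba, partition_start_lba, partition_sectors,
                        cluster_size=4096, array_offset_lba=0):
    """Map a member-disk LBA to a cluster number inside an NTFS partition.

    Returns None when the LBA lies outside the given partition.
    array_offset_lba is the assumed start of the RAID volume on the disk.
    """
    array_lba = disk_lba - array_offset_lba        # LBA within the RAID volume
    relative = array_lba - partition_start_lba     # sector within the partition
    if relative < 0 or relative >= partition_sectors:
        return None
    return (relative * SECTOR_SIZE) // cluster_size


if __name__ == "__main__":
    # Hypothetical partition: starts at sector 2048, 1953519616 sectors long.
    print(lba_to_ntfs_cluster(420349839, 2048, 1953519616))
```

If the assumptions hold, the resulting cluster number could then be looked up with `ntfscluster -c <cluster> <device>` from ntfsprogs on Linux, or with `fsutil volume querycluster` on Windows. I have not confirmed that either tool gives correct results on an imsm volume.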

The other disk in this RAID has exactly one such error:

SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
Device Error Count: 1
...
Error 1 [0] occurred at disk power-on lifetime: 3163 hours (131 days + 19 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  84 -- 51 00 11 00 00 00 03 72 a7 00 00  Error: ICRC, ABRT at LBA = 0x000372a7 = 225959

  Commands leading to the command that caused the error were:
  CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 40 00 00 00 00 00 03 72 78 40 00     00:00:59.573  READ FPDMA QUEUED
  60 00 20 00 08 00 00 00 03 41 60 40 00     00:00:59.564  READ FPDMA QUEUED
  60 00 80 00 00 00 00 00 03 40 a8 40 00     00:00:59.563  READ FPDMA QUEUED
  60 00 70 00 00 00 00 00 03 1c d0 40 00     00:00:59.562  READ FPDMA QUEUED
  60 00 30 00 00 00 00 00 03 1c 88 40 00     00:00:59.562  READ FPDMA QUEUED
