Buffer I/O error on device (SATA drive)

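I keep getting these messages in the kernel log of one of my servers (the one that handles file operations). Does anyone know how serious these errors are? I can't use smartmontools directly because the disks sit behind a 3ware card, which has its own tool (the tw_cli utility, which is very limited).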

[2522065.275739] sd 0:0:1:0: [sdg] CDB: 
[2522065.275741] Read(10): 28 00 2e 90 97 f8 00 00 08 00
[2522065.275750] end_request: I/O error, dev sdg, sector 781228024
[2522065.281091] Buffer I/O error on device sdg, logical block 97653503
[2522065.287157] sd 0:0:1:0: [sdg] Device not ready
[2522065.287163] sd 0:0:1:0: [sdg]  
[2522065.287166] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[2522065.287168] sd 0:0:1:0: [sdg]  
[2522065.287170] Sense Key : Not Ready [current] 
[2522065.287174] sd 0:0:1:0: [sdg]  
[2522065.287176] Add. Sense: Logical unit not ready, cause not reportable
[2522065.287179] sd 0:0:1:0: [sdg] CDB: 
[2522065.287181] Read(10): 28 00 00 00 00 00 00 00 20 00
[2522065.287190] end_request: I/O error, dev sdg, sector 0
[2522065.291147] Buffer I/O error on device sdg, logical block 0
[2522065.291147] Buffer I/O error on device sdg, logical block 1
[2522065.291147] Buffer I/O error on device sdg, logical block 2
[2522065.308465] sd 0:0:1:0: [sdg] Device not ready
[2522065.308465] sd 0:0:1:0: [sdg]  
[2522065.308465] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[2522065.308465] sd 0:0:1:0: [sdg]  
[2522065.308465] Sense Key : Not Ready [current] 
[2522065.308465] sd 0:0:1:0: [sdg]  
[2522065.308465] Add. Sense: Logical unit not ready, cause not reportable
[2522065.308465] sd 0:0:1:0: [sdg] CDB: 
[2522065.308465] Read(10): 28 00 00 00 00 00 00 00 08 00
[2522065.308465] end_request: I/O error, dev sdg, sector 0

Thanks!

Answer 1

You can read the SMART values, for example:

 smartctl -a -d 3ware,2 /dev/twe0

Quoting the smartctl man page:

Under Linux and FreeBSD, to look at ATA disks behind 3ware SCSI RAID controllers, use syntax such as:
          smartctl -a -d 3ware,2 /dev/sda
          smartctl -a -d 3ware,0 /dev/twe0
          smartctl -a -d 3ware,1 /dev/twa0
          where in the argument 3ware,N, the integer N is the disk number (3ware ´port´) within the 3ware ATA RAID controller.  The allowed values of N are from 0  to  31  inclusive.   The  first  two
          forms,  which  refer to devices /dev/sda-z and /dev/twe0-15, may be used with 3ware series 6000, 7000, and 8000 series controllers that use the 3w-xxxx driver.  Note that the /dev/sda-z form
          is deprecated starting with the Linux 2.6 kernel series and may not be supported by the Linux kernel in the near future. The final form, which refers to devices /dev/twa0-15,  must  be  used
          with 3ware 9000 series controllers, which use the 3w-9xxx driver.

          Note  that  if the special character device nodes /dev/twa? and /dev/twe? do not exist, or exist with the incorrect major or minor numbers, smartctl will recreate them on the fly.  Typically
          /dev/twa0 refers to the first 9000-series controller, /dev/twa1 refers to the second 9000 series controller, and so on. Likewise /dev/twe0 refers to  the  first  6/7/8000-series  controller,
          /dev/twe1 refers to the second 6/7/8000 series controller, and so on.
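
The device-node rules in the quoted passage can be sketched as a small helper. This is only an illustration under assumptions: the function name `smartctl_args` and its `series`, `controller` and `port` parameters are made up here, and the only smartctl options used are the `-a` and `-d 3ware,N` forms shown above.

```python
def smartctl_args(series, controller, port):
    """Build a smartctl argv for a disk behind a 3ware controller.

    series     -- controller series: 6000, 7000, 8000 or 9000 (hypothetical parameter)
    controller -- index of the controller, 0 to 15 (from /dev/twe0-15 and /dev/twa0-15)
    port       -- 3ware port number N, 0 to 31 inclusive
    """
    if not 0 <= port <= 31:
        raise ValueError("3ware port must be between 0 and 31")
    if not 0 <= controller <= 15:
        raise ValueError("controller index must be between 0 and 15")
    if series in (6000, 7000, 8000):
        node = "/dev/twe%d" % controller   # 3w-xxxx driver
    elif series == 9000:
        node = "/dev/twa%d" % controller   # 3w-9xxx driver
    else:
        raise ValueError("unknown 3ware series: %r" % (series,))
    return ["smartctl", "-a", "-d", "3ware,%d" % port, node]
```

The returned list could be handed to `subprocess.run` as root; the helper itself runs nothing, so it is safe to try on a machine without the controller.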

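Some thoughts on your problem:

This may not yet be a complete failure of the HDD/SSD, but I recommend replacing it as soon as possible.

If you don't have a backup yet, make one now!

You can check the filesystem for problems with: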

e2fsck -fv /dev/sdX

If you see reallocated sectors in the SMART data, I think you should replace the drive.
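
A rough sketch of how that check could be scripted: it looks for the `Reallocated_Sector_Ct` row in the attribute table that `smartctl -a` prints and returns its raw value. The function name `reallocated_sectors` is hypothetical, and `SAMPLE` is made-up illustrative text in the usual table layout, not output from the drive in question.

```python
def reallocated_sectors(smart_output):
    """Return the raw Reallocated_Sector_Ct value, or None if the row is absent."""
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; the name is column 2, the raw value column 10.
        if len(fields) >= 10 and fields[1] == "Reallocated_Sector_Ct":
            return int(fields[9])
    return None


# Illustrative sample only (assumed layout, invented numbers).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4321
"""

count = reallocated_sectors(SAMPLE)
if count:
    print("reallocated sectors:", count)
```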

Answer 2

You can access the SMART information with (for example):

smartctl -a -d 3ware,N /dev/twa0

N is the port number, and twa0 is the controller.

The following gives you some interface error statistics:

smartctl -l sataphy -d 3ware,N /dev/twa0

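With that command I was able to determine that the "ata exception" messages that kept appearing in my logs were caused by interface/cable errors, because the CRC count was increasing (in the end I had to replace the disk with a different type of disk. Replacing the motherboard with one of the same type did not help). Note that an ordinary SATA controller gives you more information than a 3ware port does.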
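
To make "the count increased" concrete, here is a minimal sketch that compares two saved outputs of the sataphy command and reports which counters went up. It assumes the table has the columns ID, Size, Value and Description; the function names and the `BEFORE`/`AFTER` samples are invented for illustration and are not my actual numbers.

```python
import re

# One counter row: hex ID, size, value, free-text description (assumed layout).
_ROW = re.compile(r"^0x[0-9a-fA-F]{4}\s+\d+\s+(\d+)\s+(.+)$")


def parse_counters(text):
    """Map each counter description to its integer value."""
    counters = {}
    for line in text.splitlines():
        match = _ROW.match(line.strip())
        if match:
            counters[match.group(2).strip()] = int(match.group(1))
    return counters


def increased(before, after):
    """Return {description: delta} for every counter that grew between snapshots."""
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if value > before.get(name, 0)
    }


# Illustrative snapshots only (invented values).
BEFORE = """\
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x000a  2            3  Device-to-host register FISes sent due to a COMRESET
"""
AFTER = """\
ID      Size     Value  Description
0x0001  2            7  Command failed due to ICRC error
0x000a  2            3  Device-to-host register FISes sent due to a COMRESET
"""

print(increased(parse_counters(BEFORE), parse_counters(AFTER)))
```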

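As for the "Buffer I/O error", I have never run into it, so I can't speculate. I have seen "ata exception" several times in the past (on software RAID), and it was almost always a precursor to a failure. So I now scan my logs for that error.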
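
A log scan of that kind could look roughly like this, written here against the messages quoted in the question, since those are the only lines available to test with. The names `scan` and `decode_read10` are hypothetical. The decoder relies on the standard Read(10) layout: opcode 0x28, a 4-byte big-endian LBA in bytes 2 to 5, and a 2-byte transfer length in bytes 7 to 8.

```python
import re

IO_ERROR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")
BUFFER_ERROR = re.compile(r"Buffer I/O error on device (\w+), logical block (\d+)")


def decode_read10(cdb_hex):
    """Return (lba, transfer_length) from a Read(10) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a Read(10) CDB")
    return int.from_bytes(cdb[2:6], "big"), int.from_bytes(cdb[7:9], "big")


def scan(lines):
    """Collect (kind, device, number) tuples for the two error messages."""
    events = []
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            events.append(("io", match.group(1), int(match.group(2))))
            continue
        match = BUFFER_ERROR.search(line)
        if match:
            events.append(("buffer", match.group(1), int(match.group(2))))
    return events


# Lines copied from the kernel log in the question.
LOG = [
    "[2522065.275741] Read(10): 28 00 2e 90 97 f8 00 00 08 00",
    "[2522065.275750] end_request: I/O error, dev sdg, sector 781228024",
    "[2522065.281091] Buffer I/O error on device sdg, logical block 97653503",
]

print(decode_read10("28 00 2e 90 97 f8 00 00 08 00"))
print(scan(LOG))
```

On those lines the decoded LBA equals the failing sector (781228024), and that sector divided by 8 gives the reported logical block (97653503), which is what you would expect with 4096-byte blocks on 512-byte sectors. So the three messages describe the same failed read.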
