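I keep getting these messages in the kernel log of one of my servers (a file server). Does anyone know how serious they are? I can't use smartmontools in the usual way because the disks sit behind a 3ware card, which has its own utility (tw_cli, which is very limited).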
[2522065.275739] sd 0:0:1:0: [sdg] CDB:
[2522065.275741] Read(10): 28 00 2e 90 97 f8 00 00 08 00
[2522065.275750] end_request: I/O error, dev sdg, sector 781228024
[2522065.281091] Buffer I/O error on device sdg, logical block 97653503
[2522065.287157] sd 0:0:1:0: [sdg] Device not ready
[2522065.287163] sd 0:0:1:0: [sdg]
[2522065.287166] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[2522065.287168] sd 0:0:1:0: [sdg]
[2522065.287170] Sense Key : Not Ready [current]
[2522065.287174] sd 0:0:1:0: [sdg]
[2522065.287176] Add. Sense: Logical unit not ready, cause not reportable
[2522065.287179] sd 0:0:1:0: [sdg] CDB:
[2522065.287181] Read(10): 28 00 00 00 00 00 00 00 20 00
[2522065.287190] end_request: I/O error, dev sdg, sector 0
[2522065.291147] Buffer I/O error on device sdg, logical block 0
[2522065.291147] Buffer I/O error on device sdg, logical block 1
[2522065.291147] Buffer I/O error on device sdg, logical block 2
[2522065.308465] sd 0:0:1:0: [sdg] Device not ready
[2522065.308465] sd 0:0:1:0: [sdg]
[2522065.308465] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[2522065.308465] sd 0:0:1:0: [sdg]
[2522065.308465] Sense Key : Not Ready [current]
[2522065.308465] sd 0:0:1:0: [sdg]
[2522065.308465] Add. Sense: Logical unit not ready, cause not reportable
[2522065.308465] sd 0:0:1:0: [sdg] CDB:
[2522065.308465] Read(10): 28 00 00 00 00 00 00 00 08 00
[2522065.308465] end_request: I/O error, dev sdg, sector 0
Thanks!
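As a cross-check of the log above, here is a small Python sketch that decodes the Read(10) CDB bytes shown in the first error. The function name `decode_read10` is made up for this example. The division by 8 assumes 512-byte sectors and a 4096-byte buffer block size, which is an assumption, although it agrees with the numbers the kernel printed.

```python
def decode_read10(cdb_hex):
    """Return (start LBA, number of blocks) from a Read(10) CDB hex string."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7-8: transfer length
    return lba, blocks

# CDB copied from the first error in the log.
lba, blocks = decode_read10("28 00 2e 90 97 f8 00 00 08 00")
print(lba, blocks)  # 781228024 8
# Assumption: 512-byte sectors and 4096-byte buffer blocks, so 8 sectors per block.
print(lba // 8)     # 97653503
```

So the failed sector 781228024 and the "logical block 97653503" in the buffer error refer to the same place on the disk. The later CDBs in the log start at LBA 0.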
Answer 1
You can read the SMART values through the controller, for example:
smartctl -a -d 3ware,2 /dev/twe0
Quoting the smartctl man page:
Under Linux and FreeBSD, to look at ATA disks behind 3ware SCSI RAID controllers, use syntax such as:
smartctl -a -d 3ware,2 /dev/sda
smartctl -a -d 3ware,0 /dev/twe0
smartctl -a -d 3ware,1 /dev/twa0
where in the argument 3ware,N, the integer N is the disk number (3ware ´port´) within the 3ware ATA RAID controller. The allowed values of N are from 0 to 31 inclusive. The first two
forms, which refer to devices /dev/sda-z and /dev/twe0-15, may be used with 3ware series 6000, 7000, and 8000 series controllers that use the 3x-xxxx driver. Note that the /dev/sda-z form
is deprecated starting with the Linux 2.6 kernel series and may not be supported by the Linux kernel in the near future. The final form, which refers to devices /dev/twa0-15, must be used
with 3ware 9000 series controllers, which use the 3w-9xxx driver.
Note that if the special character device nodes /dev/twa? and /dev/twe? do not exist, or exist with the incorrect major or minor numbers, smartctl will recreate them on the fly. Typically
/dev/twa0 refers to the first 9000-series controller, /dev/twa1 refers to the second 9000 series controller, and so on. Likewise /dev/twe0 refers to the first 6/7/8000-series controller,
/dev/twe1 refers to the second 6/7/8000 series controller, and so on.
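Some thoughts on your problem:
This may not be a complete HDD/SSD failure yet, but I recommend replacing the drive as soon as possible.
If you don't have a backup yet, make one now!
You can check the filesystem for problems with: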
e2fsck -fv /dev/sdX
If you see reallocated sectors in the SMART data, I think you should replace the drive.
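If you want to automate that check, here is a rough Python sketch that pulls the raw value of Reallocated_Sector_Ct out of the attribute table that `smartctl -a` prints for ATA disks. The sample text and the helper name `raw_values` are illustrative only; the numbers are invented and are not from the disk in the question.

```python
import re

# Illustrative excerpt in the layout of smartctl's ATA attribute table.
# The values are made up for this example, NOT taken from the poster's disk.
SAMPLE_REPORT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

def raw_values(report):
    """Map attribute name -> raw value for every attribute row in the report."""
    values = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            match = re.match(r"\d+", fields[9])
            if match:
                values[fields[1]] = int(match.group())
    return values

reallocated = raw_values(SAMPLE_REPORT).get("Reallocated_Sector_Ct", 0)
if reallocated > 0:
    print(f"{reallocated} reallocated sectors: plan to replace the drive")
```

On the real machine you would feed it the output of the smartctl command shown earlier instead of the sample text.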
Answer 2
You can access the SMART information with (for example):
smartctl -a -d 3ware,N /dev/twa0
N is the port number, and twa0 is the controller.
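To check several ports in one go, a small helper can build the command lines. This is only a sketch: `smartctl_argv` and its parameters are hypothetical names, and the device-node choice follows the man page quoted in the other answer (twa for 9000 series cards, twe for 6000/7000/8000 series).

```python
def smartctl_argv(port, controller=0, series=9000, options=("-a",)):
    """Build the smartctl argument list for one disk behind a 3ware card."""
    if not 0 <= port <= 31:
        raise ValueError("3ware port must be between 0 and 31")
    if not 0 <= controller <= 15:
        raise ValueError("controller index must be between 0 and 15")
    if series == 9000:
        node = "twa"          # 9000 series, 3w-9xxx driver
    elif series in (6000, 7000, 8000):
        node = "twe"          # older series, 3w-xxxx driver
    else:
        raise ValueError("unknown 3ware series")
    return ["smartctl", *options, "-d", f"3ware,{port}", f"/dev/{node}{controller}"]

# Print the commands for the first four ports of the first 9000 series card.
for port in range(4):
    print(" ".join(smartctl_argv(port)))
```

The resulting list can be passed to `subprocess.run` on a machine that has smartmontools installed.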
The following gives you some interface error statistics:
smartctl -l sataphy -d 3ware,N /dev/twa0
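With that command I was able to determine that the "ata exception" messages that kept showing up in my logs were caused by interface/cable errors, because the CRC count kept increasing (in the end the disk had to be replaced with a different type of disk; replacing the motherboard with one of the same type did not help). From this you can see that an ordinary SATA controller provides more information than a 3ware port.
As for the "Buffer I/O error", I have never run into it, so I can't speculate. I have had "ata exception" messages many times in the past (on software RAID), and they were almost always a precursor to failure, so I now scan my logs for that error.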
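A log scan like that can be sketched in a few lines of Python. The sample lines are copied from the question; the pattern names and `count_errors` are invented for this example, and you would add a pattern for whatever message you want to watch for on your own system.

```python
import re
from collections import Counter

# Lines copied from the kernel log in the question.
LOG = """\
[2522065.275750] end_request: I/O error, dev sdg, sector 781228024
[2522065.281091] Buffer I/O error on device sdg, logical block 97653503
[2522065.287157] sd 0:0:1:0: [sdg] Device not ready
[2522065.287190] end_request: I/O error, dev sdg, sector 0
[2522065.291147] Buffer I/O error on device sdg, logical block 0
"""

# Each pattern captures the device name in group 1.
PATTERNS = {
    "io_error": re.compile(r"end_request: I/O error, dev (\w+), sector \d+"),
    "buffer_io_error": re.compile(r"Buffer I/O error on device (\w+), logical block \d+"),
    "not_ready": re.compile(r"\[(\w+)\] Device not ready"),
}

def count_errors(log_text):
    """Count matching log lines per (device, error kind)."""
    counts = Counter()
    for line in log_text.splitlines():
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(match.group(1), kind)] += 1
    return counts

for (dev, kind), n in sorted(count_errors(LOG).items()):
    print(f"{dev} {kind}: {n}")
```

Run from cron against the current kernel log, something like this gives an early warning when a device starts throwing errors.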