Environment:
- LTP version: ltp-full-20230127
- CPU: FT-2000+ (arm64)
- Command used to run LTP:
nohup ./runltp -p -l /home/result.log -d /home -t 168h &
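Error description:
When I run the LTP alltest, after several loops of alltest the machine becomes unresponsive: the desktop and other programs and daemons hang, but the Linux kernel is still alive. When I plug in or unplug a USB mouse/keyboard, the serial port still prints some driver logs.
The "zombie" state is triggered by dio30, which forks 100 child tasks that call writev/readv in parallel at different offsets of a file.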
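For anyone who wants to exercise a similar I/O pattern outside LTP, here is a rough Python sketch of it: a parent forks several children, and each child does vectored writes and reads at its own offset in one shared file. This is not the diotest6 source. The function names, parameters, and defaults are my own, and direct I/O is off unless `use_direct=True` is passed (Linux only, and the filesystem has to support `O_DIRECT`).

```python
import mmap
import os


def child_io(path, offset, bufsize, iterations, use_direct):
    """Write one region with pwritev, read it back with preadv, and compare."""
    flags = os.O_RDWR
    if use_direct:
        flags |= os.O_DIRECT  # assumption: Linux; needs aligned buffers/offsets
    fd = os.open(path, flags)
    # Anonymous mmap memory is page-aligned, which O_DIRECT requires.
    wbuf = mmap.mmap(-1, bufsize)
    rbuf = mmap.mmap(-1, bufsize)
    try:
        for i in range(iterations):
            wbuf[:] = bytes([i % 256]) * bufsize
            if os.pwritev(fd, [wbuf], offset) != bufsize:
                return 1
            if os.preadv(fd, [rbuf], offset) != bufsize:
                return 1
            if rbuf[:] != wbuf[:]:
                return 1
        return 0
    finally:
        os.close(fd)


def run_workload(path, children=4, bufsize=65536, iterations=4, use_direct=False):
    """Fork `children` processes; each works on its own bufsize-sized region.

    Returns the number of children that reported a failure.
    """
    with open(path, "wb") as f:
        f.truncate(children * bufsize)  # create the file at its full size
    pids = []
    for n in range(children):
        pid = os.fork()
        if pid == 0:
            # Child process: always leave through os._exit so that the
            # parent's cleanup code does not run a second time here.
            status = 1
            try:
                status = child_io(path, n * bufsize, bufsize, iterations, use_direct)
            finally:
                os._exit(status)
        pids.append(pid)
    failures = 0
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if not (os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0):
            failures += 1
    return failures
```

Calling `run_workload("/home/dio_sketch.bin", children=100, iterations=100, use_direct=True)` would roughly imitate the scale described above; the path is only an example.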
Below is the abnormal log from the debug serial port.
[120193.657816][ T4867] LTP: starting dio29 (diotest3 -b 65536 -n 100 -i 100 -o 1024000)
[120235.231643][ T4867] LTP: starting dio30 (diotest6 -b 65536 -n 100 -i 100 -o 1024000)
[120309.495026][ T520] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[120309.502838][ T520] ata1.00: failed command: FLUSH CACHE EXT
[120309.508568][ T520] ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 3
[120309.508568][ T520] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[120309.523626][ T520] ata1.00: status: { DRDY }
[120309.528059][ T520] ata1: hard resetting link
[120309.845188][ T520] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[120315.083035][ T520] ata1.00: qc timeout (cmd 0xec)
[120315.088968][ T520] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[120315.095816][ T520] ata1.00: revalidation failed (errno=-5)
[120315.101456][ T520] ata1: hard resetting link
[120315.421212][ T520] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[120325.579067][ T520] ata1.00: qc timeout (cmd 0xec)
[120325.584993][ T520] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[120325.591839][ T520] ata1.00: revalidation failed (errno=-5)
[120325.597477][ T520] ata1: limiting SATA link speed to 3.0 Gbps
[120325.603375][ T520] ata1: hard resetting link
[120325.921250][ T520] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
[120356.555202][ T520] ata1.00: qc timeout (cmd 0xec)
[120356.561140][ T520] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[120356.567989][ T520] ata1.00: revalidation failed (errno=-5)
[120356.573627][ T520] ata1.00: disabled
[120356.893376][ T520] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
[120356.901385][ T520] ata1: EH complete
[120356.905154][ C1] sd 0:0:0:0: [sda] tag#3 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=107s
[120356.915726][ C1] sd 0:0:0:0: [sda] tag#3 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[120356.924904][ C1] print_req_error: 8 callbacks suppressed
[120356.924908][ C1] blk_update_request: I/O error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[120356.941398][ C18] sd 0:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
[120356.951796][ C18] sd 0:0:0:0: [sda] tag#6 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[120356.960982][ C18] blk_update_request: I/O error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[120356.971813][ C18] blk_update_request: I/O error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[120356.982635][ C18] blk_update_request: I/O error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[120356.993457][ C18] blk_update_request: I/O error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[120357.004277][ C18] blk_update_request: I/O error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[120357.015098][ C18] blk_update_request: I/O error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[120357.025919][ C18] blk_update_request: I/O error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[120357.036738][ C18] blk_update_request: I/O error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[120357.047559][ C18] blk_update_request: I/O error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[120357.058502][ C2] sd 0:0:0:0: [sda] tag#21 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
[120357.058519][ C35] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
[120357.058590][ C32] sd 0:0:0:0: [sda] tag#26 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
[120357.058596][ C32] sd 0:0:0:0: [sda] tag#26 CDB: Write(10) 2a 00 1b d7 9f 00 00 00 08 00
[120357.058614][ C32] sd 0:0:0:0: [sda] tag#27 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=0s
[120357.058616][ C32] sd 0:0:0:0: [sda] tag#27 CDB: Write(10) 2a 00 1b a7 a8 b0 00 00 08 00
[120357.058636][T813309] dm-0: writeback error on inode 201930164, offset 32047104, sector 113798912
[120357.058640][T813309] dm-0: writeback error on inode 201930629, offset 913408, sector 110655664
Mar 23 03:11:46 ltptest kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
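To make the sequence in the log easier to follow, a small Python sketch like the one below can pull out the key events and measure how long after the last test start the first ata exception appears. The substrings come from the log above; the labels and function names are my own, and the regular expression only assumes the `[seconds][ Tnnn]` / `[ Cnn]` prefix seen in this log.

```python
import re

# Matches the "[120309.495026][ T520] message" prefix used in the log above.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\[\s*[TC]\d+\]\s+(.*)")

# (substring from the log, short label); the labels are mine.
EVENTS = [
    ("LTP: starting", "test start"),
    ("exception Emask", "ata exception"),
    ("failed command", "failed command"),
    ("hard resetting link", "link reset"),
    ("failed to IDENTIFY", "identify failed"),
    ("disabled", "device disabled"),
    ("EH complete", "error handling done"),
]


def timeline(lines):
    """Return (timestamp, label, message) for every recognised log line."""
    out = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        for needle, label in EVENTS:
            if needle in msg:
                out.append((ts, label, msg))
                break
    return out


def gap_before_first_exception(events):
    """Seconds between the last test start and the first ata exception."""
    last_start = None
    for ts, label, _ in events:
        if label == "test start":
            last_start = ts
        elif label == "ata exception":
            return None if last_start is None else ts - last_start
    return None
```

On the excerpt above this gives a gap of about 74 seconds between the start of dio30 and the FLUSH CACHE EXT timeout, after which every IDENTIFY attempt fails until ata1.00 is disabled.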
The test directory is somewhat unusual: its mount point (/home) is a volume that combines an SSD partition and an HDD partition.
[root@ltptest home]# lsblk
NAME           MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda              8:0    0 238.5G  0 disk
└─sda1           8:1    0 238.5G  0 part
  ├─uos-root   253:0    0    70G  0 lvm  /
  └─uos-home   253:2    0   3.8T  0 lvm  /home
sdb              8:16   0   3.6T  0 disk
├─sdb1           8:17   0   600M  0 part /boot/efi
├─sdb2           8:18   0     1G  0 part /boot
└─sdb3           8:19   0   3.6T  0 part
  ├─uos-swap   253:1    0     4G  0 lvm
  └─uos-home   253:2    0   3.8T  0 lvm  /home
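The layout above can also be checked programmatically. The sketch below walks the JSON form of the same listing (`lsblk --json`, available in reasonably recent util-linux) and reports which LVM volumes sit on more than one disk. The function names are my own, and it relies only on the `name`, `type`, and `children` fields of that JSON.

```python
import json
import subprocess


def lvm_backing_disks(tree):
    """Map each LVM volume name to the set of top-level disks beneath it."""
    backing = {}

    def walk(node, disk):
        if node.get("type") == "lvm":
            backing.setdefault(node["name"], set()).add(disk)
        for child in node.get("children", []):
            walk(child, disk)

    for dev in tree.get("blockdevices", []):
        walk(dev, dev["name"])
    return backing


def spanning_volumes(tree):
    """Return only the volumes that are backed by more than one disk."""
    return {
        lv: disks
        for lv, disks in lvm_backing_disks(tree).items()
        if len(disks) > 1
    }


def read_lsblk():
    """Run `lsblk --json` and parse its output."""
    out = subprocess.run(
        ["lsblk", "--json"], check=True, capture_output=True, text=True
    )
    return json.loads(out.stdout)
```

For the listing above, `spanning_volumes(read_lsblk())` would be expected to report uos-home on both sda and sdb, so losing sda affects /home as well as /.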
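From the debug serial log, all I can tell is that the SSD may have died while diotest6 was flushing data to storage. But I do not know what fault caused this error. Does anyone have any ideas about it?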