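After upgrading from 4.19, I have been running Linux kernel 5.4.35 (and newer), compiled from the Debian "vanilla kernel". Since then, my hpsa md RAID 0 hangs after a few days (2-3 days), and the RAID switches to read-only / I/O is refused.
When I check the SMART statistics, no critical or important errors are shown.
I am also using the 6 patches from hpsahba, which can be found on GitHub.
Here is the relevant syslog excerpt (the complete syslog is available on Pastebin):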
Apr 30 15:58:31 srv381 kernel: [544209.588021] sd 0:0:10:0: [sdj] tag#173 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 30 15:58:31 srv381 kernel: [544209.588026] sd 0:0:10:0: [sdj] tag#173 Sense Key : Medium Error [current] [descriptor]
Apr 30 15:58:31 srv381 kernel: [544209.588028] sd 0:0:10:0: [sdj] tag#173 Add. Sense: Record not found
Apr 30 15:58:31 srv381 kernel: [544209.588032] sd 0:0:10:0: [sdj] tag#173 CDB: Write(16) 8a 00 00 00 00 01 91 28 00 00 00 00 01 30 00 00
Apr 30 15:58:31 srv381 kernel: [544209.588035] blk_update_request: critical medium error, dev sdj, sector 6730285056 op 0x1:(WRITE) flags 0x100000 phys_seg 5 prio class 0
Apr 30 15:58:42 srv381 kernel: [544220.603519] sd 0:0:10:0: [sdj] tag#179 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 30 15:58:42 srv381 kernel: [544220.603523] sd 0:0:10:0: [sdj] tag#179 Sense Key : Medium Error [current] [descriptor]
Apr 30 15:58:42 srv381 kernel: [544220.603527] sd 0:0:10:0: [sdj] tag#179 Add. Sense: Unrecovered read error
Apr 30 15:58:42 srv381 kernel: [544220.603530] sd 0:0:10:0: [sdj] tag#179 CDB: Read(16) 88 00 00 00 00 00 4a d1 69 b0 00 00 02 50 00 00
Apr 30 15:58:42 srv381 kernel: [544220.603533] blk_update_request: critical medium error, dev sdj, sector 1255238064 op 0x0:(READ) flags 0x80700 phys_seg 16 prio class 0
Apr 30 15:59:05 srv381 kernel: [544243.400236] XFS (md0p2): writeback error on sector 6730284320
Apr 30 15:59:41 srv381 kernel: [544279.528345] sd 0:0:10:0: [sdj] tag#143 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 30 15:59:41 srv381 kernel: [544279.528352] sd 0:0:10:0: [sdj] tag#143 Sense Key : Medium Error [current] [descriptor]
Apr 30 15:59:41 srv381 kernel: [544279.528354] sd 0:0:10:0: [sdj] tag#143 Add. Sense: Record not found
Apr 30 15:59:41 srv381 kernel: [544279.528358] sd 0:0:10:0: [sdj] tag#143 CDB: Write(16) 8a 00 00 00 00 01 91 2c c2 c8 00 00 01 38 00 00
Apr 30 15:59:41 srv381 kernel: [544279.528361] blk_update_request: critical medium error, dev sdj, sector 6730597064 op 0x1:(WRITE) flags 0x100000 phys_seg 20 prio class 0
Apr 30 15:59:41 srv381 kernel: [544279.557380] XFS (md0p2): writeback error on sector 6730597056
Apr 30 16:00:19 srv381 kernel: [544317.433932] hpsa 0000:05:00.0: scsi 0:0:10:0: resetting physical Direct-Access ATA TP04000GB PHYS DRV SSDSmartPathCap- En- Exp=1
Apr 30 16:00:24 srv381 kernel: [544322.470747] hpsa 0000:05:00.0: waiting 2 secs for device to become ready.
Apr 30 16:00:26 srv381 kernel: [544324.497534] hpsa 0000:05:00.0: waiting 4 secs for device to become ready.
Apr 30 16:00:30 srv381 kernel: [544328.529549] hpsa 0000:05:00.0: waiting 8 secs for device to become ready.
Apr 30 16:00:38 srv381 kernel: [544336.721590] hpsa 0000:05:00.0: waiting 16 secs for device to become ready.
Apr 30 16:00:54 srv381 kernel: [544352.849662] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
Apr 30 16:01:27 srv381 kernel: [544385.617802] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
Apr 30 16:02:00 srv381 kernel: [544418.386133] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
Apr 30 16:02:32 srv381 kernel: [544451.154095] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
Apr 30 16:02:55 srv381 kernel: [544473.682061] INFO: task jbd2/sda2-8:270 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.682101] Tainted: G I E 5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.682128] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.682164] jbd2/sda2-8 D 0 270 2 0x80004000
Apr 30 16:02:55 srv381 kernel: [544473.682166] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.682176] ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.682178] ? bit_wait_timeout+0x90/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682179] schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.682181] io_schedule+0x12/0x40
Apr 30 16:02:55 srv381 kernel: [544473.682182] bit_wait_io+0xd/0x50
Apr 30 16:02:55 srv381 kernel: [544473.682184] __wait_on_bit+0x2a/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682186] out_of_line_wait_on_bit+0x92/0xb0
Apr 30 16:02:55 srv381 kernel: [544473.682190] ? var_wake_function+0x20/0x20
Apr 30 16:02:55 srv381 kernel: [544473.682198] jbd2_journal_commit_transaction+0x107c/0x1930 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682203] ? try_to_del_timer_sync+0x4f/0x80
Apr 30 16:02:55 srv381 kernel: [544473.682208] kjournald2+0xb7/0x280 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682210] ? finish_wait+0x80/0x80
Apr 30 16:02:55 srv381 kernel: [544473.682213] kthread+0xf9/0x130
Apr 30 16:02:55 srv381 kernel: [544473.682217] ? commit_timeout+0x10/0x10 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682219] ? kthread_park+0x90/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682222] ret_from_fork+0x35/0x40
Apr 30 16:02:55 srv381 kernel: [544473.682228] INFO: task rs:main Q:Reg:917 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.682261] Tainted: G I E 5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.682288] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.682323] rs:main Q:Reg D 0 917 1 0x00000000
Apr 30 16:02:55 srv381 kernel: [544473.682325] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.682328] ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.682329] ? _cond_resched+0x15/0x30
Apr 30 16:02:55 srv381 kernel: [544473.682331] ? bit_wait_timeout+0x90/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682332] schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.682334] io_schedule+0x12/0x40
Apr 30 16:02:55 srv381 kernel: [544473.682335] bit_wait_io+0xd/0x50
Apr 30 16:02:55 srv381 kernel: [544473.682337] __wait_on_bit+0x2a/0x90
Apr 30 16:02:55 srv381 kernel: [544473.682338] out_of_line_wait_on_bit+0x92/0xb0
Apr 30 16:02:55 srv381 kernel: [544473.682340] ? var_wake_function+0x20/0x20
Apr 30 16:02:55 srv381 kernel: [544473.682345] do_get_write_access+0x297/0x3e0 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682350] jbd2_journal_get_write_access+0x5c/0x80 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682372] __ext4_journal_get_write_access+0x37/0x80 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682385] ? ext4_dirty_inode+0x44/0x60 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682398] ext4_reserve_inode_write+0x93/0xc0 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682412] ext4_mark_inode_dirty+0x51/0x1d0 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682416] ? jbd2__journal_start+0xdc/0x1e0 [jbd2]
Apr 30 16:02:55 srv381 kernel: [544473.682429] ext4_dirty_inode+0x44/0x60 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682432] __mark_inode_dirty+0x262/0x380
Apr 30 16:02:55 srv381 kernel: [544473.682435] generic_update_time+0x9d/0xc0
Apr 30 16:02:55 srv381 kernel: [544473.682437] file_update_time+0xeb/0x140
Apr 30 16:02:55 srv381 kernel: [544473.682439] __generic_file_write_iter+0x96/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.682452] ext4_file_write_iter+0xb6/0x360 [ext4]
Apr 30 16:02:55 srv381 kernel: [544473.682456] new_sync_write+0x12d/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.682459] vfs_write+0xb6/0x1a0
Apr 30 16:02:55 srv381 kernel: [544473.682461] ksys_write+0x5f/0xe0
Apr 30 16:02:55 srv381 kernel: [544473.682465] do_syscall_64+0x52/0x160
Apr 30 16:02:55 srv381 kernel: [544473.682467] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 30 16:02:55 srv381 kernel: [544473.682469] RIP: 0033:0x7ffa65862e0f
Apr 30 16:02:55 srv381 kernel: [544473.682474] Code: Bad RIP value.
Apr 30 16:02:55 srv381 kernel: [544473.682475] RSP: 002b:00007ffa64936860 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
Apr 30 16:02:55 srv381 kernel: [544473.682477] RAX: ffffffffffffffda RBX: 00007ffa5c06b7a0 RCX: 00007ffa65862e0f
Apr 30 16:02:55 srv381 kernel: [544473.682478] RDX: 000000000000006d RSI: 00007ffa5c06b7a0 RDI: 000000000000000c
Apr 30 16:02:55 srv381 kernel: [544473.682479] RBP: 00007ffa5c004ea0 R08: 0000000000000000 R09: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.682480] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffa5c00a120
Apr 30 16:02:55 srv381 kernel: [544473.682481] R13: 000000000000006d R14: 0000000000000000 R15: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.682491] INFO: task deluged:10450 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.682523] Tainted: G I E 5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.682550] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.682585] deluged D 0 10450 1 0x00000000
Apr 30 16:02:55 srv381 kernel: [544473.682587] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.682590] ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.682592] schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.682595] rwsem_down_write_slowpath+0x24c/0x510
Apr 30 16:02:55 srv381 kernel: [544473.682649] ? xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682684] xfs_ilock+0xeb/0xf0 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682719] xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682724] do_iter_readv_writev+0x158/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.682726] do_iter_write+0x7d/0x190
Apr 30 16:02:55 srv381 kernel: [544473.682727] vfs_writev+0xa6/0xf0
Apr 30 16:02:55 srv381 kernel: [544473.682731] ? ep_modify+0x14c/0x170
Apr 30 16:02:55 srv381 kernel: [544473.682733] ? __x64_sys_epoll_ctl+0xe5/0x670
Apr 30 16:02:55 srv381 kernel: [544473.682734] do_pwritev+0x8c/0xd0
Apr 30 16:02:55 srv381 kernel: [544473.682737] do_syscall_64+0x52/0x160
Apr 30 16:02:55 srv381 kernel: [544473.682739] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 30 16:02:55 srv381 kernel: [544473.682741] RIP: 0033:0x7fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.682743] Code: 3c 24 48 89 4c 24 18 e8 be 00 f9 ff 4c 8b 54 24 18 8b 3c 24 45 31 c0 41 89 c1 8b 54 24 14 48 8b 74 24 08 b8 28 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2c 44 89 cf 48 89 04 24 e8 ec 00 f9 ff 48 8b
Apr 30 16:02:55 srv381 kernel: [544473.682744] RSP: 002b:00007fc0d5b510f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000128
Apr 30 16:02:55 srv381 kernel: [544473.682745] RAX: ffffffffffffffda RBX: 00007fc0d5b51190 RCX: 00007fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.682746] RDX: 0000000000000001 RSI: 00007fc0d5b51190 RDI: 0000000000001696
Apr 30 16:02:55 srv381 kernel: [544473.682747] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.682748] R10: 0000000004dbadbe R11: 0000000000000246 R12: 0000000000000001
Apr 30 16:02:55 srv381 kernel: [544473.682749] R13: 0000000000001696 R14: 0000000000000001 R15: 0000000004dbadbe
Apr 30 16:02:55 srv381 kernel: [544473.682751] INFO: task deluged:10452 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.682783] Tainted: G I E 5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.682810] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.682845] deluged D 0 10452 1 0x00000000
Apr 30 16:02:55 srv381 kernel: [544473.682847] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.682849] ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.682853] ? enqueue_task_fair+0x8c/0x4c0
Apr 30 16:02:55 srv381 kernel: [544473.682854] schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.682856] rwsem_down_write_slowpath+0x24c/0x510
Apr 30 16:02:55 srv381 kernel: [544473.682894] ? xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682928] xfs_ilock+0xeb/0xf0 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682963] xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.682967] do_iter_readv_writev+0x158/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.682969] do_iter_write+0x7d/0x190
Apr 30 16:02:55 srv381 kernel: [544473.682970] vfs_writev+0xa6/0xf0
Apr 30 16:02:55 srv381 kernel: [544473.682973] ? ep_modify+0x14c/0x170
Apr 30 16:02:55 srv381 kernel: [544473.682975] ? __x64_sys_epoll_ctl+0xe5/0x670
Apr 30 16:02:55 srv381 kernel: [544473.682976] do_pwritev+0x8c/0xd0
Apr 30 16:02:55 srv381 kernel: [544473.682979] do_syscall_64+0x52/0x160
Apr 30 16:02:55 srv381 kernel: [544473.682981] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 30 16:02:55 srv381 kernel: [544473.682982] RIP: 0033:0x7fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.682984] Code: 3c 24 48 89 4c 24 18 e8 be 00 f9 ff 4c 8b 54 24 18 8b 3c 24 45 31 c0 41 89 c1 8b 54 24 14 48 8b 74 24 08 b8 28 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2c 44 89 cf 48 89 04 24 e8 ec 00 f9 ff 48 8b
Apr 30 16:02:55 srv381 kernel: [544473.682985] RSP: 002b:00007fc0d491f0f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000128
Apr 30 16:02:55 srv381 kernel: [544473.682986] RAX: ffffffffffffffda RBX: 00007fc0d491f190 RCX: 00007fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.682987] RDX: 0000000000000001 RSI: 00007fc0d491f190 RDI: 0000000000001697
Apr 30 16:02:55 srv381 kernel: [544473.682988] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.682989] R10: 0000000005e5ccbe R11: 0000000000000246 R12: 0000000000000001
Apr 30 16:02:55 srv381 kernel: [544473.682990] R13: 0000000000001697 R14: 0000000000000001 R15: 0000000005e5ccbe
Apr 30 16:02:55 srv381 kernel: [544473.682992] INFO: task deluged:10454 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.683024] Tainted: G I E 5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.683051] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.683086] deluged D 0 10454 1 0x00000000
Apr 30 16:02:55 srv381 kernel: [544473.683088] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.683090] ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.683092] schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.683094] rwsem_down_write_slowpath+0x24c/0x510
Apr 30 16:02:55 srv381 kernel: [544473.683131] ? xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.683166] xfs_ilock+0xeb/0xf0 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.683201] xfs_file_buffered_aio_write+0x72/0x340 [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.683204] do_iter_readv_writev+0x158/0x1d0
Apr 30 16:02:55 srv381 kernel: [544473.683206] do_iter_write+0x7d/0x190
Apr 30 16:02:55 srv381 kernel: [544473.683208] vfs_writev+0xa6/0xf0
Apr 30 16:02:55 srv381 kernel: [544473.683210] ? ep_modify+0x14c/0x170
Apr 30 16:02:55 srv381 kernel: [544473.683212] ? __x64_sys_epoll_ctl+0xe5/0x670
Apr 30 16:02:55 srv381 kernel: [544473.683214] do_pwritev+0x8c/0xd0
Apr 30 16:02:55 srv381 kernel: [544473.683216] do_syscall_64+0x52/0x160
Apr 30 16:02:55 srv381 kernel: [544473.683218] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 30 16:02:55 srv381 kernel: [544473.683220] RIP: 0033:0x7fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.683221] Code: 3c 24 48 89 4c 24 18 e8 be 00 f9 ff 4c 8b 54 24 18 8b 3c 24 45 31 c0 41 89 c1 8b 54 24 14 48 8b 74 24 08 b8 28 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2c 44 89 cf 48 89 04 24 e8 ec 00 f9 ff 48 8b
Apr 30 16:02:55 srv381 kernel: [544473.683222] RSP: 002b:00007fc0cf7f70f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000128
Apr 30 16:02:55 srv381 kernel: [544473.683224] RAX: ffffffffffffffda RBX: 00007fc0cf7f7190 RCX: 00007fc0da72d6a0
Apr 30 16:02:55 srv381 kernel: [544473.683225] RDX: 0000000000000001 RSI: 00007fc0cf7f7190 RDI: 0000000000001697
Apr 30 16:02:55 srv381 kernel: [544473.683226] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr 30 16:02:55 srv381 kernel: [544473.683226] R10: 0000000002e5ccbe R11: 0000000000000246 R12: 0000000000000001
Apr 30 16:02:55 srv381 kernel: [544473.683227] R13: 0000000000001697 R14: 0000000000000001 R15: 0000000002e5ccbe
Apr 30 16:02:55 srv381 kernel: [544473.683235] INFO: task kworker/2:2:21309 blocked for more than 120 seconds.
Apr 30 16:02:55 srv381 kernel: [544473.683268] Tainted: G I E 5.4.35-custom #1
Apr 30 16:02:55 srv381 kernel: [544473.683295] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 16:02:55 srv381 kernel: [544473.683330] kworker/2:2 D 0 21309 2 0x80004000
Apr 30 16:02:55 srv381 kernel: [544473.683370] Workqueue: xfs-sync/md0p2 xfs_log_worker [xfs]
Apr 30 16:02:55 srv381 kernel: [544473.683372] Call Trace:
Apr 30 16:02:55 srv381 kernel: [544473.683375] ? __schedule+0x2e3/0x740
Apr 30 16:02:55 srv381 kernel: [544473.683376] schedule+0x39/0xa0
Apr 30 16:02:55 srv381 kernel: [544473.683385] md_flush_request+0xa8/0x1b0 [md_mod]
Answer 1
Although there are no SMART errors, the fact is that your sdj disk reports errors in actual use, and this appears to be affecting your md0p2 RAID volume.
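To make the "sdj reports errors in actual use" observation concrete, here is a minimal Python sketch that tallies the medium errors per device and cross-checks the LBA encoded in each 16-byte CDB against the sector number the block layer reports. The sample lines are copied from the syslog excerpt above, with the timestamp prefix dropped for brevity. The function names (cdb16_lba, summarize) are made up for illustration, and the parsing assumes the READ(16)/WRITE(16) layout in which bytes 2 through 9 hold the big-endian LBA.

```python
import re
from collections import Counter

# Sample lines copied from the syslog excerpt (timestamp prefix omitted).
LOG = """\
sd 0:0:10:0: [sdj] tag#173 CDB: Write(16) 8a 00 00 00 00 01 91 28 00 00 00 00 01 30 00 00
blk_update_request: critical medium error, dev sdj, sector 6730285056 op 0x1:(WRITE) flags 0x100000 phys_seg 5 prio class 0
sd 0:0:10:0: [sdj] tag#179 CDB: Read(16) 88 00 00 00 00 00 4a d1 69 b0 00 00 02 50 00 00
blk_update_request: critical medium error, dev sdj, sector 1255238064 op 0x0:(READ) flags 0x80700 phys_seg 16 prio class 0
sd 0:0:10:0: [sdj] tag#143 CDB: Write(16) 8a 00 00 00 00 01 91 2c c2 c8 00 00 01 38 00 00
blk_update_request: critical medium error, dev sdj, sector 6730597064 op 0x1:(WRITE) flags 0x100000 phys_seg 20 prio class 0
"""

ERR_RE = re.compile(
    r"critical medium error, dev (\w+), sector (\d+) op 0x[0-9a-f]+:\((\w+)\)"
)
CDB_RE = re.compile(r"CDB: (?:Read|Write)\(16\) ((?:[0-9a-f]{2} ?){16})")


def cdb16_lba(hex_bytes: str) -> int:
    """Return the LBA from a 16-byte READ/WRITE CDB (bytes 2..9, big-endian)."""
    raw = bytes.fromhex(hex_bytes)
    return int.from_bytes(raw[2:10], "big")


def summarize(log: str):
    """Count medium errors per (device, operation) and collect LBAs and sectors."""
    errors = Counter()
    lbas, sectors = [], []
    for line in log.splitlines():
        err = ERR_RE.search(line)
        if err:
            errors[(err.group(1), err.group(3))] += 1
            sectors.append(int(err.group(2)))
        cdb = CDB_RE.search(line)
        if cdb:
            lbas.append(cdb16_lba(cdb.group(1)))
    return errors, lbas, sectors


if __name__ == "__main__":
    errors, lbas, sectors = summarize(LOG)
    for (dev, op), count in sorted(errors.items()):
        print(f"{dev}: {count} medium error(s) on {op}")
    print("CDB LBAs match reported sectors:", lbas == sectors)
```

In this excerpt every medium error is on sdj, and the LBA in each CDB equals the sector reported by blk_update_request, which is what one would expect with 512-byte logical blocks. The failures hit both reads and writes at widely separated addresses, so they do not look like a single bad spot.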
After the message
hpsa 0000:05:00.0: scsi 0:0:10:0: resetting physical Direct-Access ATA TP04000GB PHYS DRV SSDSmartPathCap- En- Exp=1
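it looks like the problem disk stopped responding entirely. Because this is a writeback error, the kernel had already cached a write operation and "promised" the userspace application that it would be written to disk. Now it turns out that actually writing it is impossible, and with RAID 0 there is no way to recover other than to wait and hope the disk starts responding again. The alternative would be to deliberately lose data, which is something the kernel simply will not do on its own.
At 16:00:19 on Apr 30, the kernel issued a reset to the disk in an attempt to recover from the errors, and the disk apparently never completed that command.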
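As a rough illustration of how long the driver waited after that reset, the following Python sketch extracts the "waiting N secs" messages from the excerpt and adds them up. It only does arithmetic over the quoted log lines and does not claim to reproduce the hpsa driver's actual retry logic; the helper name wait_schedule is hypothetical.

```python
import re

# hpsa lines copied from the syslog excerpt (date and host prefix omitted).
LOG = """\
[544317.433932] hpsa 0000:05:00.0: scsi 0:0:10:0: resetting physical Direct-Access ATA TP04000GB PHYS DRV SSDSmartPathCap- En- Exp=1
[544322.470747] hpsa 0000:05:00.0: waiting 2 secs for device to become ready.
[544324.497534] hpsa 0000:05:00.0: waiting 4 secs for device to become ready.
[544328.529549] hpsa 0000:05:00.0: waiting 8 secs for device to become ready.
[544336.721590] hpsa 0000:05:00.0: waiting 16 secs for device to become ready.
[544352.849662] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
[544385.617802] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
[544418.386133] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
[544451.154095] hpsa 0000:05:00.0: waiting 32 secs for device to become ready.
"""

WAIT_RE = re.compile(
    r"\[(\d+\.\d+)\] hpsa \S+ waiting (\d+) secs for device to become ready"
)


def wait_schedule(log: str) -> list[int]:
    """Return the announced wait intervals, in seconds, in log order."""
    return [int(secs) for _stamp, secs in WAIT_RE.findall(log)]


if __name__ == "__main__":
    waits = wait_schedule(LOG)
    print("announced waits:", waits)
    print("total announced wait:", sum(waits), "seconds")
```

The announced intervals double from 2 up to 32 seconds and then repeat at 32. No message saying the device became ready appears in the quoted excerpt, and the first hung-task warnings show up shortly afterwards at 16:02:55.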
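Based on the syslog, I would be prepared to declare the disk dead. Time of death: approximately Apr 30, 16:00:24.
If a power cycle brings the disk back, I would back up its contents as soon as possible, before taking any other action.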