dmesg spam: "Power-on or device reset occurred"


I have a DELL R730xd server with 24 SSDs attached to a PERC H730P Mini controller in HBA mode. All 24 drives are in one large ZFS RAID-Z2 pool.

I've noticed that dmesg is flooded with messages like these, for basically every disk:

...
[Tue Mar 12 03:38:50 2024] sd 0:0:8:0: Power-on or device reset occurred
[Tue Mar 12 03:39:58 2024] sd 0:0:3:0: Power-on or device reset occurred
[Tue Mar 12 03:39:58 2024] sd 0:0:19:0: Power-on or device reset occurred
[Tue Mar 12 03:41:05 2024] sd 0:0:3:0: Power-on or device reset occurred
[Tue Mar 12 03:42:13 2024] sd 0:0:11:0: Power-on or device reset occurred
[Tue Mar 12 03:42:13 2024] sd 0:0:13:0: Power-on or device reset occurred
[Tue Mar 12 03:43:21 2024] sd 0:0:8:0: Power-on or device reset occurred
[Tue Mar 12 03:44:29 2024] sd 0:0:2:0: Power-on or device reset occurred
[Tue Mar 12 03:45:34 2024] sd 0:0:3:0: Power-on or device reset occurred
[Tue Mar 12 03:46:45 2024] sd 0:0:9:0: Power-on or device reset occurred
[Tue Mar 12 03:47:52 2024] sd 0:0:3:0: Power-on or device reset occurred
[Tue Mar 12 03:47:52 2024] sd 0:0:18:0: Power-on or device reset occurred
[Tue Mar 12 03:48:00 2024] sd 0:0:9:0: Power-on or device reset occurred
[Tue Mar 12 03:49:09 2024] sd 0:0:6:0: Power-on or device reset occurred
[Tue Mar 12 03:50:16 2024] sd 0:0:13:0: Power-on or device reset occurred
[Tue Mar 12 03:51:21 2024] sd 0:0:15:0: Power-on or device reset occurred
[Tue Mar 12 03:52:27 2024] sd 0:0:19:0: Power-on or device reset occurred
[Tue Mar 12 03:53:55 2024] sd 0:0:22:0: Power-on or device reset occurred
[Tue Mar 12 03:53:55 2024] sd 0:0:0:0: Power-on or device reset occurred
[Tue Mar 12 03:54:04 2024] sd 0:0:9:0: Power-on or device reset occurred
[Tue Mar 12 03:55:09 2024] sd 0:0:9:0: Power-on or device reset occurred
[Tue Mar 12 03:56:13 2024] sd 0:0:9:0: Power-on or device reset occurred
[Tue Mar 12 03:57:21 2024] sd 0:0:0:0: Power-on or device reset occurred
...

Apart from these messages I see no other errors, and the disks themselves are healthy and report no errors.

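So far, my observations point to disk-intensive tasks running at the same time, such as VM backups or a ZFS scrub. But a disk-intensive task shouldn't change a drive's power state (on/off), right? How can I find the root cause of this?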
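To check whether the resets really hit every disk evenly or cluster on a few, a rough tally per SCSI address might be a useful first step. The script below is only a sketch: the function name `count_resets` and the file name `count_resets.py` are made up for illustration, and it assumes the messages have exactly the format shown in the log above.

```python
import re
import sys
from collections import Counter

# Matches lines such as:
# [Tue Mar 12 03:38:50 2024] sd 0:0:8:0: Power-on or device reset occurred
RESET_RE = re.compile(
    r"sd (\d+:\d+:\d+:\d+): Power-on or device reset occurred"
)


def count_resets(lines):
    """Count reset messages per SCSI address (host:channel:target:lun)."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Read dmesg output from stdin and print devices, most resets first.
    for address, n in count_resets(sys.stdin).most_common():
        print(f"{address}\t{n}")
```

It could be run as `dmesg -T | python3 count_resets.py`. Comparing the counts and timestamps against the backup and scrub schedule should show whether the resets really coincide with those jobs.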
