Debian server intermittently fails to boot, possibly related to MD/RAID - how can I investigate?

Occasionally (about 1 in every 60 boots), one of our Debian (2.6.32-45) servers fails to boot.

The output of a failed boot ends here:

...
[    7.831991] raid6: using algorithm sse2x4 (11665 MB/s)
[    7.839760] md: raid6 personality registered for level 6
[    7.839838] md: raid5 personality registered for level 5
[    7.839915] md: raid4 personality registered for level 4
[    7.853452] md: raid10 personality registered for level 10       <<<<<<< last line of output

Compare this with the dmesg log from a "good" boot:

...
[    7.737313] md: raid6 personality registered for level 6
[    7.737314] md: raid5 personality registered for level 5
[    7.737315] md: raid4 personality registered for level 4
[    7.749987] md: raid10 personality registered for level 10       <<<<<<<< equivalent line
[    7.752653] mdadm: sending ioctl 1261 to a partition!
[    7.752655] mdadm: sending ioctl 1261 to a partition!
[    7.753571] mdadm: sending ioctl 1261 to a partition!
[    7.753574] mdadm: sending ioctl 1261 to a partition!
[    7.753769] mdadm: sending ioctl 1261 to a partition!
[    7.753771] mdadm: sending ioctl 1261 to a partition!
[    7.753975] mdadm: sending ioctl 1261 to a partition!
[    7.753978] mdadm: sending ioctl 1261 to a partition!
[    7.754322] mdadm: sending ioctl 1261 to a partition!
[    7.754325] mdadm: sending ioctl 1261 to a partition!
...

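I assume the hang is connected to "mdadm: sending ioctl 1261 to a partition!", since the failed boot stops just before that message would appear. But what can I do to investigate this further?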
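
The comparison between the two logs can also be done mechanically, which helps when the captured output is longer than the excerpts shown here. The following is only a sketch in Python: the function names (`strip_timestamps`, `next_expected_messages`) are made up for illustration, and it assumes both logs use the standard `[ seconds.micros ]` kernel timestamp prefix. It strips the timestamps, finds the last message of the failed boot in the good boot's log, and reports what the good boot logged next.

```python
import re

# Matches the "[    7.853452] " timestamp prefix on a kernel log line.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamps(log_text):
    """Return the non-empty log lines with their kernel timestamps removed."""
    messages = []
    for line in log_text.splitlines():
        message = TIMESTAMP.sub("", line).strip()
        if message:
            messages.append(message)
    return messages


def next_expected_messages(failed_log, good_log, count=3):
    """Locate the failed boot's last message in the good boot's log and
    return up to `count` messages that followed it there.

    Returns an empty list if the failed log is empty or its last message
    does not occur in the good log. Only the first occurrence is used.
    """
    failed = strip_timestamps(failed_log)
    good = strip_timestamps(good_log)
    if not failed or failed[-1] not in good:
        return []
    index = good.index(failed[-1])
    return good[index + 1 : index + 1 + count]


if __name__ == "__main__":
    # Excerpts taken from the logs above (annotations removed).
    failed = """
[    7.839915] md: raid4 personality registered for level 4
[    7.853452] md: raid10 personality registered for level 10
"""
    good = """
[    7.737315] md: raid4 personality registered for level 4
[    7.749987] md: raid10 personality registered for level 10
[    7.752653] mdadm: sending ioctl 1261 to a partition!
[    7.752655] mdadm: sending ioctl 1261 to a partition!
"""
    for message in next_expected_messages(failed, good):
        print(message)
```

Note that this only shows which step the good boot reached next; it does not prove that step is the cause of the hang, only that the failed boot stopped before logging it.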

Answer 1

mdadm: sending ioctl 1261 to a partition!

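This warning can be ignored; it has been removed in newer versions. You may be dealing with a failing disk, but that would be unrelated to this message.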

Answer 2

We have tracked the problem down to the RAID1 and RAID10 modules. It has been fixed in later Debian releases.

The commit is here: https://github.com/torvalds/linux/commit/d6b42dcb995e6acd7cc276774e751ffc9f0ef4bf