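My server crashes about once a week and leaves no clue as to why. I have checked /var/log/messages; it simply stops recording at some point and picks up again with the boot messages after I do a hard reboot.
Is there anything I can check, or software I can install, to find out the cause?
I am running CentOS 7.
This is the only error/problem I can find in /var/log/dmesg (full log: https://paste.netcoding.net/cosisiloji.log):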
[ 3.606936] md: Waiting for all devices to be available before autodetect
[ 3.606984] md: If you don't use raid, use raid=noautodetect
[ 3.607085] md: Autodetecting RAID arrays.
[ 3.608309] md: Scanned 6 and added 6 devices.
[ 3.608362] md: autorun ...
[ 3.608412] md: considering sdc2 ...
[ 3.608464] md: adding sdc2 ...
[ 3.608516] md: sdc1 has different UUID to sdc2
[ 3.608570] md: adding sdb2 ...
[ 3.608620] md: sdb1 has different UUID to sdc2
[ 3.608674] md: adding sda2 ...
[ 3.608726] md: sda1 has different UUID to sdc2
[ 3.608944] md: created md2
[ 3.608997] md: bind<sda2>
[ 3.609058] md: bind<sdb2>
[ 3.609116] md: bind<sdc2>
[ 3.609175] md: running: <sdc2><sdb2><sda2>
[ 3.609548] md/raid1:md2: active with 3 out of 3 mirrors
[ 3.609623] md2: detected capacity change from 0 to 98520989696
[ 3.609685] md: considering sdc1 ...
[ 3.609737] md: adding sdc1 ...
[ 3.609789] md: adding sdb1 ...
[ 3.609841] md: adding sda1 ...
[ 3.610005] md: created md1
[ 3.610055] md: bind<sda1>
[ 3.610117] md: bind<sdb1>
[ 3.610175] md: bind<sdc1>
[ 3.610233] md: running: <sdc1><sdb1><sda1>
[ 3.610714] md/raid1:md1: not clean -- starting background reconstruction
[ 3.610773] md/raid1:md1: active with 3 out of 3 mirrors
[ 3.610854] md1: detected capacity change from 0 to 20970405888
[ 3.610917] md: ... autorun DONE.
[ 3.610999] md: resync of RAID array md1
[ 3.611054] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[ 3.611119] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
[ 3.611180] md: using 128k window, over a total of 20478912k.
[ 3.611244] md1: unknown partition table
[ 3.624786] EXT3-fs (md1): error: couldn't mount because of unsupported optional features (240)
[ 3.627095] EXT2-fs (md1): error: couldn't mount because of unsupported optional features (244)
[ 3.630284] EXT4-fs (md1): INFO: recovery required on readonly filesystem
[ 3.630341] EXT4-fs (md1): write access will be enabled during recovery
[ 3.819411] EXT4-fs (md1): orphan cleanup on readonly fs
[ 3.836922] EXT4-fs (md1): 24 orphan inodes deleted
[ 3.836975] EXT4-fs (md1): recovery complete
[ 3.840557] EXT4-fs (md1): mounted filesystem with ordered data mode. Opts: (null)
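Answer 1
If you have crashkernel/kdump installed and enabled, you should be able to examine the crashed kernel relatively easily with the crash utility. For example, assuming your crash kernel dumps are saved under /var/crash: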
crash /var/crash/2009-07-17-10\:36/vmcore \
    /usr/lib/debug/lib/modules/`uname -r`/vmlinux
See Fedora's kdump debugging guide and Red Hat's crash dump guide (PDF) for more details.
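Before relying on kdump, it helps to confirm that the running kernel actually reserved memory for a crash kernel. The following is a minimal sketch, not an official tool: has_crashkernel and kdump_status are hypothetical helper names, and the check only looks for a crashkernel= parameter on the kernel command line.

```shell
#!/bin/sh
# Sketch: report whether this boot reserved memory for a crash kernel.
# has_crashkernel and kdump_status are hypothetical helper names; they are
# not part of kexec-tools or the crash utility.

has_crashkernel() {
    # $1: a kernel command line string, e.g. the contents of /proc/cmdline
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

kdump_status() {
    # $1: kernel command line; prints a one-line verdict
    if has_crashkernel "$1"; then
        echo "crashkernel reserved"
    else
        echo "crashkernel missing"
    fi
}

# Only inspect the real system when the file is readable.
if [ -r /proc/cmdline ]; then
    kdump_status "$(cat /proc/cmdline)"
fi
```

If this reports "crashkernel missing", the usual CentOS 7 route is to install kexec-tools and crash with yum, add crashkernel=auto to GRUB_CMDLINE_LINUX in /etc/default/grub, regenerate the configuration with grub2-mkconfig -o /boot/grub2/grub.cfg, reboot, and then enable and start the service with systemctl enable kdump and systemctl start kdump. The vmlinux file under /usr/lib/debug comes from the kernel debuginfo package, which has to match the kernel that crashed.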
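Answer 2
You can check the dmesg file at /var/log/dmesg, which records kernel messages. The messages log only records messages from services and applications; if a kernel error occurs, services and applications stop running, but the kernel error is still recorded in dmesg.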
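One caveat: on CentOS 7, /var/log/dmesg is usually written once at boot, so it may not contain anything from the moments before a crash. The systemd journal can keep kernel messages across reboots, provided it is stored on disk. The sketch below assumes the default journald setting (Storage=auto), under which the journal is persistent only if /var/log/journal exists; journal_mode is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: guess whether journald keeps logs across reboots.
# Assumes the default Storage=auto in /etc/systemd/journald.conf, where
# the journal is persistent only if the directory below exists.
# journal_mode is a hypothetical helper name.

journal_mode() {
    # $1: directory to test (normally /var/log/journal)
    if [ -d "$1" ]; then
        echo "persistent"
    else
        echo "volatile"
    fi
}

journal_mode /var/log/journal
```

If this prints "volatile", running mkdir -p /var/log/journal followed by systemctl restart systemd-journald as root should switch to on-disk storage. After the next crash and reboot, journalctl -k -b -1 shows the kernel messages from the previous boot, which may include the last lines logged before the hang. Messages that never reached the disk will still be lost, which is where kdump or a serial or network console helps.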
Answer 3
- BIOS memory test
- BIOS hard drive test
- Check the drive's SMART log:
smartctl /dev/sda -a
- Run a SMART drive self-test
- Keep the following running in a window:
dmesg -wH
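The SMART steps above can be sketched for all three disks that appear in the question's dmesg output (sda, sdb, sdc). This is a dry run under stated assumptions: smart_commands is a hypothetical helper that only prints the smartctl invocations, so nothing touches the drives until the output is run as root.

```shell
#!/bin/sh
# Sketch: print the smartctl commands for a health check and a short
# self-test on each drive. smart_commands is a hypothetical helper name;
# the device list is taken from the md members in the question's log.

smart_commands() {
    for dev in "$@"; do
        echo "smartctl -H $dev"            # overall health verdict
        echo "smartctl -t short $dev"      # start a short self-test
        echo "smartctl -l selftest $dev"   # show the self-test log
    done
}

smart_commands /dev/sda /dev/sdb /dev/sdc
```

To execute rather than print, the output can be piped to a root shell, for example smart_commands /dev/sda | sh. The short self-test runs in the background on the drive and typically needs a few minutes, so the self-test log is worth reading again once the test has had time to finish.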