I am running a Mongo 2.6.9 process on a Linux SMP instance. Here is the output of uname -a, with the server name removed:
Linux xxx.xxxx.com 3.13.0-042stab085.20 #1 SMP Sun Jul 20 13:27:24 MSK 2014 x86_64 x86_64 x86_64 GNU/Linux
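The Mongo instance has crashed a few times in the past, but it was infrequent (about once every other month) and I never investigated. Today it has happened 5 times. Naturally, the first thing I did was check the log file. Nothing in it alarmed me. This is the end of the log file just before a crash: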
2015-05-27T18:29:12.547-0400 [clientcursormon] mem (MB) res:27 virt:691
2015-05-27T18:29:12.547-0400 [clientcursormon] mapped (incl journal view):480
2015-05-27T18:29:12.547-0400 [clientcursormon] connections:10
2015-05-27T18:30:12.435-0400 [DataFileSync] flushing mmaps took 0ms for 6 files
I then edited /etc/mongod.conf and changed verbose to true. Now, in addition to the lines above, all I get are some TTLMonitor lines that look normal:
2015-05-27T18:33:12.435-0400 [DataFileSync] flushing mmaps took 0ms for 6 files
2015-05-27T18:33:12.495-0400 [TTLMonitor] query admin.system.indexes query: { expireAfterSeconds: { $exists: true } } planSummary: COLLSCAN ntoreturn:0 ntoskip:0 nscanned:3 nscannedObjects:3 keyUpdates:0 numYields:0 locks(micros) r:315 nreturned:0 reslen:20 0ms
2015-05-27T18:33:12.496-0400 [TTLMonitor] query dtnajobs.system.indexes query: { expireAfterSeconds: { $exists: true } } planSummary: COLLSCAN ntoreturn:0 ntoskip:0 nscanned:5 nscannedObjects:5 keyUpdates:0 numYields:0 locks(micros) r:51 nreturned:0 reslen:20 0ms
2015-05-27T18:33:12.496-0400 [TTLMonitor] query local.system.indexes query: { expireAfterSeconds: { $exists: true } } planSummary: COLLSCAN ntoreturn:0 ntoskip:0 nscanned:1 nscannedObjects:1 keyUpdates:0 numYields:0 locks(micros) r:74 nreturned:0 reslen:20 0ms
I found a MongoDB documentation page that says to try the following for this kind of crash:
sudo grep mongod /var/log/messages
sudo grep score /var/log/messages
/var/log/messages does not exist on this machine, so I ran grep against all of the log files instead and found nothing relevant.
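A search across all log files can be sketched roughly as follows. This is only an illustration: the search_logs helper name, the /var/log default directory, and the pattern list (mongod plus common kernel OOM-killer phrases) are assumptions, not the exact command that was run.

```shell
# Hypothetical helper: recursively search a log directory for lines that
# mention mongod or typical kernel OOM-killer messages.
# The default directory (/var/log) and the patterns are assumptions.
search_logs() {
    log_dir="${1:-/var/log}"
    # -r: recurse, -I: skip binary files, -i: ignore case, -E: extended regex
    grep -rIiE 'mongod|out of memory|oom-killer|killed process' "$log_dir" 2>/dev/null
}
```

Calling search_logs with no argument searches /var/log; reading some of those files may require sudo. Compressed, rotated logs (.gz) are not matched by plain grep and would need zgrep.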
When I run the mongo command-line client, I do get this warning:
2015-05-27T18:36:24.216-0400 [initandlisten] ** WARNING: You are running in OpenVZ which can cause issues on versions of RHEL older than RHEL6.
That may be what I should be looking into.
I really need a way to identify what the problem is and fix it. Any suggestions are welcome.