MongoDB secondary crashes during initial sync because of too many journal files on RAID 10


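My secondary database server failed, so I am bringing up a replacement secondary and trying to perform an initial sync. I have been following the tutorials and the recommendation to use RAID 10 on Amazon EBS.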

So I set up 4x4GB EBS volumes in RAID 10 with the following layout (this is what MongoDB recommended at the time):

sudo lvcreate -l 90%vg -n data vg0
sudo lvcreate -l 5%vg -n log vg0
sudo lvcreate -l 5%vg -n journal vg0

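Since my primary is getting old (v3.2), I am trying to upgrade to 3.4 at the same time, so I started the new secondary on 3.4 (mentioning it in case this is related to the problem).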

The problem is that during the initial sync MongoDB fills /journal with too many journal files: four files of 100MB each get allocated.

ec2-user@secondary$ ll /journal/
total 369105
drwx------ 2 root   root       12288 Apr  3 14:47 lost+found
-rw-r--r-- 1 mongod mongod 104644096 Apr  3 19:00 WiredTigerLog.0000000001
-rw-r--r-- 1 mongod mongod 104685568 Apr  3 19:00 WiredTigerLog.0000000002
-rw-r--r-- 1 mongod mongod 104857600 Apr  3 19:00 WiredTigerLog.0000000003
-rw-r--r-- 1 mongod mongod 104857600 Apr  3 19:00 WiredTigerLog.0000000004
-rw-r--r-- 1 mongod mongod         0 Apr  3 19:00 WiredTigerTmplog.0000000005

This exceeds the disk space allocated to the journal and causes a hard crash during the initial sync:
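To make the space budget concrete, here is a rough back-of-the-envelope calculation in shell arithmetic. It assumes that "4GB" means 4 GiB per volume and that RAID 10 leaves half of the raw capacity usable, and it ignores LVM extent rounding and filesystem overhead (lost+found, reserved blocks, metadata). The file sizes are the ones from the /journal listing.

```shell
#!/bin/sh
# Assumption: four EBS volumes of 4 GiB each, striped and mirrored (RAID 10).
raw_bytes=$((4 * 4 * 1024 * 1024 * 1024))
usable_bytes=$((raw_bytes / 2))              # mirroring halves the raw capacity

# The journal logical volume was created with "-l 5%vg".
journal_lv_bytes=$((usable_bytes * 5 / 100))

# Sizes of the four WiredTigerLog.* files from the /journal listing.
journal_files_bytes=$((104644096 + 104685568 + 104857600 + 104857600))

headroom_bytes=$((journal_lv_bytes - journal_files_bytes))

echo "journal volume (upper bound): $journal_lv_bytes bytes"   # 429496729
echo "existing journal files:       $journal_files_bytes bytes" # 419044864
echo "headroom before overhead:     $headroom_bytes bytes"      # 10451865
```

Under these assumptions the four files already take up almost the whole volume, so a fifth 100MB file cannot fit even before filesystem overhead is counted, which is consistent with the "No space left on device" error.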

2018-04-03T19:00:18.821+0000 E STORAGE  [thread2] WiredTiger error (28) [1522782018:821142][6176:0x7efc0cd3d700], log-server: /data/journal/WiredTigerTmplog.0000000005: handle-write: pwrite: failed to write 128 bytes at offset 0: No space left on device
2018-04-03T19:00:18.821+0000 E STORAGE  [thread2] WiredTiger error (28) [1522782018:821213][6176:0x7efc0cd3d700], log-server: journal/WiredTigerTmplog.0000000005: fatal log failure: No space left on device
2018-04-03T19:00:18.821+0000 E STORAGE  [thread2] WiredTiger error (-31804) [1522782018:821228][6176:0x7efc0cd3d700], log-server: the process must exit and restart: WT_PANIC: WiredTiger library panic
2018-04-03T19:00:18.821+0000 I -        [InitialSyncInserters-my_job_glasses_production.ahoy_events0] Fatal Assertion 28559 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 64
2018-04-03T19:00:18.821+0000 I -        [InitialSyncInserters-my_job_glasses_production.ahoy_events0]

***aborting after fassert() failure


2018-04-03T19:00:18.821+0000 I -        [thread2] Fatal Assertion 28558 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 365
2018-04-03T19:00:18.821+0000 I -        [thread2]

***aborting after fassert() failure

I don't really understand why this happens, because on my primary there are only two journal files of 100MB each, so I assumed everything would be fine:

ec2-user@primary$ ll /data/journal/ -h
total 205M
-rw-r--r-- 1 mongod mongod 4.1M Apr  3 18:49 WiredTigerLog.0000000059
-rw-r--r-- 1 mongod mongod 100M Apr  3 16:43 WiredTigerPreplog.0000000001
-rw-r--r-- 1 mongod mongod 100M Apr  3 16:43 WiredTigerPreplog.0000000002

Am I missing something, or is something wrong with my setup? Here is my mongod.conf:

systemLog:
  destination: file
  logAppend: true
  path: /log/mongod.log
  logRotate: reopen

storage:
  dbPath: /data
  journal:
    enabled: true

processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid

net:
  port: 27017
  #bindIp added accordingly

security:
  authorization: enabled
  keyFile: /xxx.key

replication:
  replSetName: XXX

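Edit: it seems that during the initial sync MongoDB creates up to a dozen files of 100MB each before going back down to 4x100MB files. Where is this documented? Is there a way to limit it?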
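For anyone who wants to watch this happen, here is a minimal sketch of how the journal directory could be monitored while the sync runs. The function names `journal_file_count` and `report_journal` are hypothetical helpers made up for this post, and the file name pattern is only inferred from the listings above (WiredTigerLog.*, WiredTigerPreplog.*, WiredTigerTmplog.*).

```shell
#!/bin/sh
# Hypothetical helper: count the WiredTiger journal files directly inside
# the given directory. The pattern matches WiredTigerLog.*,
# WiredTigerPreplog.* and WiredTigerTmplog.*; subdirectories such as
# lost+found are skipped by -maxdepth 1 and -type f.
journal_file_count() {
  find "$1" -maxdepth 1 -type f -name 'WiredTiger*og.*' | wc -l | tr -d ' '
}

# Hypothetical helper: print a timestamp, the current file count, and the
# df line for the filesystem that holds the directory.
report_journal() {
  echo "$(date '+%H:%M:%S') journal files: $(journal_file_count "$1")"
  df -k "$1" | tail -n 1
}

# Usage (on the secondary, while the initial sync is in progress):
#   while true; do report_journal /journal; sleep 5; done
```

This only observes the directory; it does not limit how many files WiredTiger allocates.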
