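I am running Oracle Linux Server release 7.9, with Oracle 12c installed on the machine and a steady load running against it. Overnight, after the test had been running for more than 8-10 hours, the system suddenly became unavailable and we could not reach it over ssh.
We analyzed the zabbix monitoring data but could not find any major issue that would explain this. System CPU, memory, and nearly everything else looked fine, but after Feb 1 03:00:45 zabbix reported no further system statistics. We also noticed in the system messages that the netdata process had been killed: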
Feb 1 03:00:46 dbserver12c kernel: Out of memory: Kill process 40121 (netdata) score 1091 or sacrifice child
Feb 1 03:00:46 dbserver12c kernel: Killed process 40367 (go.d.plugin) total-vm:734908kB, anon-rss:7536kB, file-rss:0kB
Feb 1 03:00:46 dbserver12c kernel: sadc invoked oom-killer: gfp_mask=0x2000d0, order=0, oom_score_adj=0
Below is /var/log/messages from that time.
Feb 1 03:00:45 dbserver12c rtkit-daemon[1119]: The canary thread is apparently starving. Taking action.
Feb 1 03:00:46 dbserver12c journal: Missed 7209 kernel messages
Feb 1 03:00:46 dbserver12c kernel: [18900] 54321 18900 15544518 1030233 6865 0 0 oracle_18900_or
Feb 1 03:00:46 dbserver12c kernel: [18902] 54321 18902 15544520 1747806 11245 0 0 oracle_18902_or
Feb 1 03:00:46 dbserver12c kernel: [18904] 54321 18904 15544540 1440254 10264 0 0 oracle_18904_or
Feb 1 03:00:46 dbserver12c kernel: [18906] 54321 18906 15544518 1440851 11007 0 0 oracle_18906_or
Feb 1 03:00:46 dbserver12c kernel: [18908] 54321 18908 15544260 9409 1556 0 0 oracle_18908_or
Feb 1 03:00:46 dbserver12c kernel: [18910] 54321 18910 15544519 995625 7062 0 0 oracle_18910_or
Feb 1 03:00:46 dbserver12c kernel: [18912] 54321 18912 15544519 2130982 11237 0 0 oracle_18912_or
Feb 1 03:00:46 dbserver12c kernel: [18914] 54321 18914 15544518 1521359 10979 0 0 oracle_18914_or
Feb 1 03:00:46 dbserver12c kernel: [18916] 54321 18916 15544519 1352952 8390 0 0 oracle_18916_or
Feb 1 03:00:46 dbserver12c kernel: [18918] 54321 18918 15544258 8682 1496 0 0 oracle_18918_or
Feb 1 03:00:46 dbserver12c kernel: [18920] 54321 18920 15544520 2030372 11048 0 0 oracle_18920_or
Feb 1 03:00:46 dbserver12c kernel: [18922] 54321 18922 15544519 1149189 8514 0 0 oracle_18922_or
Feb 1 03:00:46 dbserver12c kernel: [18924] 54321 18924 15544518 919024 7466 0 0 oracle_18924_or
Feb 1 03:00:46 dbserver12c kernel: [18926] 54321 18926 15544519 792075 8359 0 0 oracle_18926_or
Feb 1 03:00:46 dbserver12c kernel: [18928] 54321 18928 15544257 7481 1256 0 0 oracle_18928_or
Feb 1 03:00:46 dbserver12c kernel: [18930] 54321 18930 15544520 1933402 11155 0 0 oracle_18930_or
Feb 1 03:00:46 dbserver12c kernel: [18932] 54321 18932 15544519 842684 8489 0 0 oracle_18932_or
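
To make the per-process rows above easier to read, here is a minimal Python sketch that parses them and converts the page counts to MiB. The header line of the OOM task dump is missing from this excerpt, so the column order used here (`[pid] uid tgid total_vm rss nr_ptes swapents oom_score_adj name`) is an assumption based on the layout commonly printed by 3.10-series kernels, and the 4 KiB page size is also assumed. The names `parse_oom_rows` and `PAGE_KIB` are made up for this example.

```python
import re

PAGE_KIB = 4  # assumption: 4 KiB pages (the usual x86_64 default)

# Assumed column order (the header line is not in the excerpt):
# [pid] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
ROW = re.compile(
    r"kernel: \[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+(?P<nr_ptes>\d+)\s+"
    r"(?P<swapents>\d+)\s+(?P<oom_score_adj>-?\d+)\s+(?P<name>\S+)"
)


def parse_oom_rows(lines):
    """Return one dict per OOM task-dump row; other log lines are skipped."""
    rows = []
    for line in lines:
        m = ROW.search(line)
        if m is None:
            continue
        rows.append({
            "pid": int(m.group("pid")),
            "name": m.group("name"),
            # total_vm and rss are assumed to be page counts
            "total_vm_mib": int(m.group("total_vm")) * PAGE_KIB / 1024,
            "rss_mib": int(m.group("rss")) * PAGE_KIB / 1024,
        })
    return rows


# Two rows copied from the log above, plus one unrelated line
sample = [
    "Feb 1 03:00:46 dbserver12c journal: Missed 7209 kernel messages",
    "Feb 1 03:00:46 dbserver12c kernel: [18900] 54321 18900 15544518 1030233 6865 0 0 oracle_18900_or",
    "Feb 1 03:00:46 dbserver12c kernel: [18908] 54321 18908 15544260 9409 1556 0 0 oracle_18908_or",
]

# Print processes sorted by resident size, largest first
for row in sorted(parse_oom_rows(sample), key=lambda r: r["rss_mib"], reverse=True):
    print(f"{row['pid']:>7} {row['name']:<18} "
          f"rss={row['rss_mib']:.0f} MiB vm={row['total_vm_mib']:.0f} MiB")
```

If the column assumption holds, several of the oracle processes show a resident size of multiple GiB each. Note that the rss column probably includes shared memory mapped by each process (for example the SGA when it is not backed by HugePages), so adding the rows together may overcount actual usage.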