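I just installed Ubuntu 18.04 Server. I am trying to build it to replace a server that died.
Whenever I try to do anything I run into trouble: the OOM killer terminates my login sessions, my rsync, and many other things.
I have 64G of RAM and a 64G swap file. I have two 1.2TB disks.
I tried using "free -h" and "vmstat" to look for a memory leak, but saw nothing useful.
Here is a snippet from kern.log, starting at the end of a fresh boot and running to the first "invoked oom-killer" event. Eventually my own processes (rsync or my ssh login) get killed. I have never managed to complete an rsync.
Does the "lowmem_reserve[]: 0 0 0 0" message indicate what is going on?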
Feb 20 10:28:08 forest3 kernel: [ 10.516328] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
Feb 20 10:28:08 forest3 kernel: [ 11.359022] EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: (null)
Feb 20 10:28:08 forest3 kernel: [ 11.375396] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
Feb 20 10:28:08 forest3 kernel: [ 11.870338] IPv6: ADDRCONF(NETDEV_UP): eno4: link is not ready
Feb 20 10:28:08 forest3 kernel: [ 12.414585] audit: type=1400 audit(1613838486.469:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/lxc-start" pid=921 comm="apparmor_parser"
Feb 20 10:28:08 forest3 kernel: [ 12.436645] audit: type=1400 audit(1613838486.493:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default" pid=919 comm="apparmor_parser"
Feb 20 10:28:08 forest3 kernel: [ 12.436648] audit: type=1400 audit(1613838486.493:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-cgns" pid=919 comm="apparmor_parser"
Feb 20 10:28:08 forest3 kernel: [ 12.436651] audit: type=1400 audit(1613838486.493:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-with-mounting" pid=919 comm="apparmor_parser"
Feb 20 10:28:08 forest3 kernel: [ 12.436653] audit: type=1400 audit(1613838486.493:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-with-nesting" pid=919 comm="apparmor_parser"
Feb 20 10:28:08 forest3 kernel: [ 12.484707] audit: type=1400 audit(1613838486.541:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/snapd/snap-confine" pid=923 comm="apparmor_parser"
Feb 20 10:28:08 forest3 kernel: [ 12.484710] audit: type=1400 audit(1613838486.541:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=923 comm="apparmor_parser"
Feb 20 10:28:08 forest3 kernel: [ 12.528993] audit: type=1400 audit(1613838486.585:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/tcpdump" pid=925 comm="apparmor_parser"
Feb 20 10:28:08 forest3 kernel: [ 12.530646] audit: type=1400 audit(1613838486.585:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/sbin/dhclient" pid=920 comm="apparmor_parser"
Feb 20 10:28:08 forest3 kernel: [ 12.530650] audit: type=1400 audit(1613838486.585:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=920 comm="apparmor_parser"
Feb 20 10:28:08 forest3 kernel: [ 13.466381] Adding 67108860k swap on /users2/swapfile. Priority:-2 extents:43 across:74950140k FS
Feb 20 10:28:09 forest3 kernel: [ 15.395795] new mount options do not match the existing superblock, will be ignored
Feb 20 10:28:14 forest3 kernel: [ 20.328485] igb 0000:07:00.1 eno4: igb: eno4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 20 10:28:14 forest3 kernel: [ 20.328697] IPv6: ADDRCONF(NETDEV_CHANGE): eno4: link becomes ready
Feb 20 10:30:35 forest3 kernel: [ 162.417980] EXT4-fs (sdc): mounted filesystem with ordered data mode. Opts: (null)
Feb 20 10:32:04 forest3 kernel: [ 250.474806] rsync invoked oom-killer: gfp_mask=0x15000c0(GFP_KERNEL_ACCOUNT), nodemask=(null), order=0, oom_score_adj=0
Feb 20 10:32:04 forest3 kernel: [ 250.474808] rsync cpuset=/ mems_allowed=0
Feb 20 10:32:04 forest3 kernel: [ 250.474813] CPU: 4 PID: 1986 Comm: rsync Tainted: G W 4.15.0-135-generic #139-Ubuntu
Feb 20 10:32:04 forest3 kernel: [ 250.474814] Hardware name: Dell Inc. PowerEdge R620/0PXXHP, BIOS 2.1.2 09/19/2013
Feb 20 10:32:04 forest3 kernel: [ 250.474815] Call Trace:
Feb 20 10:32:04 forest3 kernel: [ 250.474823] dump_stack+0x60/0x7e
Feb 20 10:32:04 forest3 kernel: [ 250.474827] dump_header+0x5a/0x229
Feb 20 10:32:04 forest3 kernel: [ 250.474830] ? ___ratelimit+0x79/0xf0
Feb 20 10:32:04 forest3 kernel: [ 250.474832] oom_kill_process+0x20a/0x3e0
Feb 20 10:32:04 forest3 kernel: [ 250.474834] out_of_memory+0xe9/0x2a0
Feb 20 10:32:04 forest3 kernel: [ 250.474836] __alloc_pages_slowpath+0xb05/0xbb0
Feb 20 10:32:04 forest3 kernel: [ 250.474839] __alloc_pages_nodemask+0x269/0x290
Feb 20 10:32:04 forest3 kernel: [ 250.474842] alloc_skb_with_frags+0xce/0x190
Feb 20 10:32:04 forest3 kernel: [ 250.474845] sock_alloc_send_pskb+0x1c3/0x1f0
Feb 20 10:32:04 forest3 kernel: [ 250.474848] ? _cond_resched+0x17/0x40
Feb 20 10:32:04 forest3 kernel: [ 250.474851] unix_stream_sendmsg+0x199/0x330
Feb 20 10:32:04 forest3 kernel: [ 250.474853] ? unix_getname+0xb0/0xb0
Feb 20 10:32:04 forest3 kernel: [ 250.474855] sock_sendmsg+0x32/0x40
Feb 20 10:32:04 forest3 kernel: [ 250.474856] sock_write_iter+0x8b/0xe0
Feb 20 10:32:04 forest3 kernel: [ 250.474859] new_sync_write+0xd0/0x130
Feb 20 10:32:04 forest3 kernel: [ 250.474861] ? sock_sendmsg+0x40/0x40
Feb 20 10:32:04 forest3 kernel: [ 250.474863] __vfs_write+0x37/0x50
Feb 20 10:32:04 forest3 kernel: [ 250.474865] vfs_write+0x94/0x1a0
Feb 20 10:32:04 forest3 kernel: [ 250.474867] SyS_write+0x4f/0xd0
Feb 20 10:32:04 forest3 kernel: [ 250.474870] do_fast_syscall_32+0x7f/0x200
Feb 20 10:32:04 forest3 kernel: [ 250.474873] entry_SYSENTER_32+0x68/0xbb
Feb 20 10:32:04 forest3 kernel: [ 250.474874] EIP: 0xb7f33d09
Feb 20 10:32:04 forest3 kernel: [ 250.474875] EFLAGS: 00000246 CPU: 4
Feb 20 10:32:04 forest3 kernel: [ 250.474876] EAX: ffffffda EBX: 00000004 ECX: 01362e50 EDX: 00008008
Feb 20 10:32:04 forest3 kernel: [ 250.474877] ESI: 00002268 EDI: 004d4f30 EBP: 004d2cb8 ESP: bfcaa2d0
Feb 20 10:32:04 forest3 kernel: [ 250.474878] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
Feb 20 10:32:04 forest3 kernel: [ 250.474880] ? nmi+0x8b/0x198
Feb 20 10:32:04 forest3 kernel: [ 250.474880] Mem-Info:
Feb 20 10:32:04 forest3 kernel: [ 250.474884] active_anon:4986 inactive_anon:4699 isolated_anon:0
Feb 20 10:32:04 forest3 kernel: [ 250.474884] active_file:7234 inactive_file:446544 isolated_file:0
Feb 20 10:32:04 forest3 kernel: [ 250.474884] unevictable:0 dirty:4 writeback:0 unstable:0
Feb 20 10:32:04 forest3 kernel: [ 250.474884] slab_reclaimable:11170 slab_unreclaimable:8935
Feb 20 10:32:04 forest3 kernel: [ 250.474884] mapped:7622 shmem:321 pagetables:375 bounce:0
Feb 20 10:32:04 forest3 kernel: [ 250.474884] free:15807387 free_pcp:355 free_cma:0
Feb 20 10:32:04 forest3 kernel: [ 250.474887] Node 0 active_anon:19944kB inactive_anon:18796kB active_file:28936kB inactive_file:1786176kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:30488kB dirty:16kB writeback:0kB shmem:1284kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
Feb 20 10:32:04 forest3 kernel: [ 250.474889] DMA free:1180kB min:780kB low:972kB high:1164kB active_anon:0kB inactive_anon:0kB active_file:8kB inactive_file:0kB unevictable:0kB writepending:0kB present:15980kB managed:15904kB mlocked:0kB kernel_stack:112kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Feb 20 10:32:04 forest3 kernel: [ 250.474890] lowmem_reserve[]: 0 106 63687 63687
Feb 20 10:32:04 forest3 kernel: [ 250.474894] Normal free:5316kB min:5340kB low:6672kB high:8004kB active_anon:0kB inactive_anon:0kB active_file:3048kB inactive_file:2956kB unevictable:0kB writepending:0kB present:890872kB managed:163460kB mlocked:0kB kernel_stack:1560kB pagetables:0kB bounce:0kB free_pcp:976kB local_pcp:24kB free_cma:0kB
Feb 20 10:32:04 forest3 kernel: [ 250.474894] lowmem_reserve[]: 0 0 508647 508647
Feb 20 10:32:04 forest3 kernel: [ 250.474898] HighMem free:63223052kB min:512kB low:800216kB high:1599920kB active_anon:19944kB inactive_anon:18796kB active_file:25840kB inactive_file:1783140kB unevictable:0kB writepending:16kB present:65106888kB managed:65106888kB mlocked:0kB kernel_stack:0kB pagetables:1500kB bounce:0kB free_pcp:444kB local_pcp:32kB free_cma:0kB
Feb 20 10:32:04 forest3 kernel: [ 250.474898] lowmem_reserve[]: 0 0 0 0
Feb 20 10:32:04 forest3 kernel: [ 250.474900] DMA: 39*4kB (UME) 28*8kB (UE) 28*16kB (UME) 7*32kB (UM) 2*64kB (ME) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1180kB
Feb 20 10:32:04 forest3 kernel: [ 250.474905] Normal: 287*4kB (UME) 175*8kB (UME) 69*16kB (U) 34*32kB (UME) 9*64kB (UME) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 5316kB
Feb 20 10:32:04 forest3 kernel: [ 250.474910] HighMem: 75*4kB (UM) 58*8kB (U) 43*16kB (UM) 19*32kB (UM) 4*64kB (UM) 2*128kB (UM) 1*256kB (M) 3*512kB (UM) 3*1024kB (UM) 25*2048kB (UM) 15421*4096kB (UM) = 63223052kB
Feb 20 10:32:04 forest3 kernel: [ 250.474917] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Feb 20 10:32:04 forest3 kernel: [ 250.474918] 454111 total pagecache pages
Feb 20 10:32:04 forest3 kernel: [ 250.474928] 10 pages in swap cache
Feb 20 10:32:04 forest3 kernel: [ 250.474929] Swap cache stats: add 133, delete 123, find 0/0
Feb 20 10:32:04 forest3 kernel: [ 250.474930] Free swap = 67108084kB
Feb 20 10:32:04 forest3 kernel: [ 250.474930] Total swap = 67108860kB
Feb 20 10:32:04 forest3 kernel: [ 250.474931] 16503435 pages RAM
Feb 20 10:32:04 forest3 kernel: [ 250.474932] 16276722 pages HighMem/MovableOnly
Feb 20 10:32:04 forest3 kernel: [ 250.474932] 181872 pages reserved
Feb 20 10:32:04 forest3 kernel: [ 250.474933] 0 pages cma reserved
Feb 20 10:32:04 forest3 kernel: [ 250.474933] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Feb 20 10:32:04 forest3 kernel: [ 250.474938] [ 489] 0 489 8631 2185 73728 0 0 systemd-journal
Feb 20 10:32:04 forest3 kernel: [ 250.474940] [ 508] 0 508 3517 452 49152 0 0 lvmetad
Feb 20 10:32:04 forest3 kernel: [ 250.474941] [ 516] 0 516 4432 1140 49152 0 -1000 systemd-udevd
Feb 20 10:32:04 forest3 kernel: [ 250.474943] [ 787] 62583 787 4799 730 61440 0 0 systemd-timesyn
Feb 20 10:32:04 forest3 kernel: [ 250.474944] [ 812] 100 812 2952 999 53248 0 0 systemd-network
Feb 20 10:32:04 forest3 kernel: [ 250.474945] [ 864] 101 864 2607 996 53248 0 0 systemd-resolve
Feb 20 10:32:04 forest3 kernel: [ 250.474947] [ 1001] 0 1001 8256 3310 86016 34 0 networkd-dispat
Feb 20 10:32:04 forest3 kernel: [ 250.474949] [ 1009] 102 1009 6467 985 61440 0 0 rsyslogd
Feb 20 10:32:04 forest3 kernel: [ 250.474950] [ 1014] 0 1014 10054 1591 77824 0 0 accounts-daemon
Feb 20 10:32:04 forest3 kernel: [ 250.474951] [ 1018] 103 1018 1702 908 49152 0 -900 dbus-daemon
Feb 20 10:32:04 forest3 kernel: [ 250.474953] [ 1085] 0 1085 934 498 40960 0 0 atd
Feb 20 10:32:04 forest3 kernel: [ 250.474954] [ 1106] 0 1106 1458 693 45056 0 0 cron
Feb 20 10:32:04 forest3 kernel: [ 250.474955] [ 1108] 0 1108 5139 356 49152 0 0 lxcfs
Feb 20 10:32:04 forest3 kernel: [ 250.474956] [ 1114] 0 1114 2591 1264 57344 0 0 systemd-logind
Feb 20 10:32:04 forest3 kernel: [ 250.474958] [ 1118] 0 1118 4155 761 53248 0 0 irqbalance
Feb 20 10:32:04 forest3 kernel: [ 250.474959] [ 1132] 0 1132 9559 1500 73728 0 0 polkitd
Feb 20 10:32:04 forest3 kernel: [ 250.474960] [ 1162] 0 1162 9389 3967 94208 48 0 unattended-upgr
Feb 20 10:32:04 forest3 kernel: [ 250.474961] [ 1421] 0 1421 2639 1197 53248 10 -1000 sshd
Feb 20 10:32:04 forest3 kernel: [ 250.474963] [ 1468] 0 1468 1296 858 45056 3 0 login
Feb 20 10:32:04 forest3 kernel: [ 250.474964] [ 1776] 1000 1776 3098 1613 57344 0 0 systemd
Feb 20 10:32:04 forest3 kernel: [ 250.474966] [ 1787] 1000 1787 3596 416 57344 4 0 (sd-pam)
Feb 20 10:32:04 forest3 kernel: [ 250.474967] [ 1809] 1000 1809 1730 1018 49152 0 0 bash
Feb 20 10:32:04 forest3 kernel: [ 250.474968] [ 1819] 0 1819 1862 971 53248 0 0 sudo
Feb 20 10:32:04 forest3 kernel: [ 250.474969] [ 1821] 0 1821 1725 830 49152 0 0 su
Feb 20 10:32:04 forest3 kernel: [ 250.474971] [ 1822] 0 1822 1730 1032 45056 0 0 bash
Feb 20 10:32:04 forest3 kernel: [ 250.474972] [ 1837] 0 1837 2891 1534 57344 0 0 sshd
Feb 20 10:32:04 forest3 kernel: [ 250.474973] [ 1929] 1000 1929 2891 847 53248 0 0 sshd
Feb 20 10:32:04 forest3 kernel: [ 250.474974] [ 1930] 1000 1930 1729 974 45056 0 0 bash
Feb 20 10:32:04 forest3 kernel: [ 250.474975] [ 1942] 0 1942 1862 968 49152 0 0 sudo
Feb 20 10:32:04 forest3 kernel: [ 250.474977] [ 1943] 0 1943 1725 868 49152 0 0 su
Feb 20 10:32:04 forest3 kernel: [ 250.474978] [ 1944] 0 1944 1762 1070 45056 0 0 bash
Feb 20 10:32:04 forest3 kernel: [ 250.474979] [ 1982] 0 1982 1122 150 45056 0 0 tail
Feb 20 10:32:04 forest3 kernel: [ 250.474980] [ 1986] 0 1986 9403 1102 114688 0 0 rsync
Feb 20 10:32:04 forest3 kernel: [ 250.474982] [ 1987] 0 1987 9303 747 106496 0 0 rsync
Feb 20 10:32:04 forest3 kernel: [ 250.474983] [ 1988] 0 1988 9238 587 110592 0 0 rsync
Feb 20 10:32:04 forest3 kernel: [ 250.474984] [ 1992] 0 1992 2891 1561 57344 0 0 sshd
Feb 20 10:32:04 forest3 kernel: [ 250.474986] [ 2079] 1000 2079 2891 739 57344 0 0 sshd
Feb 20 10:32:04 forest3 kernel: [ 250.474987] [ 2080] 1000 2080 1729 1057 49152 0 0 bash
Feb 20 10:32:04 forest3 kernel: [ 250.474988] Out of memory: Kill process 1162 (unattended-upgr) score 0 or sacrifice child
Feb 20 10:32:04 forest3 kernel: [ 250.475042] Killed process 1162 (unattended-upgr) total-vm:37556kB, anon-rss:5016kB, file-rss:10852kB, shmem-rss:0kB
Feb 20 10:32:04 forest3 kernel: [ 250.492223] oom_reaper: reaped process 1162 (unattended-upgr), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
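Each per-zone line in the Mem-Info dump above lists free memory next to the min watermark the kernel tries to keep in reserve. As a rough reading aid, here is a minimal sketch that pulls those two values out of kern.log lines and flags any zone whose free memory is below min. The function name zone_pressure is made up for this example, and it assumes the log line layout shown above.

```shell
# Read kern.log lines on stdin and print, for every per-zone line,
# the zone name, its free and min values, and whether free < min.
zone_pressure() {
  awk '
  {
    for (i = 2; i < NF; i++) {
      # Per-zone lines contain "<zone> free:<n>kB min:<n>kB ...".
      # The Mem-Info summary line ("free:<pages> free_pcp:...") is skipped
      # because its free value has no kB suffix and is not followed by min.
      if ($i ~ /^free:[0-9]+kB$/ && $(i + 1) ~ /^min:[0-9]+kB$/) {
        free = $i
        sub(/^free:/, "", free); sub(/kB$/, "", free)
        min = $(i + 1)
        sub(/^min:/, "", min); sub(/kB$/, "", min)
        state = (free + 0 < min + 0) ? "BELOW_MIN" : "ok"
        printf "%s free=%skB min=%skB %s\n", $(i - 1), free, min, state
      }
    }
  }'
}
```

Run against the excerpt above, for example with `grep -A60 'invoked oom-killer' /var/log/kern.log | zone_pressure`, this should flag the Normal zone (free 5316kB against min 5340kB) while HighMem still has about 60G free. A HighMem zone only exists on 32-bit kernels, so checking `uname -m` may also be worthwhile; that is an inference from the log, not something stated in the thread.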
Here is what free shows while rsync is running, along with the swapon -s output.
root@forest3:~# free -h
total used free shared buff/cache available
Mem: 62G 178M 60G 588K 1.7G 59G
Swap: 63G 4.3M 63G
root@forest3:~#
root@forest3:~# swapon -s
Filename Type Size Used Priority
/users2/swapfile file 67108860 4664 -2
root@forest3:~#
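I created /users2/swapfile by appending to it with a series of "dd" commands. I did it in several steps in the hope that the process would not be killed while the new swap file was being created. The original swap file was the default 2G file. After creating the new file, I turned swap off, ran mkswap on the new file, and turned swap back on pointing at the new file. I also updated /etc/fstab to use the new file so that it persists across reboots.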
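The approach described above, growing the file with several short dd runs rather than one long one, can be sketched as follows. The function name and the chunk sizes are illustrative assumptions, since the original commands were not posted. The swap activation steps are left as comments because they need root.

```shell
# Build a zero-filled file by appending fixed-size chunks with dd.
# $1 = target path, $2 = chunk size in MiB, $3 = number of chunks
build_file_in_chunks() {
  target=$1
  chunk_mib=$2
  chunks=$3
  : > "$target"          # create or truncate the file
  chmod 600 "$target"    # swap files should not be readable by other users
  i=0
  while [ "$i" -lt "$chunks" ]; do
    # Each short dd run appends one chunk to the end of the file.
    dd if=/dev/zero bs=1M count="$chunk_mib" >> "$target" 2>/dev/null || return 1
    i=$((i + 1))
  done
}

# Assumed sizes: 64 chunks of 1024 MiB would give a 64G file.
# build_file_in_chunks /users2/swapfile 1024 64
# sudo swapoff -a
# sudo mkswap /users2/swapfile
# sudo swapon /users2/swapfile
```

Keeping each dd run short limits how much work is lost if one of them is killed, although the loop above restarts from an empty file rather than resuming.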
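To find out where the "new mount options..." message came from, I tried unmounting /backups and mounting it again, then turning swap off and back on. These are the kern.log messages those actions added: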
Feb 20 12:54:32 forest3 kernel: [ 4289.790003] EXT4-fs (sdc): mounted filesystem with ordered data mode. Opts: (null)
Feb 20 12:54:45 forest3 kernel: [ 4302.387143] Adding 67108860k swap on /users2/swapfile. Priority:-2 extents:43 across:74950140k FS
I also read some posts about vm.overcommit and tried changing the defaults. Things seemed to improve, but I still get a lot of OOM killer events. This is my /etc/sysctl.d/10-no-overcommit.conf:
## default
#vm.overcommit_memory = 0
#vm.overcommit_ratio = 50
# Try:
vm.overcommit_memory = 2
vm.overcommit_ratio = 100
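With vm.overcommit_memory = 2 the kernel refuses new allocations once committed memory reaches CommitLimit, which (ignoring huge pages) is roughly swap plus RAM times overcommit_ratio / 100. Below is a small sketch of that arithmetic. The function name is an assumption for illustration, and the RAM figure is approximated from the "pages RAM" line in the log; the live value can be compared with `grep Commit /proc/meminfo`.

```shell
# Approximate CommitLimit in kB for vm.overcommit_memory = 2.
# $1 = RAM in kB, $2 = swap in kB, $3 = vm.overcommit_ratio (percent)
commit_limit_kb() {
  ram_kb=$1
  swap_kb=$2
  ratio=$3
  echo $(( ram_kb * ratio / 100 + swap_kb ))
}

# Values taken from the log above: 16503435 pages of 4 kB, 67108860 kB swap.
commit_limit_kb $((16503435 * 4)) 67108860 100
```

With ratio 100 and swap about as large as RAM, the limit comes out near twice the physical memory, so this setting by itself is unlikely to be what restricts a process using a few megabytes.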
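My current state is that I have 4 machines I am experimenting with. Each machine is the same chassis with the same CPU and memory. forest2 has four 1.2TB disks; the others have two 1.2TB disks. All machines run kernel 4.15.0-135-generic. After upgrading the BIOS, I re-ran the rsync test on all the machines that have the new BIOS.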
Name | Ubuntu | BIOS | memtest | rsync test | swap | swap size | swappiness | install notes |
---|---|---|---|---|---|---|---|---|
forest2 | 18.04 Desktop | old | | pass | partition | 64G | 60 | 14.04->16.04->18.04 |
forest3 | 18.04 Server | new | | fail | file: /swapfile | 4G | 10 | 18.04 mini |
forest3' | 18.04 Server | new | | pass | file: /swapfile | 8G | 60 | 18.04.5-live-server-amd64.iso |
forest4 | 18.04 Server | new | passed | fail | file: /swapfile | 4G | 60 | 18.04 mini |
forest4' | 18.04 Server | new | passed | pass | partition | 8G | 60 | 14.04->16.04->18.04 |
forest5 | 18.04 Server | new | | fail | partition | 8G | 60 | 18.04 mini |
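The difference between forest3 and forest3' is that forest3' was built directly from 18.04.5-live-server-amd64.iso rather than from the 18.04 server mini ISO. This new machine works fine!
The difference between forest4 and forest4' is that forest4' was built starting from the Ubuntu 14.04 server install CD, then release-upgraded to 16.04, then release-upgraded to 18.04.
forest2 was built from the 14.04 desktop install DVD, then upgraded to 16.04, then to 18.04.
All machines have 64G of RAM. The original question was about forest3 when it had 64G of swap. Since then it has been changed to 4G of swap.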
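Answer 1
BIOS
Dell Inc. PowerEdge R620/0PXXHP
You have BIOS version 2.1.2, released September 19, 2013. A newer BIOS, version 2.9.0 (released February 21, 2020), is available; you can download it here.
Note: confirm that I have the correct web page for your model.
Note: make a backup before updating the BIOS.
/swapfile
64G of swap is crazy.
Note: incorrect use of the dd command can cause data loss. Copy/paste is recommended.
In the terminal...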
sudo swapoff -a # turn off swap
sudo rm -i /users2/swapfile # remove old /swapfile
sudo dd if=/dev/zero of=/swapfile bs=1M count=4096
sudo chmod 600 /swapfile # set proper file protections
sudo mkswap /swapfile # init /swapfile
sudo swapon /swapfile # turn on swap
free -h # confirm 64G RAM and 4G swap
Confirm that this /swapfile line is at the end of /etc/fstab... and confirm that there are no other "swap" lines...
To edit, use sudo -H gedit /etc/fstab
or sudo pico /etc/fstab
/swapfile none swap sw 0 0
reboot # reboot and verify operation
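Before rebooting, the fstab check described above can be done by eye, or with something like the following sketch. Both function names are hypothetical, and the check only looks at the filesystem type field of non-comment lines.

```shell
# Print the device/file field of every active (non-comment) fstab entry
# whose type field is "swap".
# $1 = path to an fstab-format file (defaults to /etc/fstab)
list_swap_entries() {
  awk '$1 !~ /^#/ && $3 == "swap" { print $1 }' "${1:-/etc/fstab}"
}

# Succeed only when the single swap entry is the expected file.
# $1 = expected swap path, $2 = fstab path (optional)
only_swap_is() {
  [ "$(list_swap_entries "${2:-/etc/fstab}")" = "$1" ]
}
```

For example, `only_swap_is /swapfile && echo "fstab has a single /swapfile entry"` prints the message only if no stale entry, such as the old /users2/swapfile line, is still present.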
Note: set these back to the defaults...
## default
#vm.overcommit_memory = 0
#vm.overcommit_ratio = 50
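Note: optional... set vm.swappiness=10
Update #1:
Go to https://www.memtest86.com/ and download/run their free memtest to test your memory. Complete all 4/4 tests at least once to confirm that the memory is good. This may take several hours.
Update #2:
All of the affected servers had been installed from the 18.04 mini ISO. After installing from 18.04.5-live-server-amd64.iso, everything now seems to work correctly.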
Answer 2
From kern.log,
Feb 20 10:32:04 forest3 kernel: [ 250.474988] Out of memory: Kill process 1162 (unattended-upgr) score 0 or sacrifice child
Feb 20 10:32:04 forest3 kernel: [ 250.475042] Killed process 1162 (unattended-upgr) total-vm:37556kB, anon-rss:5016kB, file-rss:10852kB, shmem-rss:0kB
Feb 20 10:32:04 forest3 kernel: [ 250.492223] oom_reaper: reaped process 1162 (unattended-upgr), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
I think this question has been answered before.
http://unix.stackexchange.com/questions/374748/ddg#399531
The solution is to stop unattended upgrades and rsync until the system has stabilized. Both can generate heavy I/O and CPU load and exhaust memory.
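As a sketch of the first half of that advice, the commands below stop the unattended-upgrades service and the apt timers that trigger upgrade runs on a stock Ubuntu 18.04 install. The wrapper functions and the DRY_RUN variable are made up for this example; with DRY_RUN=1 the commands are only printed, not executed.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Stop the service and the timers until they are started again.
pause_unattended_upgrades() {
  run sudo systemctl stop unattended-upgrades.service
  run sudo systemctl stop apt-daily.timer apt-daily-upgrade.timer
}

# Undo the pause once the system is stable.
resume_unattended_upgrades() {
  run sudo systemctl start apt-daily.timer apt-daily-upgrade.timer
  run sudo systemctl start unattended-upgrades.service
}
```

Using stop rather than disable means the units come back after a reboot, which seems the safer default while testing.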