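My server has gone down in the middle of the night several times because of this. How much memory does apt-upgrade need? Is this reasonable? I am trying to interpret the log below. Any help or advice is much appreciated!
Amazon EC2 server with 1 GB of RAM, running Ubuntu 18.04.5 LTS (Bionic Beaver).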
Mar 25 06:46:01 ip-172-31-30-204 systemd[1]: Starting Daily apt upgrade and clean activities...
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631907] snapd invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-900
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631911] CPU: 1 PID: 31093 Comm: snapd Not tainted 5.3.0-1032-aws #34~18.04.2-Ubuntu
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631912] Hardware name: Amazon EC2 t3.micro/, BIOS 1.0 10/16/2017
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631912] Call Trace:
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631921] dump_stack+0x6d/0x95
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631925] dump_header+0x4f/0x200
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631927] oom_kill_process+0xe6/0x120
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631929] out_of_memory+0x109/0x510
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631931] __alloc_pages_slowpath+0xad1/0xe10
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631934] ? __switch_to_asm+0x34/0x70
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631936] ? __switch_to_asm+0x40/0x70
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631938] __alloc_pages_nodemask+0x2cd/0x320
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631942] alloc_pages_current+0x6a/0xe0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631944] __page_cache_alloc+0x6a/0xa0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631946] pagecache_get_page+0x9c/0x2b0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631947] filemap_fault+0x66d/0xb60
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631950] ? page_add_file_rmap+0x5e/0x150
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631952] ? alloc_set_pte+0x113/0x5f0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631956] ? xas_load+0xc/0x80
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631957] ? xas_find+0x16f/0x1b0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631959] ? filemap_map_pages+0x18f/0x380
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631961] __do_fault+0x57/0x110
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631963] __handle_mm_fault+0xdd8/0x1260
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631965] handle_mm_fault+0xcb/0x210
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631969] __do_page_fault+0x2a1/0x4d0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631971] do_page_fault+0x2c/0xe0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631974] do_async_page_fault+0x54/0x70
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631977] async_page_fault+0x34/0x40
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631979] RIP: 0033:0x55cf079179d8
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631983] Code: Bad RIP value.
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631984] RSP: 002b:000000c420063b60 EFLAGS: 00010283
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631986] RAX: 000055cf08c6f5a0 RBX: 0000000000000000 RCX: 000055cf0869de80
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631987] RDX: 0000000000000018 RSI: 000000c420606068 RDI: 000000c420606078
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631988] RBP: 000000c420063b80 R08: 000000c420063b68 R09: 00000000000000ff
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631989] R10: 0000000000000001 R11: 000000c4204f9440 R12: 0000000000000001
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631990] R13: 0000000000000040 R14: 00000000000000d9 R15: 0000000000000000
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631992] Mem-Info:
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631995] active_anon:199193 inactive_anon:41 isolated_anon:0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631995] active_file:39 inactive_file:31 isolated_file:0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631995] unevictable:0 dirty:2 writeback:0 unstable:0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631995] slab_reclaimable:8066 slab_unreclaimable:12259
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631995] mapped:13 shmem:190 pagetables:1921 bounce:0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631995] free:12108 free_pcp:452 free_cma:0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631998] Node 0 active_anon:796772kB inactive_anon:164kB active_file:156kB inactive_file:124kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:52kB dirty:8kB writeback:0kB shmem:760kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.631999] Node 0 DMA free:4340kB min:756kB low:944kB high:1132kB active_anon:11088kB inactive_anon:8kB active_file:64kB inactive_file:8kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:48kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632003] lowmem_reserve[]: 0 909 909 909 909
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632005] Node 0 DMA32 free:44092kB min:44296kB low:55368kB high:66440kB active_anon:785684kB inactive_anon:156kB active_file:52kB inactive_file:628kB unevictable:0kB writepending:8kB present:1003496kB managed:960336kB mlocked:0kB kernel_stack:4208kB pagetables:7636kB bounce:0kB free_pcp:1808kB local_pcp:396kB free_cma:0kB
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632009] lowmem_reserve[]: 0 0 0 0 0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632011] Node 0 DMA: 33*4kB (UME) 25*8kB (UME) 13*16kB (UME) 15*32kB (UME) 13*64kB (UME) 4*128kB (UE) 4*256kB (UME) 2*512kB (UM) 0*1024kB 0*2048kB 0*4096kB = 4412kB
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632019] Node 0 DMA32: 1212*4kB (UME) 689*8kB (UME) 764*16kB (UME) 317*32kB (UME) 133*64kB (UME) 25*128kB (UME) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 44440kB
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632027] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632028] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632029] 285 total pagecache pages
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632030] 0 pages in swap cache
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632031] Swap cache stats: add 0, delete 0, find 0/0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632032] Free swap = 0kB
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632032] Total swap = 0kB
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632033] 254872 pages RAM
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632034] 0 pages HighMem/MovableOnly
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632034] 10811 pages reserved
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632035] 0 pages cma reserved
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632035] 0 pages hwpoisoned
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632036] Tasks state (memory values in pages):
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632036] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632042] [ 389] 0 389 32672 2976 282624 0 0 systemd-journal
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632045] [ 408] 0 408 11118 584 118784 0 -1000 systemd-udevd
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632047] [ 432] 0 432 26475 60 98304 0 0 lvmetad
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632049] [ 554] 62583 554 35488 160 184320 0 0 systemd-timesyn
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632051] [ 713] 100 713 20051 187 180224 0 0 systemd-network
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632053] [ 747] 101 747 17696 183 180224 0 0 systemd-resolve
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632054] [ 890] 0 890 17845 393 180224 0 0 systemd-logind
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632056] [ 896] 102 896 66816 448 167936 0 0 rsyslogd
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632058] [ 897] 0 897 27603 82 118784 0 0 irqbalance
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632060] [ 898] 103 898 12580 234 139264 0 -900 dbus-daemon
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632062] [ 919] 0 919 1137 15 61440 0 0 acpid
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632064] [ 922] 0 922 7082 52 106496 0 0 atd
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632065] [ 923] 0 923 7961 75 102400 0 0 cron
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632067] [ 924] 0 924 206403 286 163840 0 0 lxcfs
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632069] [ 927] 0 927 42771 1978 233472 0 0 networkd-dispat
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632071] [ 931] 0 931 18074 190 180224 0 -1000 sshd
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632073] [ 937] 0 937 72220 242 212992 0 0 polkitd
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632074] [ 942] 0 942 46921 1979 253952 0 0 unattended-upgr
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632076] [ 947] 0 947 4102 37 77824 0 0 agetty
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632078] [ 950] 0 950 3721 33 65536 0 0 agetty
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632080] [ 8489] 0 8489 19170 278 188416 0 0 systemd
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632082] [ 8490] 0 8490 47880 618 249856 0 0 (sd-pam)
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632084] [ 21005] 106 21005 7148 45 98304 0 0 uuidd
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632086] [ 22103] 0 22103 71999 237 192512 0 0 accounts-daemon
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632088] [ 12048] 0 12048 217770 2487 229376 0 0 amazon-ssm-agen
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632090] [ 12165] 0 12165 221541 4204 262144 0 0 ssm-agent-worke
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632091] [ 15998] 0 15998 26998 262 245760 0 0 sshd
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632093] [ 30993] 0 30993 215444 4093 262144 0 -900 snapd
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632095] [ 32272] 0 32272 8069 198 94208 0 0 screen
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632097] [ 32274] 0 32274 621975 10325 376832 0 0 java
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632099] [ 32306] 0 32306 726790 135577 1568768 0 0 java
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632101] [ 6081] 0 6081 18075 183 184320 0 0 sshd
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632103] [ 6082] 109 6082 18075 188 176128 0 0 sshd
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632104] [ 6086] 0 6086 1156 17 53248 0 0 apt.systemd.dai
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632106] [ 6101] 0 6101 1156 35 61440 0 0 apt.systemd.dai
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632108] [ 6134] 0 6134 88545 10780 462848 0 0 unattended-upgr
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632109] [ 6153] 0 6153 88545 10796 446464 0 0 unattended-upgr
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632111] [ 6171] 0 6171 22038 15667 221184 0 0 dpkg
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632113] [ 6179] 0 6179 5346 45 81920 0 0 dpkg-deb
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632114] [ 6180] 0 6180 5346 45 77824 0 0 dpkg-deb
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632116] [ 6181] 0 6181 7395 979 94208 0 0 dpkg-deb
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632117] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice,task=java,pid=32306,uid=0
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632288] Out of memory: Killed process 32306 (java) total-vm:2907160kB, anon-rss:542308kB, file-rss:0kB, shmem-rss:0kB
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.681431] oom_reaper: reaped process 32306 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Mar 25 06:46:51 ip-172-31-30-204 systemd[1]: Started Daily apt upgrade and clean activities.
Answer 1
I don't know much about Amazon EC2, but I ran into a similar problem on a DigitalOcean node running Jira (which is written in Java).
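One major issue with languages such as Java (Python, JavaScript, C#, ...) is that they use what is called garbage collection. You never need to free an object yourself; it becomes eligible to be freed once every reference to it has been dropped.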
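One problem can be that the garbage collector does not run as often as would be needed to keep memory use as low as possible. That is, if the server is busy, it will try to process requests before it tries to collect memory. This is one of the parameters you can set when starting a Java program.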
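The other memory options I use are -Xms (initial heap size), -Xmx (maximum heap size) and the like, which tell Java how much memory it is allowed to allocate. Once all of that memory is allocated, the garbage collector is forced to run and reclaim as much as it can. Java may still stop with an error (for example, if you have a leak, or too many simultaneous connections need more than the allowed memory), but with such a limit in place, running APT in parallel should not end up getting your Java application killed.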
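To choose a sensible heap limit, it helps to know how much each process was holding when the OOM killer fired. The following is only a rough sketch for reading the task table in the log from the question: it assumes 4 KiB pages (which matches the kB totals the kernel prints next to the page counts), and the function name `top_rss` and the regular expression are my own, not part of any tool.

```python
import re

PAGE_KIB = 4  # assumption: 4 KiB pages, consistent with the kB totals in the log

# Matches task-table rows such as:
#   "... [ 32306] 0 32306 726790 135577 1568768 0 0 java"
# The kernel timestamp "[17858349.632099]" is skipped because it contains a dot.
ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<pgtables>\d+)\s+(?P<swapents>\d+)\s+(?P<adj>-?\d+)\s+(?P<name>.+)$"
)

def top_rss(log_text, n=3):
    """Return the n largest tasks as (name, pid, resident MiB) tuples."""
    rows = []
    for line in log_text.splitlines():
        m = ROW.search(line)
        if m:
            rss_mib = int(m["rss"]) * PAGE_KIB / 1024  # pages -> KiB -> MiB
            rows.append((m["name"].strip(), int(m["pid"]), round(rss_mib, 1)))
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]

# Three rows copied from the log in the question.
SAMPLE = """\
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632097] [ 32274] 0 32274 621975 10325 376832 0 0 java
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632099] [ 32306] 0 32306 726790 135577 1568768 0 0 java
Mar 25 06:46:32 ip-172-31-30-204 kernel: [17858349.632111] [ 6171] 0 6171 22038 15667 221184 0 0 dpkg
"""

for name, pid, mib in top_rss(SAMPLE):
    print(f"{name:<6} pid={pid:<6} rss={mib} MiB")
```

On these rows the java process with pid 32306 comes out at roughly 530 MiB resident, which agrees with the anon-rss:542308kB the kernel reports when it kills that process, while dpkg holds about 61 MiB. On a machine with about 1 GB of RAM and no swap, that leaves little headroom for anything else, which is the case for capping the heap.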
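Finally, what I did to solve my problem, besides verifying my parameters, was to add a swap file. It takes up some disk space, but I think that is worth it to stop your service from disappearing all the time. It should not be abused, though: if you are swapping all the time, you are better off getting a bigger machine (double the RAM), otherwise your system becomes very slow. But if the automatic apt update takes about 15 minutes, swapping a little during that window is better than crashing.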
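The log in the question reports "Total swap = 0kB", so that machine had no swap at all. As a small sketch of how one might check this, and later keep an eye on how much swap is actually in use, the snippet below parses text in the format of /proc/meminfo. The helper names `swap_usage` and `describe` are made up for illustration, and both sample texts contain invented values.

```python
def swap_usage(meminfo_text):
    """Return (total, used) swap in kB, parsed from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    return total, total - free

def describe(meminfo_text):
    """Summarise swap state in one line."""
    total, used = swap_usage(meminfo_text)
    if total == 0:
        return "no swap configured"
    return f"swap: {used} of {total} kB in use ({100 * used / total:.0f}%)"

# Invented sample mirroring the situation in the log: no swap at all.
NO_SWAP = "MemTotal: 1000000 kB\nSwapTotal: 0 kB\nSwapFree: 0 kB\n"
# Invented sample for a machine with a swap file of about 1 GiB, lightly used.
WITH_SWAP = "MemTotal: 1000000 kB\nSwapTotal: 1048572 kB\nSwapFree: 996148 kB\n"

print(describe(NO_SWAP))
print(describe(WITH_SWAP))
```

On a live Linux system you would pass in the contents of /proc/meminfo instead of the sample strings. If the used share stays high outside the upgrade window, that is the "swapping all the time" case where more RAM is the better fix.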
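Another solution is to restart your service once in a while. That way, memory that has not been collected yet is forcibly released. Of course, any user trying to connect while you restart the service will get an error (500 or 503).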
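Note: you may be getting more hits around the time the automatic APT update happens. If so, you could consider changing the time at which the APT process runs. That is annoying, though, because it is defined in cron as "daily", and all daily tasks run one after another at a given time. You would have to move all of those jobs to another time of day, or change how the APT update is set up. (On the asker's system the log shows systemd starting the job, so the schedule there is presumably set by a systemd timer rather than cron.)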