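My Java 11 application is crashing in a way that, as far as I can tell, should be impossible with my current setup.
The application in question runs on Amazon Linux 2 with Java 11. The server is a cloud EC2 instance with 4 GB of RAM. The server has no swap space.
The server is dedicated purely to this application. Nothing else should be running on it apart from the application itself, things the application needs (e.g. nginx), and things that monitor the application.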
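In addition, the application is started with -Xmx and -Xms set to the same value, 2136M, so the JVM should not request memory from the operating system except at startup.
The application typically uses about 250 MB of memory under standard load and about 400-500 MB under "unusually high" load. This includes the overhead of the servlet container and so on. Just over 2 GB is allocated to provide a small buffer in the event of a DDoS attack.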
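The application crashes after running for a while, usually at least 24 hours.
As far as I know, this crash should not be possible: because -Xmx and -Xms are set to the same value, the JVM should not ask the operating system for more memory after startup.
Here are some excerpts from err_pid__#.log: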
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 65536 bytes for committing reserved memory.
# Possible reasons:
# The system is out of physical RAM or swap space
# The process is running with CompressedOops enabled, and the Java Heap may be blocking the growth of the native heap
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# JVM is running with Zero Based Compressed Oops mode in which the Java heap is
# placed in the first 32GB address space. The Java Heap base address is the
# maximum limit for the native heap growth. Please use -XX:HeapBaseMinAddress
# to set the Java Heap base and to place the Java Heap above 32GB virtual address.
# This output file may be truncated or incomplete.
#
# Out of Memory Error (os_linux.cpp:2709), pid=2100, tid=2113
#
# JRE version: OpenJDK Runtime Environment (11.0+28) (build 11+28)
# Java VM: OpenJDK 64-Bit Server VM (11+28, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
--------------- S U M M A R Y ------------
Command Line: -Xmx2136M -Xms2136M -javaagent:/opt/jetty/newrelic/newrelic.jar -Djetty.home=/opt/jetty -Djetty.base=/opt/jetty-base -Djava.io.tmpdir=/tmp /opt/jetty/start.jar jetty.state=/opt/jetty-base/jetty.state jetty-started.xml start-log-file=/{REDACTED}.log
Host: Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz, 2 cores, 3G, Amazon Linux release 2 (Karoo)
Time: Tue Apr 9 16:22:38 2019 UTC elapsed time: 132340 seconds (1d 12h 45m 40s)
--------------- T H R E A D ---------------
Current thread (0x00007f6f4884c800): JavaThread "C2 CompilerThread0" daemon [_thread_in_vm, id=2113, stack(0x00007f6f01330000,0x00007f6f01431000)]
Current CompileTask:
C2:132341001 41817 4 com.newrelic.agent.deps.org.apache.http.impl.client.HttpClientBuilder::build (1754 bytes)
--------------- S Y S T E M ---------------
OS:Amazon Linux release 2 (Karoo)
uname:Linux 4.14.104-95.84.amzn2.x86_64 #1 SMP Sat Mar 2 00:40:20 UTC 2019 x86_64
libc:glibc 2.26 NPTL 2.26
rlimit: STACK 8192k, CORE 0k, NPROC 4096, NOFILE 4096, AS infinity, DATA infinity, FSIZE infinity
load average:0.00 0.03 0.00
/proc/meminfo:
MemTotal: 3978224 kB
MemFree: 103460 kB
MemAvailable: 0 kB
Buffers: 0 kB
Cached: 3744 kB
SwapCached: 0 kB
Active: 3795996 kB
Inactive: 2028 kB
Active(anon): 3794792 kB
Inactive(anon): 224 kB
Active(file): 1204 kB
Inactive(file): 1804 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 36 kB
Writeback: 0 kB
AnonPages: 3794564 kB
Mapped: 2504 kB
Shmem: 452 kB
Slab: 32256 kB
SReclaimable: 13572 kB
SUnreclaim: 18684 kB
KernelStack: 3104 kB
PageTables: 12828 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1989112 kB
Committed_AS: 2910992 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 126952 kB
DirectMap2M: 4005888 kB
DirectMap1G: 0 kB
/proc/sys/kernel/threads-max (system-wide limit on the number of threads):
30799
/proc/sys/vm/max_map_count (maximum number of memory map areas a process may have):
65530
/proc/sys/kernel/pid_max (system-wide limit on number of process identifiers):
32768
container (cgroup) information:
container_type: cgroupv1
cpu_cpuset_cpus: 0-1
cpu_memory_nodes: 0
active_processor_count: 2
cpu_quota: -1
cpu_period: 100000
cpu_shares: -1
memory_limit_in_bytes: -1
memory_and_swap_limit_in_bytes: -1
memory_soft_limit_in_bytes: -1
memory_usage_in_bytes: 3889819648
memory_max_usage_in_bytes: 0
CPU:total 2 (initial active 2) (2 cores per cpu, 2 threads per core) family 6 model 85 stepping 4, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma
CPU Model and flags from /proc/cpuinfo:
model name : Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke
Memory: 4k page, physical 3978224k(103460k free), swap 0k(0k free)
vm_info: OpenJDK 64-Bit Server VM (11+28) for linux-amd64 JRE (11+28), built on Aug 22 2018 18:55:06 by "mach5one" with gcc 7.3.0
END.
Answer 1
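After a lot of analysis of what was happening, the key here is that -Xmx and -Xms are set to the same value, which means the JVM will not allocate any more heap. So the failure must have been caused by memory allocated outside the standard Java heap.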
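It appears that one of the Java libraries (Conscrypt) uses native methods, and one of those native methods has a memory leak.
Because the leak happens in a native method, it is not constrained by the -Xmx and -Xms settings.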
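One way to notice this kind of leak is to compare the resident set size of the process with the heap the JVM has committed. The following is a minimal sketch, assuming Linux and a readable /proc/self/status; the class and method names are made up for illustration and are not part of any library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

final class RssVersusHeap {

    /** Returns the VmRSS value in kB from the lines of a /proc/<pid>/status file, or -1 if absent. */
    static long parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // Line format: "VmRSS:     123456 kB"
                String[] parts = line.trim().split("\\s+");
                return Long.parseLong(parts[1]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // Linux only: /proc/self/status describes the current process.
        long rssKb = parseVmRssKb(Files.readAllLines(Paths.get("/proc/self/status")));
        // Heap currently committed by the JVM (fixed when -Xms equals -Xmx).
        long heapKb = Runtime.getRuntime().totalMemory() / 1024;
        // Rough lower bound on resident memory outside the heap; it can be
        // negative early on, before the heap pages have been touched.
        System.out.printf("RSS: %d kB, committed heap: %d kB, difference: %d kB%n",
                rssKb, heapKb, rssKb - heapKb);
    }
}
```

Logged periodically, a difference that keeps growing while the committed heap stays fixed points at memory outside the heap, such as a native leak.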
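Answer 2
An mmap failure means that the (Linux) kernel could not allocate memory, usually because the system is out of memory. It does not mean that the (JVM) process exceeded its own memory limit.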
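"things the application needs (e.g. nginx), and things that monitor the application"
How much memory do these use? You can measure that precisely by isolating them in their own cgroups (containers, systemd slices).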
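The JVM error gives you a list of possible solutions; work through them.
"Reduce memory load" means looking at everything on the host to assess memory consumption. /proc/meminfo does show that most of the nearly 4 GB is active anonymous pages. There is definitely not 2 GB free.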
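To put numbers on that, the figures from the crash log can be turned into a small memory budget. This is only a sketch: the class and method names are hypothetical, and it assumes a single JVM on the host whose entire 2136 MB heap is resident, which is the most favourable case for the heap.

```java
final class MemInfoBudget {
    // Values copied from /proc/meminfo in the crash log, in kB.
    static final long MEM_TOTAL_KB = 3_978_224L;
    static final long ANON_PAGES_KB = 3_794_564L;
    // -Xmx2136M / -Xms2136M converted to kB.
    static final long HEAP_KB = 2136L * 1024L;

    /** Memory left on the host for everything except the Java heap. */
    static long leftForEverythingElseKb() {
        return MEM_TOTAL_KB - HEAP_KB;
    }

    /** Anonymous memory that cannot be Java heap, even if the whole heap is resident. */
    static long anonOutsideHeapKb() {
        return ANON_PAGES_KB - HEAP_KB;
    }

    public static void main(String[] args) {
        System.out.println("Heap size (kB):            " + HEAP_KB);                   // 2187264
        System.out.println("Left for the rest (kB):    " + leftForEverythingElseKb()); // 1790960
        System.out.println("Anon pages not heap (kB):  " + anonOutsideHeapKb());       // 1607300
    }
}
```

Under those assumptions, roughly 1.5 GiB of anonymous memory belongs to something other than the Java heap: JVM native memory, nginx, the monitoring agents, or a combination of them.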
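The Java heap options are listed early because they tend to be the largest and the most commonly tuned. You have already looked at them, but consider reducing the heap to make room for other things on this host.
The rest of the Java tuning options are somewhat esoteric. Still, it is worth remembering that not all Java memory use is heap.
If you do not want to spend time on optimization right now, then perhaps just throw more memory at it.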
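As an illustration of memory that is not heap, the standard java.lang.management beans report what the JVM itself tracks outside the heap. This is a minimal sketch with a made-up class name, not a complete accounting.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

final class NonHeapReport {

    /** Bytes committed for JVM-managed non-heap memory (metaspace, code cache, and similar pools). */
    static long nonHeapCommittedBytes() {
        return ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage().getCommitted();
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println("heap:     " + memory.getHeapMemoryUsage());
        System.out.println("non-heap: " + memory.getNonHeapMemoryUsage());

        // Per-pool breakdown of the non-heap areas.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (pool.getType() == MemoryType.NON_HEAP && usage != null) {
                System.out.println(pool.getName() + ": " + usage.getCommitted() + " bytes committed");
            }
        }

        // Direct and mapped NIO buffers live outside the heap as well.
        for (BufferPoolMXBean buffers : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.println(buffers.getName() + " buffers: " + buffers.getMemoryUsed() + " bytes");
        }
    }
}
```

These beans do not cover thread stacks or memory allocated directly by native libraries, so a native leak will not show up here. HotSpot's Native Memory Tracking (started with -XX:NativeMemoryTracking=summary and queried with jcmd <pid> VM.native_memory summary) gives a fuller view of the JVM's own native allocations.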