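I know this looks like a duplicate of many other posts here, but I beg to differ, because this does not (to my not-so-experienced eyes) appear to be a caching issue.
We have a server with 256G of RAM whose usage is constantly at 95% to 99%, with no obvious process using it. Meanwhile, when anything memory-intensive is run, swap starts filling up and the server quickly becomes unresponsive. Even after a clean reboot, the memory is full immediately. When booting into recovery mode, memory usage is normal.
To shed some light on the problem, here is the output of cat /proc/meminfo: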
MemTotal: 263702068 kB
MemFree: 655500 kB
MemAvailable: 0 kB
Buffers: 3248 kB
Cached: 70244 kB
SwapCached: 3108 kB
Active: 22584 kB
Inactive: 49740 kB
Active(anon): 8596 kB
Inactive(anon): 19848 kB
Active(file): 13988 kB
Inactive(file): 29892 kB
Unevictable: 134096 kB
Mlocked: 126596 kB
SwapTotal: 2097148 kB
SwapFree: 488472 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 131292 kB
Mapped: 61476 kB
Shmem: 7532 kB
KReclaimable: 122832 kB
Slab: 1580428 kB
SReclaimable: 122832 kB
SUnreclaim: 1457596 kB
KernelStack: 21600 kB
PageTables: 11700 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 6021908 kB
Committed_AS: 2924960 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 502876 kB
VmallocChunk: 0 kB
Percpu: 142912 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
HugePages_Total: 244
HugePages_Free: 244
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
Hugetlb: 255852544 kB
DirectMap4k: 967040 kB
DirectMap2M: 8079360 kB
DirectMap1G: 261095424 kB
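As a rough cross-check of the numbers above, here is a minimal Python sketch that multiplies HugePages_Total by Hugepagesize and compares the result with Hugetlb and MemTotal. The helper names (parse_meminfo, hugetlb_summary) and the SAMPLE string are made up for illustration; the values are copied from the output above. It only does arithmetic on the reported fields and is not meant as a diagnosis.

```python
# Values copied from the /proc/meminfo output above (kB unless it is a page count).
SAMPLE = """\
MemTotal: 263702068 kB
MemFree: 655500 kB
HugePages_Total: 244
HugePages_Free: 244
Hugepagesize: 1048576 kB
Hugetlb: 255852544 kB
"""


def parse_meminfo(text):
    """Map each field name to its leading integer value (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def hugetlb_summary(fields):
    """Return (reserved kB, free kB, reserved share of MemTotal in percent)."""
    reserved_kb = fields["HugePages_Total"] * fields["Hugepagesize"]
    free_kb = fields["HugePages_Free"] * fields["Hugepagesize"]
    share = 100.0 * reserved_kb / fields["MemTotal"]
    return reserved_kb, free_kb, share


if __name__ == "__main__":
    fields = parse_meminfo(SAMPLE)
    reserved_kb, free_kb, share = hugetlb_summary(fields)
    print(f"huge pages reserved: {reserved_kb} kB ({share:.1f}% of MemTotal)")
    print(f"huge pages unused:   {free_kb} kB")
    print(f"matches Hugetlb:     {reserved_kb == fields['Hugetlb']}")
```

To run it against a live system instead of the pasted sample, the text of /proc/meminfo can be read with open("/proc/meminfo").read() and passed to parse_meminfo.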
Here is the output of free -h:
               total        used        free      shared  buff/cache   available
Mem:           251Gi       250Gi       650Mi       7,0Mi       177Mi       614Mi
Swap:          2,0Gi       1,9Gi        82Mi
Here is the output of slabtop -s c:
Active / Total Objects (% used) : 2816030 / 2856667 (98,6%)
Active / Total Slabs (% used) : 62599 / 62599 (100,0%)
Active / Total Caches (% used) : 123 / 183 (67,2%)
Active / Total Size (% used) : 618859,66K / 630011,16K (98,2%)
Minimum / Average / Maximum Object : 0,01K / 0,22K / 12,00K
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
12120 12120 100% 8,00K 3030 4 96960K kmalloc-8k
513440 513440 100% 0,12K 16045 32 64180K scsi_sense_cache
112192 104705 93% 0,50K 3506 32 56096K kmalloc-512
65484 64946 99% 0,62K 1284 51 41088K inode_cache
37800 37705 99% 0,57K 675 56 21600K radix_tree_node
1998 1985 99% 8,12K 666 3 21312K task_struct
4848 4812 99% 4,00K 606 8 19392K kmalloc-4k
150272 150272 100% 0,12K 4696 32 18784K kernfs_node_cache
96306 84047 87% 0,19K 2293 42 18344K dentry
7872 7466 94% 2,00K 492 16 15744K kmalloc-2k
22274 21817 97% 0,70K 486 46 15552K proc_inode_cache
15232 15029 98% 1,00K 476 32 15232K kmalloc-1k
152838 152838 100% 0,09K 3639 42 14556K kmalloc-96
68308 68160 99% 0,20K 1752 39 14016K vm_area_struct
12558 12248 97% 0,81K 322 39 10304K sock_inode_cache
130704 130704 100% 0,07K 2334 56 9336K Acpi-Operand
2320 2280 98% 4,00K 290 8 9280K biovec-max
3855 3855 100% 2,06K 257 15 8224K sighand_cache
31648 30708 97% 0,25K 989 32 7912K filp
125952 123874 98% 0,06K 1968 64 7872K kmalloc-64
6129 5643 92% 1,15K 227 27 7264K ext4_inode_cache
135150 133752 98% 0,05K 1590 85 6360K ftrace_event_field
12352 12192 98% 0,50K 386 32 6176K skbuff_fclone_cache
5348 5348 100% 1,12K 191 28 6112K signal_cache
92800 92800 100% 0,06K 1450 64 5800K anon_vma_chain
49452 49273 99% 0,10K 1268 39 5072K anon_vma
157184 156203 99% 0,03K 1228 128 4912K kmalloc-32
19424 17992 92% 0,25K 607 32 4856K kmalloc-256
6106 6106 100% 0,74K 142 43 4544K shmem_inode_cache
4260 4260 100% 1,00K 134 32 4288K kmalloc-cg-1k
3780 3780 100% 1,06K 126 30 4032K mm_struct
5566 5566 100% 0,69K 121 46 3872K files_cache
944 944 100% 4,00K 118 8 3776K kmalloc-cg-4k
14656 14656 100% 0,25K 458 32 3664K pool_workqueue
2912 2912 100% 1,19K 112 26 3584K perf_event
896 896 100% 4,00K 112 8 3584K names_cache
3520 3520 100% 1,00K 110 32 3520K biovec-64
1744 1744 100% 2,00K 109 16 3488K kmalloc-cg-2k
1728 1728 100% 2,00K 108 16 3456K biovec-128
3240 3240 100% 1,06K 108 30 3456K UNIX
2808 2626 93% 1,19K 108 26 3456K RAWv6
17388 17209 98% 0,19K 414 42 3312K kmalloc-192
4743 4743 100% 0,62K 93 51 2976K task_group
28743 28229 98% 0,10K 737 39 2948K buffer_head
2944 2688 91% 1,00K 92 32 2944K RAW
10208 10089 98% 0,25K 319 32 2552K skbuff_head_cache
39808 36968 92% 0,06K 622 64 2488K vmap_area
1008 1008 100% 2,19K 72 14 2304K TCP
3840 3840 100% 0,50K 120 32 1920K kmalloc-cg-512
Any help is greatly appreciated.
Edit 1: Including the output of smem -tw:
Area Used Cache Noncache
firmware/hardware 0 0 0
kernel image 0 0 0
kernel dynamic memory 262358340 199108 262159232
userspace memory 322752 173264 149488
free memory 1020956 1020956 0
----------------------------------------------------------
263702048 1393328 262308720
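Edit 2: Because the post is getting too long, the output of top -o VIRT -b -n 1 is included in the link below. (Maybe it offers a clear clue for how to diagnose this problem.)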
Edit 3: Adding the output of ipcs -ma:
------ Message Queues --------
key msqid owner perms used-bytes messages
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
------ Semaphore Arrays --------
key semid owner perms nsems