Linux ate my RAM

I know this looks like a duplicate of many other posts here, but I beg to differ, because (to my not-very-experienced eyes) this does not appear to be a caching issue.

We have a server with 256 GB of RAM whose memory usage sits constantly at 95% to 99%, with no visible process accounting for it. At the same time, as soon as anything memory-hungry is run, swap starts to fill up and the server quickly becomes unresponsive. Memory is full immediately even after a clean reboot. When booting into recovery mode, memory usage is normal.
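
To rule out userspace, one rough sanity check (rough because RSS double-counts shared pages) is to sum the resident set sizes of all processes and compare the total with the "used" column of free; if the two differ by hundreds of gigabytes, the memory is being held by the kernel rather than by any process:

ps -eo rss= | awk '{ sum += $1 } END { printf "total RSS of all processes: %.1f GiB\n", sum / 1024 / 1024 }'
free -g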

To shed some light on the problem, here is the output of cat /proc/meminfo:

MemTotal:       263702068 kB
MemFree:          655500 kB
MemAvailable:          0 kB
Buffers:            3248 kB
Cached:            70244 kB
SwapCached:         3108 kB
Active:            22584 kB
Inactive:          49740 kB
Active(anon):       8596 kB
Inactive(anon):    19848 kB
Active(file):      13988 kB
Inactive(file):    29892 kB
Unevictable:      134096 kB
Mlocked:          126596 kB
SwapTotal:       2097148 kB
SwapFree:         488472 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        131292 kB
Mapped:            61476 kB
Shmem:              7532 kB
KReclaimable:     122832 kB
Slab:            1580428 kB
SReclaimable:     122832 kB
SUnreclaim:      1457596 kB
KernelStack:       21600 kB
PageTables:        11700 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     6021908 kB
Committed_AS:    2924960 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      502876 kB
VmallocChunk:          0 kB
Percpu:           142912 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
HugePages_Total:     244
HugePages_Free:      244
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
Hugetlb:        255852544 kB
DirectMap4k:      967040 kB
DirectMap2M:     8079360 kB
DirectMap1G:    261095424 kB

Here is the output of free -h:

              total        used        free      shared  buff/cache   available
Mem:          251Gi       250Gi       650Mi       7,0Mi       177Mi       614Mi
Swap:         2,0Gi       1,9Gi        82Mi

Here is the output of slabtop -s c:

 Active / Total Objects (% used)    : 2816030 / 2856667 (98,6%)
 Active / Total Slabs (% used)      : 62599 / 62599 (100,0%)
 Active / Total Caches (% used)     : 123 / 183 (67,2%)
 Active / Total Size (% used)       : 618859,66K / 630011,16K (98,2%)
 Minimum / Average / Maximum Object : 0,01K / 0,22K / 12,00K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
 12120  12120 100%    8,00K   3030        4     96960K kmalloc-8k
513440 513440 100%    0,12K  16045       32     64180K scsi_sense_cache
112192 104705  93%    0,50K   3506       32     56096K kmalloc-512
 65484  64946  99%    0,62K   1284       51     41088K inode_cache
 37800  37705  99%    0,57K    675       56     21600K radix_tree_node
  1998   1985  99%    8,12K    666        3     21312K task_struct
  4848   4812  99%    4,00K    606        8     19392K kmalloc-4k
150272 150272 100%    0,12K   4696       32     18784K kernfs_node_cache
 96306  84047  87%    0,19K   2293       42     18344K dentry
  7872   7466  94%    2,00K    492       16     15744K kmalloc-2k
 22274  21817  97%    0,70K    486       46     15552K proc_inode_cache
 15232  15029  98%    1,00K    476       32     15232K kmalloc-1k
152838 152838 100%    0,09K   3639       42     14556K kmalloc-96
 68308  68160  99%    0,20K   1752       39     14016K vm_area_struct
 12558  12248  97%    0,81K    322       39     10304K sock_inode_cache
130704 130704 100%    0,07K   2334       56      9336K Acpi-Operand
  2320   2280  98%    4,00K    290        8      9280K biovec-max
  3855   3855 100%    2,06K    257       15      8224K sighand_cache
 31648  30708  97%    0,25K    989       32      7912K filp
125952 123874  98%    0,06K   1968       64      7872K kmalloc-64
  6129   5643  92%    1,15K    227       27      7264K ext4_inode_cache
135150 133752  98%    0,05K   1590       85      6360K ftrace_event_field
 12352  12192  98%    0,50K    386       32      6176K skbuff_fclone_cache
  5348   5348 100%    1,12K    191       28      6112K signal_cache
 92800  92800 100%    0,06K   1450       64      5800K anon_vma_chain
 49452  49273  99%    0,10K   1268       39      5072K anon_vma
157184 156203  99%    0,03K   1228      128      4912K kmalloc-32
 19424  17992  92%    0,25K    607       32      4856K kmalloc-256
  6106   6106 100%    0,74K    142       43      4544K shmem_inode_cache
  4260   4260 100%    1,00K    134       32      4288K kmalloc-cg-1k
  3780   3780 100%    1,06K    126       30      4032K mm_struct
  5566   5566 100%    0,69K    121       46      3872K files_cache
   944    944 100%    4,00K    118        8      3776K kmalloc-cg-4k
 14656  14656 100%    0,25K    458       32      3664K pool_workqueue
  2912   2912 100%    1,19K    112       26      3584K perf_event
   896    896 100%    4,00K    112        8      3584K names_cache
  3520   3520 100%    1,00K    110       32      3520K biovec-64
  1744   1744 100%    2,00K    109       16      3488K kmalloc-cg-2k
  1728   1728 100%    2,00K    108       16      3456K biovec-128
  3240   3240 100%    1,06K    108       30      3456K UNIX
  2808   2626  93%    1,19K    108       26      3456K RAWv6
 17388  17209  98%    0,19K    414       42      3312K kmalloc-192
  4743   4743 100%    0,62K     93       51      2976K task_group
 28743  28229  98%    0,10K    737       39      2948K buffer_head
  2944   2688  91%    1,00K     92       32      2944K RAW
 10208  10089  98%    0,25K    319       32      2552K skbuff_head_cache
 39808  36968  92%    0,06K    622       64      2488K vmap_area
  1008   1008 100%    2,19K     72       14      2304K TCP
  3840   3840 100%    0,50K    120       32      1920K kmalloc-cg-512

Any help is greatly appreciated.

Edit 1: Including the output of smem -tw:

Area                           Used      Cache   Noncache
firmware/hardware                 0          0          0
kernel image                      0          0          0
kernel dynamic memory     262358340     199108  262159232
userspace memory             322752     173264     149488
free memory                 1020956    1020956          0
----------------------------------------------------------
                          263702048    1393328  262308720

Edit 2: Since the post was getting too long, the output of top -o VIRT -b -n 1 is included at the link below. (Maybe it provides a clear clue for diagnosing this problem.)

Top.txt on Dropbox

Edit 3: Adding the output of ipcs -ma:

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status

------ Semaphore Arrays --------
key        semid      owner      perms      nsems

Answer 1

You have 244 huge pages allocated (HugePages_Total in /proc/meminfo), each 1 GB in size (Hugepagesize), for a total of 244 GB (Hugetlb). These pages are excluded from normal memory allocation. Check /proc/sys/vm/nr_hugepages or the hugepages kernel boot command-line parameter. See the kernel documentation on HugeTLB pages.
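
For example, assuming nothing on the server actually needs the 1 GB pages (HugePages_Free equals HugePages_Total in your meminfo, so they all look unused), you should be able to inspect and release the reservation roughly like this:

# current reservation (default huge page size, here 1 GB)
cat /proc/sys/vm/nr_hugepages
cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

# release the unused pages at runtime
sysctl -w vm.nr_hugepages=0          # or: echo 0 > /proc/sys/vm/nr_hugepages

# if the reservation comes from the boot command line, look for parameters
# such as default_hugepagesz=1G hugepagesz=1G hugepages=244 in /proc/cmdline
# (and in /etc/default/grub), adjust them, regenerate the grub config and reboot
cat /proc/cmdline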
