Our monitoring tool regularly reports that one of our servers running ZFS is low on memory.
# free -g
              total        used        free      shared  buff/cache   available
Mem:            125          84          39           0           2          40
Swap:            49           0          49
After some googling I found that the ZFS cache uses a lot of memory. But it is "only" using 53.5 GiB (unless I am misreading arc_summary):
# arc_summary
...
ARC size (current): 85.2 % 53.5 GiB
Target size (adaptive): 85.8 % 53.9 GiB
Min size (hard limit): 6.2 % 3.9 GiB
Max size (high water): 16:1 62.8 GiB
...
So:
125 GB Memory Total
- 40 GB Available
- 54 GB ZFS Cache
- 6 GB Processes displayed by ps (vsz)
=====
25 GB ????
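The same subtraction as a small Python sketch, so it can be rerun when the numbers change. The values are the rounded figures from the outputs above; the function name `unaccounted` is only illustrative. Note that vsz is virtual size, so the 6 GB for processes is an upper bound and the real gap may be somewhat larger.

```python
# Figures copied from the outputs above, in GB as reported (rounded).
total_gb = 125      # "total" column of free -g
available_gb = 40   # "available" column of free -g
arc_gb = 54         # ARC size from arc_summary (53.5 GiB, rounded up)
process_gb = 6      # sum of vsz over all processes shown by ps


def unaccounted(total, *known):
    """Return what is left of `total` after subtracting every known consumer."""
    return total - sum(known)


missing_gb = unaccounted(total_gb, available_gb, arc_gb, process_gb)
print(f"Unaccounted: {missing_gb} GB")
```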
As requested (cat /proc/meminfo):
MemTotal: 131665836 kB
MemFree: 41276812 kB
MemAvailable: 42553236 kB
Buffers: 0 kB
Cached: 77304 kB
SwapCached: 24236 kB
Active: 60604 kB
Inactive: 83652 kB
Active(anon): 12712 kB
Inactive(anon): 66792 kB
Active(file): 47892 kB
Inactive(file): 16860 kB
Unevictable: 11948 kB
Mlocked: 11948 kB
SwapTotal: 52428796 kB
SwapFree: 52373500 kB
Dirty: 0 kB
Writeback: 464 kB
AnonPages: 55112 kB
Mapped: 56168 kB
Shmem: 2296 kB
KReclaimable: 2084224 kB
Slab: 65863304 kB
SReclaimable: 2084224 kB
SUnreclaim: 63779080 kB
KernelStack: 13856 kB
PageTables: 3964 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 118261712 kB
Committed_AS: 361476 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 2948856 kB
VmallocChunk: 0 kB
Percpu: 31232 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 88585508 kB
DirectMap2M: 43186176 kB
DirectMap1G: 4194304 kB
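To make the dump above easier to read, here is a rough Python sketch that parses /proc/meminfo-style text and prints a few fields as GiB and as a share of used memory (MemTotal minus MemFree). The helper names and the choice of fields are my own assumptions, not a standard tool. With the values above, Slab comes to about 62.8 GiB, almost all of it in SUnreclaim.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def to_gib(kb):
    """Convert kB (as reported by meminfo) to GiB."""
    return kb / (1024 * 1024)


# A few fields copied from the dump above; on a live system the text
# would come from reading /proc/meminfo instead.
SAMPLE = """\
MemTotal: 131665836 kB
MemFree: 41276812 kB
AnonPages: 55112 kB
Cached: 77304 kB
Slab: 65863304 kB
SReclaimable: 2084224 kB
SUnreclaim: 63779080 kB
VmallocUsed: 2948856 kB
"""

info = parse_meminfo(SAMPLE)
used_kb = info["MemTotal"] - info["MemFree"]
for key in ("AnonPages", "Cached", "Slab", "SUnreclaim", "VmallocUsed"):
    share = 100 * info[key] / used_kb
    print(f"{key:12s} {to_gib(info[key]):6.1f} GiB  {share:5.1f}% of used")
```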
# systemd-cgtop -m -n 1
Control Group Tasks %CPU Memory Input/s Output/s
/ 787 - 86.1G - -
system.slice 98 - 2.0G - -
system.slice/cron.service 1 - 1.9G - -
system.slice/systemd-journald.service 1 - 32.5M - -
Everything else is below 30 MB.
# slabtop --once
Active / Total Objects (% used) : 99549552 / 109644976 (90.8%)
Active / Total Slabs (% used) : 3498896 / 3498896 (100.0%)
Active / Total Caches (% used) : 127 / 196 (64.8%)
Active / Total Size (% used) : 61511851.41K / 65204041.81K (94.3%)
Minimum / Average / Maximum Object : 0.01K / 0.59K / 16.00K
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
16127096 14742570 91% 0.50K 503977 32 8063632K kmalloc-512
15970928 14903038 93% 0.38K 380264 42 6084224K dmu_buf_impl_t
15562646 14573457 93% 0.97K 483843 33 15482976K dnode_t
13621377 13469293 98% 0.24K 412769 33 3302152K sa_cache
13543561 13468664 99% 1.08K 649473 29 20783136K zfs_znode_cache
10347792 10347792 100% 0.19K 246376 42 1971008K dentry
5660031 5660011 99% 0.31K 110981 51 1775696K arc_buf_hdr_t_full
5105792 3032351 59% 0.03K 39889 128 159556K kmalloc-32
3003328 2081202 69% 0.06K 46927 64 187708K kmalloc-64
2873949 2097731 72% 0.10K 73691 39 294764K abd_t
2225216 272877 12% 0.25K 69538 32 556304K kmalloc-256
672042 670950 99% 0.09K 16001 42 64004K kmalloc-96
619191 488804 78% 0.08K 12141 51 48564K arc_buf_t
565984 538604 95% 1.00K 17687 32 565984K kmalloc-1k
485063 460395 94% 8.00K 122762 4 3928384K kmalloc-8k
483359 463686 95% 16.00K 244241 2 7815712K zio_buf_comb_16384
418272 215389 51% 1.00K 13071 32 418272K zio_buf_comb_1024
392576 392576 100% 0.06K 6134 64 24536K kmalloc-rcl-64
326080 326080 100% 0.12K 10190 32 40760K scsi_sense_cache
256494 44775 17% 0.19K 6107 42 48856K kmalloc-192
189224 189224 100% 0.07K 3379 56 13516K Acpi-Operand
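The slabtop listing above is sorted by object count, which hides which caches are actually the largest. As a sketch, this Python snippet parses the data rows, ranks the caches by CACHE SIZE and prints a running total. The parser is a hypothetical helper that assumes the eight-column layout shown above; only a handful of rows are copied in as sample input.

```python
def parse_slabtop_rows(text):
    """Return {cache name: cache size in kB} from slabtop data rows."""
    sizes = {}
    for line in text.splitlines():
        cols = line.split()
        # Data rows have 8 columns; column 7 is CACHE SIZE with a trailing "K".
        if len(cols) == 8 and cols[0].isdigit() and cols[6].endswith("K"):
            sizes[cols[7]] = int(cols[6][:-1])
    return sizes


# Rows copied from the slabtop output above.
SAMPLE = """\
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
16127096 14742570 91% 0.50K 503977 32 8063632K kmalloc-512
15970928 14903038 93% 0.38K 380264 42 6084224K dmu_buf_impl_t
15562646 14573457 93% 0.97K 483843 33 15482976K dnode_t
13621377 13469293 98% 0.24K 412769 33 3302152K sa_cache
13543561 13468664 99% 1.08K 649473 29 20783136K zfs_znode_cache
10347792 10347792 100% 0.19K 246376 42 1971008K dentry
483359 463686 95% 16.00K 244241 2 7815712K zio_buf_comb_16384
"""

sizes = parse_slabtop_rows(SAMPLE)
ranked = sorted(sizes.items(), key=lambda item: item[1], reverse=True)

running_kb = 0
for name, kb in ranked:
    running_kb += kb
    print(f"{name:20s} {kb / 1024**2:6.1f} GiB  (cumulative {running_kb / 1024**2:6.1f} GiB)")
```

slabtop can also do the ordering itself with `--sort=c` (sort by cache size), which avoids the parsing step when only the ranking is needed.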
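I need to bring memory usage down. I could shrink the ZFS cache, but it is not the only thing consuming memory, so tracking down the roughly 25 GB of unaccounted memory probably makes more sense.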
How can I find out what is using this memory?