内核杀死有可用内存的进程

内核杀死有可用内存的进程

我们有一台 CentOS5.8 服务器,内存为 64GB,通常使用内存 700MB 左右。当我尝试对tar -czf大约 22GB 的存档进行操作时,页面缓存会上升到大约 22GB,然后内核开始终止进程​​。最终,即使插入 USB 键盘,服务器也会变得无响应。

Fri Aug  5 11:00:01 CDT 2016
 11:00:01 up 1 day,  2:06,  3 users,  load average: 1.31, 1.71, 1.77
             total       used       free     shared    buffers     cached
Mem:         64237      22874      41363          0          0      22407
-/+ buffers/cache:        466      63771
Swap:         5279          0       5279
*****************************************************
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      534G  106G  401G  21% /
/dev/sda3              97M   13M   80M  14% /boot
tmpfs                  32G     0   32G   0% /dev/shm
*****************************************************
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 42355760   1020 22944876    0    0    33    19    7    6  0  0 100  0  0
 0  0      0 42355760   1020 22945204    0    0     0     0 1003 2933  0  0 100  0  0
 0  0      0 42355760   1020 22945220    0    0     0     0 1040 2963  0  0 100  0  0
 0  0      0 42355760   1020 22945220    0    0     0     0 1004 2843  0  0 100  0  0
 0  0      0 42355760   1020 22945220    0    0     0     0 1036 2915  0  0 100  0  0

以及流程

root       993  0.0  0.0 S<s  00:00:00 /sbin/udevd -d
root      3578  0.0  0.0 S<Lsl 00:00:00 iscsiuio
root      3582  0.0  0.0 Ss   00:00:00 iscsid
root      3583  0.0  0.0 S<Ls 00:00:00 iscsid
root      4377  0.0  0.0 S<sl 00:00:00 auditd
root      4379  0.0  0.0 S<sl 00:00:00  \_ /sbin/audispd
root      4401  0.0  0.0 Ss   00:00:00 syslogd -m 0
root      4404  0.0  0.0 Ss   00:00:00 klogd -x
root      4540  0.0  0.0 Ss   00:00:00 irqbalance
rpc       4574  0.0  0.0 Ss   00:00:00 portmap
rpcuser   4655  0.0  0.0 Ss   00:00:00 rpc.statd
root      4683  0.0  0.0 Ss   00:00:00 rpc.idmapd
dbus      4705  0.0  0.0 Ss   00:00:00 dbus-daemon --system
root      4714  0.0  0.0 Ss   00:00:00 /usr/sbin/hcid
root      4720  0.0  0.0 Ss   00:00:00 /usr/sbin/sdpd
root      4786  0.0  0.0 S<   00:00:00 [krfcommd]
root      4823  0.0  0.0 Ssl  00:00:00 pcscd
root      4833  0.0  0.0 Ss   00:00:00 /usr/sbin/acpid
68        4846  0.0  0.0 Ss   00:00:01 hald
root      4847  0.0  0.0 S    00:00:00  \_ hald-runner
68        4855  0.0  0.0 S    00:00:00      \_ hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
68        4862  0.0  0.0 S    00:00:00      \_ hald-addon-keyboard: listening on /dev/input/event0
68        4865  0.0  0.0 S    00:00:00      \_ hald-addon-keyboard: listening on /dev/input/event3
68        4868  0.0  0.0 S    00:00:00      \_ hald-addon-keyboard: listening on /dev/input/event2
root      4876  0.0  0.0 S    00:00:13      \_ hald-addon-storage: polling /dev/sr0
root      4878  0.0  0.0 S    00:00:00      \_ hald-addon-storage: polling /dev/sdb
root      4912  0.0  0.0 Ss   00:00:00 /usr/bin/hidd --server
root      4956  0.0  0.0 Ssl  00:00:00 automount
root      4973  0.0  0.0 Ss   00:00:00 ./hpiod
root      4978  0.0  0.0 S    00:00:00 /usr/bin/python ./hpssd.py
root      4991  0.0  0.0 Ss   00:00:00 /usr/sbin/sshd
root     12541  0.0  0.0 Ss   00:00:01  \_ sshd: root@pts/1
root     12543  0.0  0.0 Ss+  00:00:00  |   \_ -bash
root     13224  0.0  0.0 Ss   00:00:00  \_ sshd: metabk1 [priv]
metabk1  13226  0.0  0.0 S    00:00:00      \_ sshd: metabk1@pts/2
metabk1  13227  0.0  0.0 Ss+  00:00:00          \_ -bash
root      5000  0.0  0.0 Ss   00:00:00 cupsd
root      5023  0.0  0.0 Ss   00:00:06 sendmail: accepting connections
smmsp     5031  0.0  0.0 Ss   00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
root      5041  0.0  0.0 Ss   00:00:00 gpm -m /dev/input/mice -t exps2
root      5050  0.0  0.0 Ss   00:00:00 crond
root     13284  0.0  0.0 S    00:00:00  \_ crond
root     13285  0.0  0.0 Ss   00:00:00      \_ /bin/sh -c /usr/local/bin/memmon.sh >> /var/log/memmon.txt
root     13286  0.0  0.0 S    00:00:00          \_ /bin/bash /usr/local/bin/memmon.sh
root     13292  0.0  0.0 R    00:00:00              \_ ps axfww -o ruser,pid,%cpu,%mem,stat,time,cmd
xfs       5079  0.0  0.0 Ss   00:00:00 xfs -droppriv -daemon
root      5096  0.0  0.0 Ss   00:00:00 /usr/sbin/atd
avahi     5135  0.0  0.0 Ss   00:00:00 avahi-daemon: running [RDVR.local]
avahi     5136  0.0  0.0 Ss   00:00:00  \_ avahi-daemon: chroot helper
root      5197  0.0  0.0 S    00:00:00 /usr/sbin/smartd -q never
root      5202  0.0  0.0 Ss+  00:00:00 /sbin/mingetty tty1
root      5203  0.0  0.0 Ss   00:00:00 login -- root
root     13025  0.0  0.0 Ss   00:00:00  \_ -bash
root     13060  0.7  0.0 S+   00:00:14      \_ top -M
root      5204  0.0  0.0 Ss+  00:00:00 /sbin/mingetty tty3
root      5205  0.0  0.0 Ss+  00:00:00 /sbin/mingetty tty4
root      5206  0.0  0.0 Ss+  00:00:00 /sbin/mingetty tty5
root      5208  0.0  0.0 Ss+  00:00:00 /sbin/mingetty tty6
root      5209  0.0  0.0 Ss   00:00:00 /usr/sbin/gdm-binary -nodaemon
root     13202  0.0  0.0 S    00:00:00  \_ /usr/sbin/gdm-binary -nodaemon
root     13203  0.6  0.0 Ss+  00:00:02      \_ /usr/bin/Xorg :0 -br -audit 0 -auth /var/gdm/:0.Xauth -nolisten tcp vt7
gdm      13216  0.1  0.0 Ss   00:00:00      \_ /usr/libexec/gdmgreeter
root      5251  0.0  0.0 Sl   00:00:00 /usr/libexec/gdm-rh-security-token-helper
root      5257  0.0  0.0 SN   00:00:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
root     13217  0.1  0.0 SN   00:00:00  \_ /usr/bin/python -tt /usr/libexec/yum-updatesd-helper --check --dbus
root      5259  0.0  0.0 SN   00:00:00 /usr/libexec/gam_server

最后是一些日志

DMESG:+Free pages:    42316788kB (42309540kB HighMem)
DMESG:+Active:28543 inactive:5737934 dirty:167940 writeback:0 unstable:0 free:10579197 slab:92371 mapped-file:5498 mapped-anon:21203 pagetables:727
DMESG:+DMA free:3588kB min:68kB low:84kB high:100kB active:12kB inactive:0kB present:16384kB pages_scanned:56981 all_unreclaimable? yes
DMESG:+lowmem_reserve[]: 0 0 880 65520
DMESG:+DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
DMESG:+lowmem_reserve[]: 0 0 880 65520
DMESG:+Normal free:3660kB min:3756kB low:4692kB high:5632kB active:424kB inactive:44kB present:901120kB pages_scanned:621593 all_unreclaimable? yes
DMESG:+lowmem_reserve[]: 0 0 0 517120
DMESG:+HighMem free:42309540kB min:512kB low:69552kB high:138592kB active:113840kB inactive:22951588kB present:66191360kB pages_scanned:0 all_unreclaimable? no
DMESG:+lowmem_reserve[]: 0 0 0 0
DMESG:+DMA: 1*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3588kB
DMESG:+DMA32: empty
DMESG:+Normal: 1*4kB 23*8kB 1*16kB 0*32kB 0*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3660kB
DMESG:+HighMem: 1*4kB 0*8kB 0*16kB 1*32kB 0*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 10329*4096kB = 42309540kB
DMESG:+5745312 pagecache pages
DMESG:+Swap cache: add 0, delete 0, find 0/0, race 0+0
DMESG:+Free swap  = 5406712kB
DMESG:+Total swap = 5406712kB
DMESG:+Free swap:       5406712kB
DMESG:+16777216 pages of RAM
DMESG:+16547840 pages of HIGHMEM
DMESG:+332415 reserved pages
DMESG:+5728571 pages shared
DMESG:+0 pages swap cached
DMESG:+167940 pages dirty
DMESG:+0 pages writeback
DMESG:+5498 pages mapped
DMESG:+92360 pages slab
DMESG:+727 pages pagetables
DMESG:+Out of memory: Killed process 12751, UID 503, (bash).
DMESG:+top invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
DMESG:+ [<c045b9c3>] out_of_memory+0x72/0x1a3
DMESG:+ [<c045cf46>] __alloc_pages+0x24e/0x2cf
DMESG:+ [<c04a8ef3>] proc_info_read+0x0/0x96
DMESG:+ [<c045cfee>] __get_free_pages+0x27/0x33
DMESG:+ [<c04a8f2b>] proc_info_read+0x38/0x96
DMESG:+ [<c04a8ef3>] proc_info_read+0x0/0x96
DMESG:+ [<c04770e5>] vfs_read+0x9f/0x141
DMESG:+ [<c047756b>] sys_read+0x3c/0x63
DMESG:+ [<c0404f4b>] syscall_call+0x7/0xb
DMESG:+ =======================

答案1

您正在运行 32 位内核(而且是过时的内核)。这意味着任何单个进程的地址空间都限制在略小于 4GB 的范围内。此系统最初应该安装的是 64 位操作系统。(同时,请考虑使用较新的发行版。CentOS 5 即将停止支持。)

相关内容