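A FreeBSD server that was recently upgraded from FreeBSD 10.3-STABLE to FreeBSD 10.3-RELEASE-p21 is running out of mbufs. During normal operation, mbuf usage climbs steadily until it reaches the kern.ipc.nmbufs limit. At that point the machine becomes unresponsive on the network (no mbufs are left for network access) and the console shows: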
cxl0: Interface stopped DISTRIBUTING, possible flapping
cxl1: Interface stopped DISTRIBUTING, possible flapping
[zone: mbuf] kern.ipc.nmbufs limit reached
[zone: mbuf] kern.ipc.nmbufs limit reached
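The machine runs pf and acts as a packet filter, router, gateway, and DHCP/DNS server. It has two Chelsio NICs and is the CARP master, paired with a secondary. The secondary has the same hardware and software configuration and does not have this problem.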
Since this causes downtime, we set up Nagios/Check_MK to graph the output of netstat -m and to alert when "mbufs in use" approaches kern.ipc.nmbufs. We see mbuf usage grow at a steady, linear rate until we reboot.
The number of mbuf clusters in use does not change while this happens, and raising the mbuf cluster limit has no effect.
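The monitoring described above could be sketched roughly as follows. This is not the actual Nagios/Check_MK check used on this server; it is a minimal illustration that assumes the "mbufs in use" line has the current/cache/total format shown in the netstat -m output in this post. The function names and the 70%/85% thresholds are hypothetical.

```python
import re

# Nagios plugin exit codes (standard convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Matches e.g. "679270/3080/682350 mbufs in use (current/cache/total)".
MBUF_LINE = re.compile(r"^\s*(\d+)/(\d+)/(\d+) mbufs in use", re.MULTILINE)


def mbufs_in_use(netstat_output):
    """Return the 'current' mbuf count from `netstat -m` text, or None."""
    match = MBUF_LINE.search(netstat_output)
    return int(match.group(1)) if match else None


def check_mbufs(netstat_output, limit, warn=0.70, crit=0.85):
    """Compare mbufs in use against the limit (the kern.ipc.nmbufs value).

    The warn/crit fractions are hypothetical defaults, not recommendations.
    Returns a (exit_code, message) pair in the usual Nagios style.
    """
    current = mbufs_in_use(netstat_output)
    if current is None or limit <= 0:
        return UNKNOWN, "UNKNOWN - could not parse mbuf usage"
    ratio = current / limit
    message = "mbufs in use: %d of %d (%.1f%%)" % (current, limit, ratio * 100)
    if ratio >= crit:
        return CRITICAL, "CRITICAL - " + message
    if ratio >= warn:
        return WARNING, "WARNING - " + message
    return OK, "OK - " + message
```

In a real check, the text would come from running netstat -m and the limit from sysctl -n kern.ipc.nmbufs; the script would then print the message and exit with the returned code.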
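This looks like some kind of kernel bug to me. I am looking for suggestions for further troubleshooting, or help resolving this!
Possibly useful information: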
netstat -m:
679270/3080/682350 mbufs in use (current/cache/total)
10243/1657/11900/985360 mbuf clusters in use (current/cache/total/max)
10243/1648 mbuf+clusters out of packet secondary zone in use (current/cache)
8128/482/8610/124025 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/36748 9k jumbo clusters in use (current/cache/total/max)
128/0/128/20670 16k jumbo clusters in use (current/cache/total/max)
224863K/6012K/230875K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
vmstat -z|grep -E '^ITEM|mbuf':
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
mbuf_packet: 256, 1587540, 10239, 1652,84058893, 0, 0
mbuf: 256, 1587540, 671533, 1206,914478880, 0, 0
mbuf_cluster: 2048, 985360, 11891, 9, 11891, 0, 0
mbuf_jumbo_page: 4096, 124025, 8128, 512,15011847, 0, 0
mbuf_jumbo_9k: 9216, 36748, 0, 0, 0, 0, 0
mbuf_jumbo_16k: 16384, 20670, 128, 0, 128, 0, 0
mbuf_ext_refcnt: 4, 0, 0, 0, 0, 0, 0
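To see which zone is actually approaching its limit, the rows above can be reduced to a used/limit fraction per zone. The sketch below assumes the "name: SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP" row layout shown in this output; the function names are hypothetical. Applied to the figures above, it puts the mbuf zone at about 42% of its limit and mbuf_cluster at about 1.2%, consistent with plain mbufs leaking while clusters stay flat.

```python
def parse_zone_line(line):
    """Parse one `vmstat -z` row into a dict; return None for other lines."""
    name, sep, rest = line.partition(":")
    if not sep:
        return None  # header line or anything without a "name:" prefix
    fields = [f.strip() for f in rest.split(",")]
    if len(fields) < 6 or not all(f.isdigit() for f in fields[:6]):
        return None
    size, limit, used, free, req, fail = (int(f) for f in fields[:6])
    return {
        "name": name.strip(),
        "size": size,
        "limit": limit,
        "used": used,
        "free": free,
        "req": req,
        "fail": fail,
    }


def zone_usage(output):
    """Map zone name -> USED/LIMIT, skipping zones whose LIMIT is 0."""
    usage = {}
    for line in output.splitlines():
        zone = parse_zone_line(line)
        if zone and zone["limit"] > 0:
            usage[zone["name"]] = zone["used"] / zone["limit"]
    return usage
```

Zones with a LIMIT of 0 (such as mbuf_ext_refcnt here) are skipped because no fraction can be computed for them.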
vmstat -m:
Type InUse MemUse HighUse Requests Size(s)
NFSD lckfile 1 1K - 1 256
filedesc 103 383K - 1134731 16,32,128,2048,4096,8192,16384,65536
sigio 1 1K - 1 64
filecaps 0 0K - 973 64
kdtrace 292 59K - 1099386 64,256
kenv 121 13K - 125 16,32,64,128,8192
kqueue 14 22K - 5374 256,2048,8192
proc-args 54 5K - 578448 16,32,64,128,256
hhook 2 1K - 2 256
ithread 146 24K - 146 32,128,256
KTRACE 100 13K - 100 128
NFS fh 1 1K - 584 32
linker 207 1052K - 234 16,32,64,128,256,512,1024,2048,4096,8192,16384,65536
lockf 29 3K - 20042 64,128
loginclass 2 1K - 1192 64
devbuf 17205 36362K - 17523 16,32,64,128,256,512,1024,2048,4096,8192,65536
temp 149 51K - 1280113 16,32,64,128,256,512,1024,2048,4096,8192,16384,65536
ip6opt 5 2K - 6 256
ip6ndp 27 2K - 27 64,128
module 230 29K - 230 128
mtx_pool 2 16K - 2 8192
osd 3 1K - 5 16,32,64
pmchooks 1 1K - 1 128
pgrp 30 4K - 2222 128
session 29 4K - 2187 128
proc 2 32K - 2 16384
subproc 211 368K - 1099014 512,4096
cred 204 32K - 6025704 64,256
plimit 19 5K - 3985 256
uidinfo 9 5K - 11892 128,4096
NFSD session 1 1K - 1 1024
sysctl 0 0K - 63851 16,32,64
sysctloid 7196 365K - 7369 16,32,64,128
sysctltmp 0 0K - 17834 16,32,64,128
tidhash 1 32K - 1 32768
callout 5 2184K - 5
umtx 522 66K - 522 128
p1003.1b 1 1K - 1 16
SWAP 2 549K - 2 64
bus 802 86K - 6536 16,32,64,128,256,1024
bus-sc 57 1671K - 2431 16,32,64,128,256,512,1024,2048,4096,8192,16384,65536
newnfsmnt 1 1K - 1 1024
devstat 8 17K - 8 32,4096
eventhandler 116 10K - 116 64,128
kobj 124 496K - 296 4096
acpiintr 1 1K - 1 64
Per-cpu 1 1K - 1 32
acpica 14355 1420K - 216546 16,32,64,128,256,512,1024,2048,4096
pci_link 16 2K - 16 64,128
pfs_nodes 21 6K - 21 256
rman 316 37K - 716 16,32,128
sbuf 1 1K - 41375 16,32,64,128,256,512,1024,2048,4096,8192,16384
sglist 8 8K - 8 1024
GEOM 88 15K - 1871 16,32,64,128,256,512,1024,2048,8192,16384
acpipwr 5 1K - 5 64
taskqueue 43 7K - 43 16,32,256
Unitno 22 2K - 1208250 32,64
vmem 3 144K - 6 1024,4096,8192
ioctlops 0 0K - 185700 256,512,1024,2048,4096
select 89 12K - 89 128
iov 0 0K - 19808992 16,64,128,256,512,1024
msg 4 30K - 4 2048,4096,8192,16384
sem 4 106K - 4 2048,4096
shm 1 32K - 1 32768
tty 20 20K - 499 1024
pts 1 1K - 480 256
accf 2 1K - 2 64
mbuf_tag 0 0K - 291472282 32,64,128
shmfd 1 8K - 1 8192
soname 32 4K - 1210442 16,32,128
pcb 36 663K - 76872 16,32,64,128,1024,2048,8192
CAM CCB 0 0K - 182128 2048
acl 0 0K - 2 4096
vfscache 1 2048K - 1
cl_savebuf 0 0K - 480 64
vfs_hash 1 1024K - 1
vnodes 1 1K - 1 256
entropy 1026 65K - 49107 32,64,4096
mount 64 3K - 140 16,32,64,128,256
vnodemarker 0 0K - 4212 512
BPF 112 20504K - 131 16,64,128,512,4096
CAM path 11 1K - 63 32
ifnet 29 57K - 30 128,256,2048
ifaddr 315 105K - 315 32,64,128,256,512,2048,4096
ether_multi 232 13K - 282 16,32,64
clone 10 2K - 10 128
arpcom 23 1K - 23 16
gif 4 1K - 4 32,256
lltable 155 53K - 551 256,512
UART 6 5K - 6 16,1024
vlan 56 5K - 74 64,128
acpitask 1 16K - 1 16384
acpisem 110 14K - 110 128
raid_data 0 0K - 108 32,128,256
routetbl 516 136K - 101735 32,64,128,256,512
igmp 28 7K - 28 256
CARP 76 30K - 83 16,32,64,128,256,512,1024
ipid 2 24K - 2 8192,16384
in_mfilter 112 112K - 112 1024
in_multi 43 11K - 43 256
ip_moptions 224 35K - 224 64,256
CAM periph 7 2K - 19 16,32,64,128,256
acpidev 128 8K - 128 64
CAM queue 15 5K - 39 16,32,512
encap_export_host 4 4K - 4 1024
sctp_a_it 0 0K - 36 16
sctp_vrf 1 1K - 1 64
sctp_ifa 115 15K - 204 128
sctp_ifn 21 3K - 23 128
sctp_iter 0 0K - 36 256
hostcache 1 32K - 1 32768
syncache 1 64K - 1 65536
in6_mfilter 1 1K - 1 1024
in6_multi 15 2K - 15 32,256
ip6_moptions 2 1K - 2 32,256
CAM dev queue 6 1K - 6 64
kbdmux 6 22K - 6 16,512,1024,2048,16384
mld 26 4K - 26 128
LED 20 2K - 20 16,128
inpcbpolicy 365 12K - 119277 32
secasvar 7 2K - 214 256
sahead 10 3K - 10 256
ipsecpolicy 748 187K - 241562 256
ipsecrequest 18 3K - 72 128
ipsec-misc 56 2K - 1712 16,32,64
ipsec-saq 0 0K - 24 128
ipsec-reg 3 1K - 3 32
pfsync 2 2K - 893 32,256,1024
pf_temp 0 0K - 78 128
pf_hash 3 2880K - 3
pf_ifnet 36 11K - 9510 256,2048
pf_tag 7 1K - 7 128
pf_altq 5 2K - 125 256
pf_rule 964 904K - 17500 128,1024
pf_osfp 1130 115K - 28250 64,128
pf_table 49 98K - 948 2048
crypto 37 11K - 1072 64,128,256,512,1024
xform 7 1K - 1530156 16,32,64,128,256
rpc 12 20K - 304 64,128,512,1024,8192
audit_evclass 187 6K - 231 32
ufs_dirhash 93 18K - 93 16,32,64,128,256,512
ufs_quota 1 1024K - 1
ufs_mount 3 13K - 3 512,4096,8192
vm_pgdata 2 513K - 2 128
UMAHash 5 6K - 10 512,1024,2048
CAM SIM 6 2K - 6 256
CAM XPT 30 3K - 1850 16,32,64,128,256,512,1024,2048,65536
CAM DEV 9 18K - 16 2048
fpukern_ctx 3 6K - 3 2048
memdesc 1 4K - 1 4096
USB 23 33K - 24 16,128,256,512,1024,2048,4096
DEVFS3 136 34K - 2027 256
DEVFS1 108 54K - 594 512
apmdev 1 1K - 1 128
madt_table 0 0K - 1 4096
DEVFS_RULE 55 26K - 55 64,512
DEVFS 12 1K - 13 16,128
DEVFSP 22 2K - 167 64
io_apic 1 2K - 1 2048
isadev 8 1K - 8 128
MCA 15 2K - 15 32,128
msi 30 4K - 30 128
nexusdev 5 1K - 5 16
USBdev 21 8K - 21 32,64,128,256,512,1024,4096
NFSD V4client 1 1K - 1 256
cdev 5 2K - 5 256
cxgbe 41 956K - 44 128,256,512,1024,2048,4096,8192,16384
ipmi 0 0K - 20155 128,2048
htcp data 127 4K - 13675 32
aesni_data 3 3K - 3 1024
solaris 142 12302K - 3189 16,32,64,128,512,1024,8192
kstat_data 6 1K - 6 64
TCP states:
Answer 1
This definitely looks like a kernel bug. Could you try upgrading to FreeBSD 11.1?