How do I fix mbuf exhaustion on FreeBSD 10?

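A FreeBSD server that was recently upgraded from FreeBSD 10.3-STABLE to FreeBSD 10.3-RELEASE-p21 is running out of mbufs. During normal operation, mbuf usage climbs steadily until it reaches the kern.ipc.nmbufs limit. At that point the machine becomes unresponsive on the network (no mbufs are left for network access) and the console shows: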

cxl0: Interface stopped DISTRIBUTING, possible flapping
cxl1: Interface stopped DISTRIBUTING, possible flapping
[zone: mbuf] kern.ipc.nmbufs limit reached
[zone: mbuf] kern.ipc.nmbufs limit reached

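The machine runs pf and acts as a packet filter, router, gateway, and DHCP/DNS server. It has two Chelsio NICs and is the CARP master, paired with a secondary. The secondary has identical hardware and software configuration and does not show this problem.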

Because this causes downtime, we set up Nagios/Check_MK to graph the output of netstat -m and to alert when "mbufs in use" approaches kern.ipc.nmbufs. We see steady, linear growth in mbuf usage until we reboot:

[Graph: mbufs in use over time, captioned "stairway to heaven... where servers go when they die"]
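As an illustration of the kind of check described above, here is a minimal sketch that extracts the "mbufs in use" counters from netstat -m output and maps them to a Nagios-style status. The function names and the 80%/90% thresholds are assumptions made for this example, not the poster's actual plugin. The limit value is taken from the LIMIT column of the mbuf zone in the vmstat -z output in this question, which is assumed here to match kern.ipc.nmbufs.

```python
import re

# First line of the `netstat -m` output quoted in this question.
SAMPLE = "679270/3080/682350 mbufs in use (current/cache/total)"

# LIMIT column of the mbuf zone in the quoted `vmstat -z` output
# (assumed to equal kern.ipc.nmbufs on this machine).
NMBUFS_LIMIT = 1587540


def parse_mbufs_in_use(netstat_output):
    """Return (current, cache, total) from the 'mbufs in use' line."""
    match = re.search(r"^(\d+)/(\d+)/(\d+) mbufs in use",
                      netstat_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'mbufs in use' line found")
    return tuple(int(group) for group in match.groups())


def nagios_status(current, limit, warn=0.80, crit=0.90):
    """Return (exit_code, ratio): 0 = OK, 1 = WARNING, 2 = CRITICAL.

    The warn/crit thresholds are hypothetical defaults for this sketch.
    """
    ratio = current / limit
    if ratio >= crit:
        return 2, ratio
    if ratio >= warn:
        return 1, ratio
    return 0, ratio


current, cache, total = parse_mbufs_in_use(SAMPLE)
code, ratio = nagios_status(current, NMBUFS_LIMIT)
print(f"mbufs in use: {current}/{NMBUFS_LIMIT} ({ratio:.1%}), status {code}")
```

On a live system the text passed to parse_mbufs_in_use would come from running netstat -m (for example through subprocess.run) rather than from the embedded sample.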

The number of mbuf clusters in use does not change while this happens, and raising the mbuf cluster limit has no effect:

[Graph: mbuf clusters in use]

To me this looks like some kind of kernel bug. I am looking for advice on further troubleshooting, or help resolving the issue!


Useful (maybe) information:

netstat -m

679270/3080/682350 mbufs in use (current/cache/total)
10243/1657/11900/985360 mbuf clusters in use (current/cache/total/max)
10243/1648 mbuf+clusters out of packet secondary zone in use (current/cache)
8128/482/8610/124025 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/36748 9k jumbo clusters in use (current/cache/total/max)
128/0/128/20670 16k jumbo clusters in use (current/cache/total/max)
224863K/6012K/230875K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile

vmstat -z|grep -E '^ITEM|mbuf'

ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
mbuf_packet:            256, 1587540,   10239,    1652,84058893,   0,   0
mbuf:                   256, 1587540,  671533,    1206,914478880,   0,   0
mbuf_cluster:          2048, 985360,   11891,       9,   11891,   0,   0
mbuf_jumbo_page:       4096, 124025,    8128,     512,15011847,   0,   0
mbuf_jumbo_9k:         9216,  36748,       0,       0,       0,   0,   0
mbuf_jumbo_16k:       16384,  20670,     128,       0,     128,   0,   0
mbuf_ext_refcnt:          4,      0,       0,       0,       0,   0,   0
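To make the pattern in the table above easier to see, the following sketch parses those vmstat -z rows and reports USED as a fraction of LIMIT for each zone. The helper names are hypothetical, and treating a LIMIT of 0 as "no limit" is an assumption of this example. With the quoted figures, the plain 256-byte mbuf zone is the one approaching its limit, while the cluster zones are nearly idle, which matches the observation that raising the cluster limit does not help.

```python
# Rows copied from the `vmstat -z` output quoted in this question.
SAMPLE = """\
ITEM                   SIZE  LIMIT     USED     FREE      REQ FAIL SLEEP
mbuf_packet:            256, 1587540,   10239,    1652,84058893,   0,   0
mbuf:                   256, 1587540,  671533,    1206,914478880,   0,   0
mbuf_cluster:          2048, 985360,   11891,       9,   11891,   0,   0
mbuf_jumbo_page:       4096, 124025,    8128,     512,15011847,   0,   0
mbuf_jumbo_9k:         9216,  36748,       0,       0,       0,   0,   0
mbuf_jumbo_16k:       16384,  20670,     128,       0,     128,   0,   0
mbuf_ext_refcnt:          4,      0,       0,       0,       0,   0,   0
"""

FIELDS = ("size", "limit", "used", "free", "req", "fail", "sleep")


def parse_vmstat_z(text):
    """Return {zone_name: {field: int}} for every data row in the text."""
    zones = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # header row or blank line: no "name:" prefix
        values = [value.strip() for value in rest.split(",")]
        if len(values) != len(FIELDS) or not all(v.isdigit() for v in values):
            continue  # skip anything that is not a 7-column numeric row
        zones[name.strip()] = dict(zip(FIELDS, map(int, values)))
    return zones


def utilization(zone):
    """USED / LIMIT, or None when LIMIT is 0 (assumed to mean unlimited)."""
    if zone["limit"] == 0:
        return None
    return zone["used"] / zone["limit"]


zones = parse_vmstat_z(SAMPLE)
for name, zone in zones.items():
    share = utilization(zone)
    shown = "n/a" if share is None else f"{share:.1%}"
    print(f"{name:16s} {zone['used']:>8d} / {zone['limit']:>8d}  {shown}")
```

Running the same comparison on samples taken some time apart would show which zone's USED count is actually growing.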

vmstat -m

         Type InUse MemUse HighUse Requests  Size(s)
 NFSD lckfile     1     1K       -        1  256
     filedesc   103   383K       -  1134731  16,32,128,2048,4096,8192,16384,65536
        sigio     1     1K       -        1  64
     filecaps     0     0K       -      973  64
      kdtrace   292    59K       -  1099386  64,256
         kenv   121    13K       -      125  16,32,64,128,8192
       kqueue    14    22K       -     5374  256,2048,8192
    proc-args    54     5K       -   578448  16,32,64,128,256
        hhook     2     1K       -        2  256
      ithread   146    24K       -      146  32,128,256
       KTRACE   100    13K       -      100  128
       NFS fh     1     1K       -      584  32
       linker   207  1052K       -      234  16,32,64,128,256,512,1024,2048,4096,8192,16384,65536
        lockf    29     3K       -    20042  64,128
   loginclass     2     1K       -     1192  64
       devbuf 17205 36362K       -    17523  16,32,64,128,256,512,1024,2048,4096,8192,65536
         temp   149    51K       -  1280113  16,32,64,128,256,512,1024,2048,4096,8192,16384,65536
       ip6opt     5     2K       -        6  256
       ip6ndp    27     2K       -       27  64,128
       module   230    29K       -      230  128
     mtx_pool     2    16K       -        2  8192
          osd     3     1K       -        5  16,32,64
     pmchooks     1     1K       -        1  128
         pgrp    30     4K       -     2222  128
      session    29     4K       -     2187  128
         proc     2    32K       -        2  16384
      subproc   211   368K       -  1099014  512,4096
         cred   204    32K       -  6025704  64,256
       plimit    19     5K       -     3985  256
      uidinfo     9     5K       -    11892  128,4096
 NFSD session     1     1K       -        1  1024
       sysctl     0     0K       -    63851  16,32,64
    sysctloid  7196   365K       -     7369  16,32,64,128
    sysctltmp     0     0K       -    17834  16,32,64,128
      tidhash     1    32K       -        1  32768
      callout     5  2184K       -        5  
         umtx   522    66K       -      522  128
     p1003.1b     1     1K       -        1  16
         SWAP     2   549K       -        2  64
          bus   802    86K       -     6536  16,32,64,128,256,1024
       bus-sc    57  1671K       -     2431  16,32,64,128,256,512,1024,2048,4096,8192,16384,65536
    newnfsmnt     1     1K       -        1  1024
      devstat     8    17K       -        8  32,4096
 eventhandler   116    10K       -      116  64,128
         kobj   124   496K       -      296  4096
     acpiintr     1     1K       -        1  64
      Per-cpu     1     1K       -        1  32
       acpica 14355  1420K       -   216546  16,32,64,128,256,512,1024,2048,4096
     pci_link    16     2K       -       16  64,128
    pfs_nodes    21     6K       -       21  256
         rman   316    37K       -      716  16,32,128
         sbuf     1     1K       -    41375  16,32,64,128,256,512,1024,2048,4096,8192,16384
       sglist     8     8K       -        8  1024
         GEOM    88    15K       -     1871  16,32,64,128,256,512,1024,2048,8192,16384
      acpipwr     5     1K       -        5  64
    taskqueue    43     7K       -       43  16,32,256
       Unitno    22     2K       -  1208250  32,64
         vmem     3   144K       -        6  1024,4096,8192
     ioctlops     0     0K       -   185700  256,512,1024,2048,4096
       select    89    12K       -       89  128
          iov     0     0K       - 19808992  16,64,128,256,512,1024
          msg     4    30K       -        4  2048,4096,8192,16384
          sem     4   106K       -        4  2048,4096
          shm     1    32K       -        1  32768
          tty    20    20K       -      499  1024
          pts     1     1K       -      480  256
         accf     2     1K       -        2  64
     mbuf_tag     0     0K       - 291472282  32,64,128
        shmfd     1     8K       -        1  8192
       soname    32     4K       -  1210442  16,32,128
          pcb    36   663K       -    76872  16,32,64,128,1024,2048,8192
      CAM CCB     0     0K       -   182128  2048
          acl     0     0K       -        2  4096
     vfscache     1  2048K       -        1  
   cl_savebuf     0     0K       -      480  64
     vfs_hash     1  1024K       -        1  
       vnodes     1     1K       -        1  256
      entropy  1026    65K       -    49107  32,64,4096
        mount    64     3K       -      140  16,32,64,128,256
  vnodemarker     0     0K       -     4212  512
          BPF   112 20504K       -      131  16,64,128,512,4096
     CAM path    11     1K       -       63  32
        ifnet    29    57K       -       30  128,256,2048
       ifaddr   315   105K       -      315  32,64,128,256,512,2048,4096
  ether_multi   232    13K       -      282  16,32,64
        clone    10     2K       -       10  128
       arpcom    23     1K       -       23  16
          gif     4     1K       -        4  32,256
      lltable   155    53K       -      551  256,512
         UART     6     5K       -        6  16,1024
         vlan    56     5K       -       74  64,128
     acpitask     1    16K       -        1  16384
      acpisem   110    14K       -      110  128
    raid_data     0     0K       -      108  32,128,256
     routetbl   516   136K       -   101735  32,64,128,256,512
         igmp    28     7K       -       28  256
         CARP    76    30K       -       83  16,32,64,128,256,512,1024
         ipid     2    24K       -        2  8192,16384
   in_mfilter   112   112K       -      112  1024
     in_multi    43    11K       -       43  256
  ip_moptions   224    35K       -      224  64,256
   CAM periph     7     2K       -       19  16,32,64,128,256
      acpidev   128     8K       -      128  64
    CAM queue    15     5K       -       39  16,32,512
encap_export_host     4     4K       -        4  1024
    sctp_a_it     0     0K       -       36  16
     sctp_vrf     1     1K       -        1  64
     sctp_ifa   115    15K       -      204  128
     sctp_ifn    21     3K       -       23  128
    sctp_iter     0     0K       -       36  256
    hostcache     1    32K       -        1  32768
     syncache     1    64K       -        1  65536
  in6_mfilter     1     1K       -        1  1024
    in6_multi    15     2K       -       15  32,256
 ip6_moptions     2     1K       -        2  32,256
CAM dev queue     6     1K       -        6  64
       kbdmux     6    22K       -        6  16,512,1024,2048,16384
          mld    26     4K       -       26  128
          LED    20     2K       -       20  16,128
  inpcbpolicy   365    12K       -   119277  32
     secasvar     7     2K       -      214  256
       sahead    10     3K       -       10  256
  ipsecpolicy   748   187K       -   241562  256
 ipsecrequest    18     3K       -       72  128
   ipsec-misc    56     2K       -     1712  16,32,64
    ipsec-saq     0     0K       -       24  128
    ipsec-reg     3     1K       -        3  32
       pfsync     2     2K       -      893  32,256,1024
      pf_temp     0     0K       -       78  128
      pf_hash     3  2880K       -        3  
     pf_ifnet    36    11K       -     9510  256,2048
       pf_tag     7     1K       -        7  128
      pf_altq     5     2K       -      125  256
      pf_rule   964   904K       -    17500  128,1024
      pf_osfp  1130   115K       -    28250  64,128
     pf_table    49    98K       -      948  2048
       crypto    37    11K       -     1072  64,128,256,512,1024
        xform     7     1K       -  1530156  16,32,64,128,256
          rpc    12    20K       -      304  64,128,512,1024,8192
audit_evclass   187     6K       -      231  32
  ufs_dirhash    93    18K       -       93  16,32,64,128,256,512
    ufs_quota     1  1024K       -        1  
    ufs_mount     3    13K       -        3  512,4096,8192
    vm_pgdata     2   513K       -        2  128
      UMAHash     5     6K       -       10  512,1024,2048
      CAM SIM     6     2K       -        6  256
      CAM XPT    30     3K       -     1850  16,32,64,128,256,512,1024,2048,65536
      CAM DEV     9    18K       -       16  2048
  fpukern_ctx     3     6K       -        3  2048
      memdesc     1     4K       -        1  4096
          USB    23    33K       -       24  16,128,256,512,1024,2048,4096
       DEVFS3   136    34K       -     2027  256
       DEVFS1   108    54K       -      594  512
       apmdev     1     1K       -        1  128
   madt_table     0     0K       -        1  4096
   DEVFS_RULE    55    26K       -       55  64,512
        DEVFS    12     1K       -       13  16,128
       DEVFSP    22     2K       -      167  64
      io_apic     1     2K       -        1  2048
       isadev     8     1K       -        8  128
          MCA    15     2K       -       15  32,128
          msi    30     4K       -       30  128
     nexusdev     5     1K       -        5  16
       USBdev    21     8K       -       21  32,64,128,256,512,1024,4096
NFSD V4client     1     1K       -        1  256
         cdev     5     2K       -        5  256
        cxgbe    41   956K       -       44  128,256,512,1024,2048,4096,8192,16384
         ipmi     0     0K       -    20155  128,2048
    htcp data   127     4K       -    13675  32
   aesni_data     3     3K       -        3  1024
      solaris   142 12302K       -     3189  16,32,64,128,512,1024,8192
   kstat_data     6     1K       -        6  64

TCP states:

[Graph: TCP states]

Answer 1

This definitely looks like a kernel bug. Can you try upgrading to FreeBSD 11.1?