I'm running Debian GNU/Linux 9.5 (stretch) with kernel 4.9.0-7-amd64.
I've found that the culprit of the memory consumption issue I'm facing is the in-app mechanism used to ship logs to a FluentD daemon, so I'm trying to work out the TCP memory usage.
According to the output of /proc/net/sockstat:
sockets: used 779
TCP: inuse 23 orphan 0 tw 145 alloc 177 mem 4451
UDP: inuse 5 mem 2
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0
The mem metric is the number of pages (4 KiB each) used by TCP, so TCP memory usage equals 4451 * 4 = 17804k.
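That pages-to-kilobytes conversion can be sketched as follows; this is an illustrative parser over the sample output above (on a live system you would read /proc/net/sockstat itself, and the page size is assumed to be 4 KiB as on this x86-64 kernel):

```python
import re

# Sample text taken from the /proc/net/sockstat output shown above.
SOCKSTAT_SAMPLE = """\
sockets: used 779
TCP: inuse 23 orphan 0 tw 145 alloc 177 mem 4451
UDP: inuse 5 mem 2
"""

PAGE_SIZE_KB = 4  # assumed 4 KiB pages (x86-64 default)

def tcp_mem_kb(sockstat_text: str) -> int:
    """Return TCP memory usage in KiB from sockstat-style text."""
    match = re.search(r"^TCP:.*\bmem (\d+)", sockstat_text, re.MULTILINE)
    if match is None:
        raise ValueError("no TCP mem field found")
    # The mem field counts pages, so multiply by the page size.
    return int(match.group(1)) * PAGE_SIZE_KB

print(tcp_mem_kb(SOCKSTAT_SAMPLE))  # 4451 * 4 = 17804
```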
And the output of ss -atmp:
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:ssh *:* users:(("sshd",pid=559,fd=3))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d712)
LISTEN 0 4096 127.0.0.1:8125 *:* users:(("netdata",pid=21419,fd=33))
skmem:(r0,rb33554432,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 4096 *:19999 *:* users:(("netdata",pid=21419,fd=4))
skmem:(r0,rb33554432,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 32768 *:3999 *:* users:(("protokube",pid=3504,fd=9))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 32768 127.0.0.1:19365 *:* users:(("kubelet",pid=2607,fd=10))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 32768 127.0.0.1:10248 *:* users:(("kubelet",pid=2607,fd=29))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 32768 127.0.0.1:10249 *:* users:(("kube-proxy",pid=3250,fd=10))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 128 *:sunrpc *:* users:(("rpcbind",pid=232,fd=8))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
ESTAB 0 0 172.18.25.47:ssh 46.198.221.224:35084 users:(("sshd",pid=20049,fd=3),("sshd",pid=20042,fd=3))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
TIME-WAIT 0 0 100.96.18.1:48226 100.96.18.110:3006
ESTAB 0 0 172.18.25.47:62641 172.18.18.165:3999 users:(("protokube",pid=3504,fd=11))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d15390)
ESTAB 0 0 172.18.25.47:3999 172.18.63.198:46453 users:(("protokube",pid=3504,fd=17))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
SYN-SENT 0 1 172.18.25.47:28870 172.18.23.194:4000 users:(("protokube",pid=3504,fd=3))
skmem:(r0,rb12582912,t1280,tb12582912,f2816,w1280,o0,bl0,d0)
TIME-WAIT 0 0 100.96.18.1:34744 100.96.18.108:3008
ESTAB 0 0 172.18.25.47:3999 172.18.18.165:23733 users:(("protokube",pid=3504,fd=8))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
TIME-WAIT 0 0 100.96.18.1:12992 100.96.18.105:3007
TIME-WAIT 0 0 100.96.18.1:48198 100.96.18.110:3006
TIME-WAIT 0 0 100.96.18.1:63502 100.96.18.102:8001
ESTAB 0 0 127.0.0.1:10249 127.0.0.1:53868 users:(("kube-proxy",pid=3250,fd=5))
skmem:(r0,rb12582912,t0,tb12582912,f4096,w0,o0,bl0,d0)
TIME-WAIT 0 0 100.96.18.1:58032 100.96.18.101:3000
TIME-WAIT 0 0 100.96.18.1:17158 100.96.18.104:8000
ESTAB 0 0 172.18.25.47:38474 172.18.18.165:https users:(("kubelet",pid=2607,fd=38))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d112)
TIME-WAIT 0 0 100.96.18.1:17308 100.96.18.104:8000
ESTAB 0 0 127.0.0.1:32888 127.0.0.1:10255 users:(("go.d.plugin",pid=21570,fd=8))
skmem:(r0,rb12582912,t0,tb12582912,f20480,w0,o0,bl0,d3)
TIME-WAIT 0 0 100.96.18.1:57738 100.96.18.101:3000
TIME-WAIT 0 0 100.96.18.1:23650 100.96.18.97:3004
TIME-WAIT 0 0 100.96.18.1:34518 100.96.18.103:3001
ESTAB 0 0 127.0.0.1:53868 127.0.0.1:10249 users:(("go.d.plugin",pid=21570,fd=6))
skmem:(r0,rb12582912,t0,tb12582912,f8192,w0,o0,bl0,d1)
TIME-WAIT 0 0 100.96.18.1:23000 100.96.18.98:3002
ESTAB 0 0 172.18.25.47:38498 172.18.18.165:https users:(("kube-proxy",pid=3250,fd=7))
skmem:(r0,rb12582912,t0,tb12582912,f8192,w0,o0,bl0,d0)
TIME-WAIT 0 0 100.96.18.1:26430 100.96.18.100:3005
TIME-WAIT 0 0 100.96.18.1:34882 100.96.18.103:3001
ESTAB 0 0 172.18.25.47:3999 172.18.44.34:57033 users:(("protokube",pid=3504,fd=14))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
ESTAB 0 0 172.18.25.47:3999 172.18.25.148:60423 users:(("protokube",pid=3504,fd=18))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
ESTAB 0 0 172.18.25.47:61568 35.196.244.138:https users:(("netdata",pid=21419,fd=70))
skmem:(r0,rb12582912,t0,tb262176,f0,w0,o0,bl0,d0)
TIME-WAIT 0 0 100.96.18.1:13154 100.96.18.105:3007
ESTAB 0 0 172.18.25.47:54289 172.18.30.39:3999 users:(("protokube",pid=3504,fd=12))
skmem:(r0,rb12582912,t0,tb12582912,f4096,w0,o0,bl0,d15392)
TIME-WAIT 0 0 100.96.18.1:34718 100.96.18.108:3008
TIME-WAIT 0 0 100.96.18.1:24078 100.96.18.97:3004
LISTEN 0 128 :::ssh :::* users:(("sshd",pid=559,fd=4))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 4096 :::19999 :::* users:(("netdata",pid=21419,fd=5))
skmem:(r0,rb33554432,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 32768 :::4000 :::* users:(("protokube",pid=3504,fd=5))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 32768 :::32003 :::* users:(("kube-proxy",pid=3250,fd=13))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 32768 :::31719 :::* users:(("kube-proxy",pid=3250,fd=12))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 32768 :::10250 :::* users:(("kubelet",pid=2607,fd=24))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d23)
LISTEN 0 32768 :::9100 :::* users:(("node_exporter",pid=11027,fd=3))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 32768 :::31532 :::* users:(("kube-proxy",pid=3250,fd=11))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 32768 :::30892 :::* users:(("kube-proxy",pid=3250,fd=9))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 32768 :::10255 :::* users:(("kubelet",pid=2607,fd=26))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 128 :::sunrpc :::* users:(("rpcbind",pid=232,fd=11))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
LISTEN 0 32768 :::10256 :::* users:(("kube-proxy",pid=3250,fd=8))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d0)
TIME-WAIT 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.47:13492
ESTAB 0 0 ::ffff:172.18.25.47:10250 ::ffff:172.18.25.148:55670 users:(("kubelet",pid=2607,fd=40))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d15400)
TIME-WAIT 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.47:13096
TIME-WAIT 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.47:13384
ESTAB 0 0 ::ffff:172.18.25.47:10250 ::ffff:172.18.44.34:49454 users:(("kubelet",pid=2607,fd=59))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d7698)
TIME-WAIT 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.47:13200
TIME-WAIT 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.47:13502
TIME-WAIT 0 0 ::ffff:172.18.25.47:4000 ::ffff:172.18.63.198:25438
TIME-WAIT 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.47:13586
TIME-WAIT 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.47:13298
ESTAB 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.148:45776 users:(("node_exporter",pid=11027,fd=7))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d15419)
TIME-WAIT 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.47:13292
ESTAB 0 0 ::ffff:127.0.0.1:10255 ::ffff:127.0.0.1:32888 users:(("kubelet",pid=2607,fd=5))
skmem:(r0,rb12582912,t0,tb12582912,f4096,w0,o0,bl0,d0)
TIME-WAIT 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.47:13206
ESTAB 0 0 ::ffff:172.18.25.47:10250 ::ffff:172.18.18.165:33482 users:(("kubelet",pid=2607,fd=32))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d7707)
TIME-WAIT 0 0 ::ffff:172.18.25.47:4000 ::ffff:172.18.30.39:45200
TIME-WAIT 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.47:13594
TIME-WAIT 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.47:13390
TIME-WAIT 0 0 ::ffff:172.18.25.47:9100 ::ffff:172.18.25.47:13090
ESTAB 0 0 ::ffff:172.18.25.47:10250 ::ffff:172.18.25.148:55590 users:(("kubelet",pid=2607,fd=41))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d15418)
ESTAB 0 0 ::ffff:172.18.25.47:10250 ::ffff:172.18.25.148:55536 users:(("kubelet",pid=2607,fd=11))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d15401)
ESTAB 0 0 ::ffff:172.18.25.47:10250 ::ffff:172.18.25.148:55762 users:(("kubelet",pid=2607,fd=43))
skmem:(r0,rb12582912,t0,tb12582912,f0,w0,o0,bl0,d15407)
According to the ss man page:
skmem:(r<rmem_alloc>,rb<rcv_buf>,t<wmem_alloc>,tb<snd_buf>,
f<fwd_alloc>,w<wmem_queued>,o<opt_mem>,
bl<back_log>,d<sock_drop>)
<rmem_alloc>
the memory allocated for receiving packet
<rcv_buf>
the total memory can be allocated for receiving packet
<wmem_alloc>
the memory used for sending packet (which has been sent
to layer 3)
<snd_buf>
the total memory can be allocated for sending packet
<fwd_alloc>
the memory allocated by the socket as cache, but not used
for receiving/sending packet yet. If need memory to
send/receive packet, the memory in this cache will be
used before allocate additional memory.
<wmem_queued>
The memory allocated for sending packet (which has not
been sent to layer 3)
<opt_mem>
The memory used for storing socket option, e.g., the key
for TCP MD5 signature
<back_log>
The memory used for the sk backlog queue. On a process
context, if the process is receiving packet, and a new
packet is received, it will be put into the sk backlog
queue, so it can be received by the process immediately
<sock_drop>
the number of packets dropped before they are de-multi‐
plexed into the socket
Adding up all the skmem values for every socket, except rb and tb (since those are the maximum amounts that can be allocated) and d (which is a count of dropped packets), I should get a value very close to the one from /proc/net/sockstat. However, the value I get is 53k, which is nowhere near 17804k.
Is my logic correct? What am I missing here?
Answer 1
After some digging, I finally reached a conclusion.
My understanding of how to calculate TCP memory usage was correct.
For each socket, add up socket_memory = rmem_alloc + wmem_alloc + fwd_alloc + wmem_queued + opt_mem + back_log (the r, t, f, w, o and bl fields in skmem).
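That per-socket summation can be sketched like this; it parses skmem:(...) lines as printed by `ss -atmp` and adds the counted fields, skipping the rb/tb buffer limits and the d drop counter. The sample lines are hardcoded from the output above for illustration:

```python
import re

# A few skmem lines copied from the `ss -atmp` output above.
SS_SAMPLE = """\
skmem:(r0,rb12582912,t1280,tb12582912,f2816,w1280,o0,bl0,d0)
skmem:(r0,rb12582912,t0,tb12582912,f20480,w0,o0,bl0,d3)
skmem:(r0,rb12582912,t0,tb12582912,f8192,w0,o0,bl0,d1)
"""

# Fields that count toward allocated socket memory (bytes):
# r=rmem_alloc, t=wmem_alloc, f=fwd_alloc, w=wmem_queued,
# o=opt_mem, bl=back_log. rb/tb are limits and d is a drop count.
COUNTED = ("r", "t", "f", "w", "o", "bl")

def total_skmem_bytes(ss_output: str) -> int:
    total = 0
    for line in ss_output.splitlines():
        m = re.search(r"skmem:\((.*)\)", line)
        if not m:
            continue
        # Split "r0,rb12582912,..." into {"r": "0", "rb": "12582912", ...}.
        fields = dict(re.findall(r"([a-z]+)(\d+)", m.group(1)))
        total += sum(int(fields.get(k, "0")) for k in COUNTED)
    return total

print(total_skmem_bytes(SS_SAMPLE), "bytes")
```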
The reason the total socket memory I captured above doesn't add up is that many of the connections run inside Docker containers and don't appear in the host's ss output, but they do show up in the kernel's counters in /proc/net/sockstat.
More on this in this useful StackOverflow question: https://stackoverflow.com/questions/37171909/when-using-docker-builted-connections-dont-appear-in-netstat
This explains the discrepancy. For processes running only on the host, the memory sums do match.