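I have set up a VPS running Ubuntu 18.04.5 with a hosting provider, and I have recently been running into intermittent connection refusals.
Connections can work fine for a while, but at certain times roughly one third of them are refused.
I am a novice server administrator and have tried everything I can think of to solve this, unfortunately without much luck.
I wonder whether some low-level resource on the server is being temporarily exhausted, but I don't know what that might be or how to fix it or continue troubleshooting.
The problem shows up if I run tcpdump on the server and send requests from my local machine with netcat, for example:
On the server, whose IP is 196.189.91.144, I run the following command to filter traffic on port 443 coming from my local machine, whose public IP is 196.188.181.138: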
sudo tcpdump -i any -nn src 196.188.181.138 and port 443
On my local machine I run the following command:
while true; do nc -z -v 196.189.91.144 443; sleep 1; done
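To put a number on "roughly one third" and to get timestamps that can be matched against the tcpdump output, the loop above could be extended along these lines. This is only a sketch: the helper names `probe` and `tally`, the 3-second timeout and the file name `probe.log` are assumptions and not part of my original setup.

```shell
HOST=196.189.91.144
PORT=443

# Probe once and print a timestamped "ok" or "fail" line.
# The 3-second timeout (-w 3) is an arbitrary choice.
probe() {
  if nc -z -w 3 "$1" "$2" >/dev/null 2>&1; then
    echo "$(date '+%H:%M:%S') ok"
  else
    echo "$(date '+%H:%M:%S') fail"
  fi
}

# Summarise probe lines read from stdin.
tally() {
  awk '$2 == "ok" { ok++ } $2 == "fail" { fail++ }
       END { printf "ok=%d fail=%d\n", ok, fail }'
}

# Example run: 60 probes, one per second, logged and then summarised.
# for i in $(seq 60); do probe "$HOST" "$PORT"; sleep 1; done | tee probe.log | tally
```

Keeping the raw log makes it possible to see whether failures come in bursts or are spread evenly over time.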
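When the traffic is accepted, the output on the server typically reads as follows (four lines per request), showing the SYN, ACK, FIN, ACK sequence: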
12:46:55.760743 IP 196.188.181.138.63571 > 10.180.53.144.443: Flags [S], seq 1927709909, win 65535, options [mss 1200,nop,wscale 6,nop,nop,TS val 972740071 ecr 0,sackOK,eol], length 0
12:46:55.854751 IP 196.188.181.138.63571 > 10.180.53.144.443: Flags [.], ack 612994738, win 2060, options [nop,nop,TS val 972740101 ecr 1086816579], length 0
12:46:55.857747 IP 196.188.181.138.63571 > 10.180.53.144.443: Flags [F.], seq 0, ack 1, win 2060, options [nop,nop,TS val 972740103 ecr 1086816579], length 0
12:46:55.896751 IP 196.188.181.138.63571 > 10.180.53.144.443: Flags [.], ack 2, win 2060, options [nop,nop,TS val 972740204 ecr 1086816676], length 0
The output on my local machine then reads:
Connection to 196.189.91.144 port 443 [tcp/https] succeeded!
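When the connection problem occurs, the output on my local machine is:
nc: connectx to 196.189.91.144 port 443 (tcp) failed: Connection refused
and the output on the server is just consecutive lines with only the SYN flag set, for example: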
12:49:09.799827 IP 196.188.181.138.65191 > 10.180.53.144.443: Flags [S], seq 3362731211, win 65535, options [mss 1200,nop,wscale 6,nop,nop,TS val 972872973 ecr 0,sackOK,eol], length 0
12:49:11.080774 IP 196.188.181.138.65201 > 10.180.53.144.443: Flags [S], seq 1348134626, win 65535, options [mss 1200,nop,wscale 6,nop,nop,TS val 972873986 ecr 0,sackOK,eol], length 0
12:49:12.124781 IP 196.188.181.138.65203 > 10.180.53.144.443: Flags [S], seq 3076592770, win 65535, options [mss 1200,nop,wscale 6,nop,nop,TS val 972875232 ecr 0,sackOK,eol], length 0
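Note that the `src` filter used above only captures packets sent by the client, so whatever the server replies (a SYN-ACK or a RST) never appears in these captures. A rough way to summarise such a one-directional capture is sketched below. The function name `summarize_capture` is hypothetical, and the field positions assume the exact tcpdump line format shown in this post.

```shell
# Count client source ports that sent only a SYN versus ports that
# went on to send further packets (ACK/FIN), reading tcpdump text
# output from stdin. Field positions assume lines of the form:
#   TIME IP SRC.PORT > DST.PORT: Flags [S], ...
summarize_capture() {
  awk '
    $2 == "IP" && $6 == "Flags" {
      n = split($3, parts, ".")
      port = parts[n]                  # last dotted component is the source port
      seen[port] = 1
      if ($7 != "[S],") followed[port] = 1
    }
    END {
      for (p in seen) { if (p in followed) ok++; else syn_only++ }
      printf "completed=%d syn_only=%d\n", ok, syn_only
    }
  '
}

# To also see the replies, the capture filter could use "host" instead of "src":
#   sudo tcpdump -i any -nn host 196.188.181.138 and port 443
# Summarising a capture saved to a (hypothetical) file:
#   summarize_capture < capture.txt
```

If a two-way capture shows no RST leaving the server during a failure, the refusal is presumably generated somewhere between the client and the server rather than by the server itself.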
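The same problem occurs if I test connections to port 22 instead of 443, so it should be unrelated to the reverse proxy I have handling https requests.
I have tried temporarily disabling the firewall (sudo ufw disable), but it made no noticeable difference. The firewall is set up as follows: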
Status: active

To                         Action      From
--                         ------      ----
OpenSSH                    ALLOW       Anywhere
80/tcp                     ALLOW       Anywhere
443/tcp                    ALLOW       Anywhere
Nginx Full                 ALLOW       Anywhere
OpenSSH (v6)               ALLOW       Anywhere (v6)
80/tcp (v6)                ALLOW       Anywhere (v6)
443/tcp (v6)               ALLOW       Anywhere (v6)
Nginx Full (v6)            ALLOW       Anywhere (v6)
The server is also running fail2ban, but I don't see my IP address on any of its ban lists.
The iptables rules are as follows (sudo iptables -L):
Chain INPUT (policy DROP)
target prot opt source destination
f2b-sshd tcp -- anywhere anywhere multiport dports ssh
ufw-before-logging-input all -- anywhere anywhere
ufw-before-input all -- anywhere anywhere
ufw-after-input all -- anywhere anywhere
ufw-after-logging-input all -- anywhere anywhere
ufw-reject-input all -- anywhere anywhere
ufw-track-input all -- anywhere anywhere
Chain FORWARD (policy ACCEPT)
target prot opt source destination
ufw-before-logging-forward all -- anywhere anywhere
ufw-before-forward all -- anywhere anywhere
ufw-after-forward all -- anywhere anywhere
ufw-after-logging-forward all -- anywhere anywhere
ufw-reject-forward all -- anywhere anywhere
ufw-track-forward all -- anywhere anywhere
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
ufw-before-logging-output all -- anywhere anywhere
ufw-before-output all -- anywhere anywhere
ufw-after-output all -- anywhere anywhere
ufw-after-logging-output all -- anywhere anywhere
ufw-reject-output all -- anywhere anywhere
ufw-track-output all -- anywhere anywhere
Chain f2b-sshd (1 references)
target prot opt source destination
RETURN all -- anywhere anywhere
Chain ufw-after-forward (1 references)
target prot opt source destination
Chain ufw-after-input (1 references)
target prot opt source destination
ufw-skip-to-policy-input udp -- anywhere anywhere udp dpt:netbios-ns
ufw-skip-to-policy-input udp -- anywhere anywhere udp dpt:netbios-dgm
ufw-skip-to-policy-input tcp -- anywhere anywhere tcp dpt:netbios-ssn
ufw-skip-to-policy-input tcp -- anywhere anywhere tcp dpt:microsoft-ds
ufw-skip-to-policy-input udp -- anywhere anywhere udp dpt:bootps
ufw-skip-to-policy-input udp -- anywhere anywhere udp dpt:bootpc
ufw-skip-to-policy-input all -- anywhere anywhere ADDRTYPE match dst-type BROADCAST
Chain ufw-after-logging-forward (1 references)
target prot opt source destination
Chain ufw-after-logging-input (1 references)
target prot opt source destination
LOG all -- anywhere anywhere limit: avg 3/min burst 10 LOG level warning prefix "[UFW BLOCK] "
Chain ufw-after-logging-output (1 references)
target prot opt source destination
Chain ufw-after-output (1 references)
target prot opt source destination
Chain ufw-before-forward (1 references)
target prot opt source destination
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT icmp -- anywhere anywhere icmp destination-unreachable
ACCEPT icmp -- anywhere anywhere icmp time-exceeded
ACCEPT icmp -- anywhere anywhere icmp parameter-problem
ACCEPT icmp -- anywhere anywhere icmp echo-request
ufw-user-forward all -- anywhere anywhere
Chain ufw-before-input (1 references)
target prot opt source destination
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ufw-logging-deny all -- anywhere anywhere ctstate INVALID
DROP all -- anywhere anywhere ctstate INVALID
ACCEPT icmp -- anywhere anywhere icmp destination-unreachable
ACCEPT icmp -- anywhere anywhere icmp time-exceeded
ACCEPT icmp -- anywhere anywhere icmp parameter-problem
ACCEPT icmp -- anywhere anywhere icmp echo-request
ACCEPT udp -- anywhere anywhere udp spt:bootps dpt:bootpc
ufw-not-local all -- anywhere anywhere
ACCEPT udp -- anywhere 224.0.0.251 udp dpt:mdns
ACCEPT udp -- anywhere 239.255.255.250 udp dpt:1900
ufw-user-input all -- anywhere anywhere
Chain ufw-before-logging-forward (1 references)
target prot opt source destination
Chain ufw-before-logging-input (1 references)
target prot opt source destination
Chain ufw-before-logging-output (1 references)
target prot opt source destination
Chain ufw-before-output (1 references)
target prot opt source destination
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ufw-user-output all -- anywhere anywhere
Chain ufw-logging-allow (0 references)
target prot opt source destination
LOG all -- anywhere anywhere limit: avg 3/min burst 10 LOG level warning prefix "[UFW ALLOW] "
Chain ufw-logging-deny (2 references)
target prot opt source destination
RETURN all -- anywhere anywhere ctstate INVALID limit: avg 3/min burst 10
LOG all -- anywhere anywhere limit: avg 3/min burst 10 LOG level warning prefix "[UFW BLOCK] "
Chain ufw-not-local (1 references)
target prot opt source destination
RETURN all -- anywhere anywhere ADDRTYPE match dst-type LOCAL
RETURN all -- anywhere anywhere ADDRTYPE match dst-type MULTICAST
RETURN all -- anywhere anywhere ADDRTYPE match dst-type BROADCAST
ufw-logging-deny all -- anywhere anywhere limit: avg 3/min burst 10
DROP all -- anywhere anywhere
Chain ufw-reject-forward (1 references)
target prot opt source destination
Chain ufw-reject-input (1 references)
target prot opt source destination
Chain ufw-reject-output (1 references)
target prot opt source destination
Chain ufw-skip-to-policy-forward (0 references)
target prot opt source destination
ACCEPT all -- anywhere anywhere
Chain ufw-skip-to-policy-input (7 references)
target prot opt source destination
DROP all -- anywhere anywhere
Chain ufw-skip-to-policy-output (0 references)
target prot opt source destination
ACCEPT all -- anywhere anywhere
Chain ufw-track-forward (1 references)
target prot opt source destination
ACCEPT tcp -- anywhere anywhere ctstate NEW
ACCEPT udp -- anywhere anywhere ctstate NEW
Chain ufw-track-input (1 references)
target prot opt source destination
Chain ufw-track-output (1 references)
target prot opt source destination
ACCEPT tcp -- anywhere anywhere ctstate NEW
ACCEPT udp -- anywhere anywhere ctstate NEW
Chain ufw-user-forward (1 references)
target prot opt source destination
Chain ufw-user-input (1 references)
target prot opt source destination
ACCEPT tcp -- anywhere anywhere tcp dpt:ssh /* 'dapp_OpenSSH' */
ACCEPT tcp -- anywhere anywhere tcp dpt:http
ACCEPT tcp -- anywhere anywhere tcp dpt:https
ACCEPT tcp -- anywhere anywhere multiport dports http,https /* 'dapp_Nginx%20Full' */
Chain ufw-user-limit (0 references)
target prot opt source destination
LOG all -- anywhere anywhere limit: avg 3/min burst 5 LOG level warning prefix "[UFW LIMIT BLOCK] "
REJECT all -- anywhere anywhere reject-with icmp-port-unreachable
Chain ufw-user-limit-accept (0 references)
target prot opt source destination
ACCEPT all -- anywhere anywhere
Chain ufw-user-logging-forward (0 references)
target prot opt source destination
Chain ufw-user-logging-input (0 references)
target prot opt source destination
Chain ufw-user-logging-output (0 references)
target prot opt source destination
Chain ufw-user-output (1 references)
target prot opt source destination
The server does not have much RAM (memory usage is typically around 50%), but I assume that should not be a problem. The landscape-sysinfo output is:
System load:  1.02               Processes:           136
Usage of /:   41.0% of 28.47GB   Users logged in:     1
Memory usage: 47%                IP address for ens3: 10.180.53.144
Swap usage:   0%
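Regarding the idea of a low-level resource running out, two kernel-level things that might be worth looking at while the problem is happening are the connection-tracking table and the listen-queue overflow counters. The following is a minimal sketch under the assumption that the standard /proc paths and the `nstat` tool from iproute2 are available on the server; `pct_used` is a hypothetical helper, and I have not confirmed that either resource is actually involved here.

```shell
# Print how full a table is as a whole-number percentage.
# $1 = current count, $2 = maximum
pct_used() {
  echo $(( $1 * 100 / $2 ))
}

ct=/proc/sys/net/netfilter
if [ -r "$ct/nf_conntrack_count" ] && [ -r "$ct/nf_conntrack_max" ]; then
  count=$(cat "$ct/nf_conntrack_count")
  max=$(cat "$ct/nf_conntrack_max")
  echo "conntrack table: $(pct_used "$count" "$max")% used ($count of $max)"
else
  echo "conntrack counters not available (module not loaded?)"
fi

# Listen-queue counters; values that keep growing would point at a full accept queue.
if command -v nstat >/dev/null 2>&1; then
  nstat -az TcpExtListenOverflows TcpExtListenDrops || true
fi
```

Running this once during a good period and once during a bad period gives two snapshots to compare.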
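The hosting is located in a developing country, and in my experience it is hard to reach the service provider and/or get qualified help, so I am turning to the community in the hope that someone can point me in the right direction, as I have run out of ideas for how to troubleshoot this. Thanks!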