HTTP port on a KVM guest unreachable after upgrading/rebooting the KVM host

Problem statement: After upgrading/rebooting the KVM host, the gitlab VM can no longer be reached over HTTP.

Topology: [topology diagram not reproduced]

Troubleshooting notes:

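The gitlab VM is reachable via SSH from every host in the topology. It is reachable via HTTP from itself and from multiple interfaces on the KVM host (wspbm), but not from the PC in the topology. iptables rules were added to the INPUT and FORWARD chains to allow everything, with no change.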

SSH from the PC to the gitlab VM:

[root@los-alamos]$ ssh gitlab ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:52:92:86 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.86/24 brd 192.168.1.255 scope global ens3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe52:9286/64 scope link 
       valid_lft forever preferred_lft forever
[root@los-alamos]$ 

Netcat commands from the KVM host wspbm:

[root@wspbm]# nc -s 192.168.1.17 192.168.1.86 80
get
HTTP/1.1 400 Bad Request
Server: nginx
Date: Sat, 11 Aug 2018 14:13:36 GMT
Content-Type: text/html
Content-Length: 166
Connection: close

<html>
<head><title>400 Bad Request</title></head>
<body bgcolor="white">
<center><h1>400 Bad Request</h1></center>
<hr><center>nginx</center>
</body>
</html>
[root@wspbm]# nc -s 192.168.5.1 gitlab 80
get
HTTP/1.1 400 Bad Request
Server: nginx
Date: Sat, 11 Aug 2018 14:13:41 GMT
Content-Type: text/html
Content-Length: 166
Connection: close

<html>
<head><title>400 Bad Request</title></head>
<body bgcolor="white">
<center><h1>400 Bad Request</h1></center>
<hr><center>nginx</center>
</body>
</html>
[root@wspbm]#

Telnet from the gitlab host itself:

root@gitlab:~# telnet 192.168.1.86 80
Trying 192.168.1.86...
Connected to 192.168.1.86.
Escape character is '^]'.
get
HTTP/1.1 400 Bad Request
Server: nginx
Date: Sat, 11 Aug 2018 16:32:27 GMT
Content-Type: text/html
Content-Length: 166
Connection: close

<html>
<head><title>400 Bad Request</title></head>
<body bgcolor="white">
<center><h1>400 Bad Request</h1></center>
<hr><center>nginx</center>
</body>
</html>
Connection closed by foreign host.
root@gitlab:~# 

Even though we get a 400, this proves the application is up and responding.
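The 400 is expected, since a bare `get` is not a valid HTTP request line. A minimal sketch of sending a well-formed request instead is shown below; `http_request` is a hypothetical helper name introduced only for illustration, and whether the server wants additional headers (such as `Host`) is an assumption I have not checked against this GitLab setup.

```shell
# Hypothetical helper: print a minimal HTTP/1.0 request for the given path.
# HTTP requires CRLF line endings and a blank line to end the header block.
http_request() {
    path="${1:-/}"
    printf 'GET %s HTTP/1.0\r\n\r\n' "$path"
}

# Usage (not run here): pipe the request into netcat, e.g.
#   http_request / | nc 192.168.1.86 80
```

How netcat behaves once its input ends differs between variants, so the pipe may need an extra option on some systems.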

Running tcpdump shows that the KVM host wspbm sees a reset coming from the gitlab KVM guest:

[root@wspbm]# date ; tcpdump -vvv -i any host 192.168.1.86 and host 192.168.1.33 and port 80
Sat Aug 11 10:07:45 MDT 2018
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
10:07:47.940406 IP (tos 0x10, ttl 64, id 3351, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.1.33.39392 > gitlab.wsp.local.http: Flags [S], cksum 0x49b5 (correct), seq 475015689, win 29200, options [mss 1460,sackOK,TS val 13313822 ecr 0,nop,wscale 7], length 0
10:07:47.940492 IP (tos 0x10, ttl 64, id 36097, offset 0, flags [DF], proto TCP (6), length 40)
    gitlab.wsp.local.http > 192.168.1.33.39392: Flags [R.], cksum 0x4b7e (correct), seq 0, ack 475015690, win 0, length 0
10:07:47.940501 IP (tos 0x10, ttl 64, id 36097, offset 0, flags [DF], proto TCP (6), length 40)
    gitlab.wsp.local.http > 192.168.1.33.39392: Flags [R.], cksum 0x4b7e (correct), seq 0, ack 1, win 0, length 0 (NOT SURE WHY A SECOND RESET IS SEEN)
^C
3 packets captured
3 packets received by filter
0 packets dropped by kernel
[root@wspbm]# 

The host that sends the SYN sees only one reset:

root@los-alamos:~# date ; tcpdump -vvv -i enp3s0 host 192.168.1.86 and port 80
Sat Aug 11 10:07:47 MDT 2018
tcpdump: listening on enp3s0, link-type EN10MB (Ethernet), capture size 262144 bytes
10:07:47.919351 IP (tos 0x10, ttl 64, id 3351, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.1.33.39392 > gitlab.wsp.local.http: Flags [S], cksum 0x49b5 (correct), seq 475015689, win 29200, options [mss 1460,sackOK,TS val 13313822 ecr 0,nop,wscale 7], length 0
10:07:47.919838 IP (tos 0x10, ttl 64, id 36097, offset 0, flags [DF], proto TCP (6), length 40)
    gitlab.wsp.local.http > 192.168.1.33.39392: Flags [R.], cksum 0x4b7e (correct), seq 0, ack 475015690, win 0, length 0
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel
root@los-alamos:~# 

But the gitlab KVM guest never sees any traffic:

root@gitlab:~# date ; tcpdump -vvv -i ens3 host 192.168.1.33 and dst port 80
Sat Aug 11 10:07:46 MDT 2018
tcpdump: listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
root@gitlab:~# 

After adding accept-all rules to the INPUT and FORWARD iptables chains, the results above are the same:

[root@wspbm]# iptables -L | egrep '(INPUT|FORWARD)' -A 2
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            
--
Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            
[root@wspbm]# 

Thanks in advance for any suggestions. I am happy to provide any other output.

Edit: 8/16/18

I started an nginx docker container listening on port 80 on the KVM host OS (not inside the gitlab VM), purely as a test. The service is reachable from the KVM host:

[root@wspbm]# docker run --name nginx -d -p 80:80 nginx
f8e91cfec019e42354b4c3d7dac09947bb3b7f6ba6f75c2965b1524d6dc69e4a
[root@wspbm]# telnet 192.168.1.17 80
Trying 192.168.1.17...
Connected to 192.168.1.17.
Escape character is '^]'.
^]
telnet> quit
Connection closed.
[root@wspbm]# 

But it is not reachable from the desktop in the topology:

[root@los-alamos]$ telnet 192.168.1.17 80
Trying 192.168.1.17...
telnet: Unable to connect to remote host: Connection refused
[root@los-alamos]$ 

Edit: 8/17/18

I changed the port used on the gitlab VM from 80 to 8888, and the service is now reachable from everywhere.

Gitlab is listening on 8888:

root@gitlab:~# lsof -i :8888
COMMAND  PID       USER   FD   TYPE  DEVICE SIZE/OFF NODE NAME
nginx   3255       root    7u  IPv4 5570784      0t0  TCP *:8888 (LISTEN)
nginx   3256 gitlab-www    7u  IPv4 5570784      0t0  TCP *:8888 (LISTEN)
nginx   3257 gitlab-www    7u  IPv4 5570784      0t0  TCP *:8888 (LISTEN)
root@gitlab:~# 

Connection attempt from the desktop:

root@los-alamos:~# telnet gitlab 8888
Trying 192.168.1.86...
Connected to gitlab.wsp.local.
Escape character is '^]'.
get
HTTP/1.1 400 Bad Request
Server: nginx
Date: Fri, 17 Aug 2018 11:36:42 GMT
Content-Type: text/html
Content-Length: 166
Connection: close

<html>
<head><title>400 Bad Request</title></head>
<body bgcolor="white">
<center><h1>400 Bad Request</h1></center>
<hr><center>nginx</center>
</body>
</html>
Connection closed by foreign host.
root@los-alamos:~# 

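While this unblocks the side project I am working on, I don't feel it really resolves the underlying problem. If anyone has suggestions for making port 80 on the VM reachable, I am happy to keep this question open and continue troubleshooting, in case the solution is useful to someone else in the future.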

Answer 1

Figured out what was going on here. A redirect was happening in the nat table in iptables:

[root@wspbm]# iptables -L --line-numbers -t nat
Chain PREROUTING (policy ACCEPT)
num  target     prot opt source               destination         
1    REDIRECT   tcp  --  anywhere             anywhere             tcp dpt:http redir ports 3129

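With the rule above removed, everything started working. I think what happened is that a side project using squid was shelved and its rules were removed from the running configuration, but iptables-persist had not forgotten my redirect rule, so the reboot brought it back and gave me a headache! That would also fit the symptoms: port 80 traffic arriving at the host from other machines was redirected to local port 3129, where presumably nothing was listening any more, so the host answered with a reset and the guest never saw the SYN. Connections made from the host itself do not pass through PREROUTING, which is why they kept working.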

[root@wspbm]# cat /etc/iptables/rules.v4 | grep 3129
-A PREROUTING -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 3129
[root@wspbm]#
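Grepping for the target port 3129 only works once the redirect is already known. A rough sketch of searching a saved rules file by the affected destination port instead is shown below; `find_port_redirects` is a hypothetical helper name, and it assumes the file is in `iptables-save` format, where ports are written numerically.

```shell
# Hypothetical helper: list saved rules that REDIRECT or DNAT traffic
# for a given destination port.
# $1: rules file in iptables-save format, $2: destination port number.
find_port_redirects() {
    rules_file="$1"
    port="$2"
    # First filter on the exact port (so 80 does not match 8080),
    # then keep only rules whose target rewrites the destination.
    grep -E -e "--dport ${port}( |\$)" "$rules_file" |
        grep -E -e '-j (REDIRECT|DNAT)( |$)'
}

# Usage (not run here):
#   find_port_redirects /etc/iptables/rules.v4 80
```

Note that `iptables -L` without `-t` lists only the filter table, so rules like this one stay hidden unless the nat table is requested explicitly.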

To fix this, I removed the live rule:

[root@wspbm]# iptables -t nat -D PREROUTING -p tcp --dport 80 -j REDIRECT --to 3129
[root@wspbm]# 
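Deleting by rule specification requires the arguments to match the live rule. One way to avoid retyping them, sketched here under the assumption that the rule appears as an `-A` line in `iptables-save` output, is to derive the `-D` command from that line. `rule_to_delete_cmd` is a hypothetical helper name; it only prints the command and never runs it.

```shell
# Hypothetical helper: turn an "-A CHAIN ..." line from iptables-save output
# into the matching "iptables -t TABLE -D CHAIN ..." command.
# $1: table name (e.g. nat), $2: the saved rule line.
rule_to_delete_cmd() {
    table="$1"
    rule="$2"
    case "$rule" in
        "-A "*)
            # Strip the leading "-A " and prepend the delete form.
            printf 'iptables -t %s -D %s\n' "$table" "${rule#-A }"
            ;;
        *)
            echo "not an append rule: $rule" >&2
            return 1
            ;;
    esac
}

# Usage (not run here):
#   rule_to_delete_cmd nat '-A PREROUTING -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 3129'
```

The printed command should be reviewed before it is run as root, especially for rules that contain quoted arguments such as comments.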

Then I updated the stored rules from the active rules, which no longer include the offending redirect rule:

[root@wspbm]# cat /etc/iptables/rules.v4 | grep 3129
-A PREROUTING -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 3129
[root@wspbm]# cd /etc/iptables/
[root@wspbm]# cp rules.v4 rules.v4.08212018
[root@wspbm]# iptables-save > rules.v4
[root@wspbm]# cat rules.v4 | grep 3129
[root@wspbm]# 
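The backup-then-save steps above can be wrapped up roughly as follows. This is only a sketch: `backup_rules` is a hypothetical helper name, and the `MMDDYYYY` suffix simply mirrors the `rules.v4.08212018` naming used in the transcript.

```shell
# Hypothetical helper: copy a rules file to "<file>.<MMDDYYYY>" before it
# is overwritten, and print the backup path on success.
# $1: path to the rules file.
backup_rules() {
    rules_file="$1"
    backup="${rules_file}.$(date +%m%d%Y)"
    cp -p -- "$rules_file" "$backup" && printf '%s\n' "$backup"
}

# Usage (as root, not run here):
#   backup_rules /etc/iptables/rules.v4 && iptables-save > /etc/iptables/rules.v4
```

Chaining with `&&` means the stored rules are overwritten only if the backup copy succeeded.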

All connections now work as expected. Marking this issue as PEBCAK.
