First, here is what my infrastructure looks like and how it works:

Controller1/2 and Compute1/2 all run VMs and are interconnected over a VPN. On each server, the ext interface (the VPN interface) is plugged into the br-ext bridge. All servers can reach one another, and the VMs can communicate over their private interfaces.

I have two Ubuntu 16.04 routers (two boxes with eth3 and br-ext); only one is active at a time (the second is a failover using keepalived). The active router holds both the public subnet (51.38.X.Y/27) and the IP 10.38.166.190 (which acts as the gateway for all VMs).

I use iptables and iproute2 to map traffic so that, say, 51.38.X.YYA reaches 10.38.X.YYA, and traffic from 10.38.X.YYA leaves via 51.38.X.YYA.

From one of the VMs I can reach the outside without any problem, and if I run curl ifconfig.co I get my public IP back, which is the behavior I want.

My problem:

If I try to reach VM2 from VM1 using its public IP, it does not work at all.

I will illustrate the problem with two VMs and provide all the relevant configuration:

VM1: 10.38.166.167 / 51.38.166.167
VM2: 10.38.166.166 / 51.38.166.166

What I have done so far:

On router1:

eth1 = main interface (management)
eth3 = interface holding all the public IPs that are NATed to the VMs
br-ext = bridge containing the VPN interface
ext = VPN interface (plugged into the bridge br-ext)
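The bridge wiring described above could be reproduced with iproute2 along these lines (a sketch only; addresses and interface names are taken from the listing below, and the VPN interface ext is assumed to already exist):

```shell
# Create the external bridge and plug the VPN interface into it
ip link add br-ext type bridge
ip link set ext master br-ext
ip link set ext up
ip link set br-ext up

# The bridge carries the VMs' gateway address
ip addr add 10.38.166.190/32 dev br-ext
```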
[root@network3] ~# ip a l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:19:3e:41 brd ff:ff:ff:ff:ff:ff
inet 51.38.166.162/32 brd 51.38.x.162 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe19:3e41/64 scope link
valid_lft forever preferred_lft forever
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:72:94:cb brd ff:ff:ff:ff:ff:ff
inet 51.38.166.163/32 brd 51.38.x.163 scope global eth3
valid_lft forever preferred_lft forever
inet 51.38.166.166/32 scope global eth3
valid_lft forever preferred_lft forever
inet 51.38.166.167/32 scope global eth3
valid_lft forever preferred_lft forever
7: br-ext: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether d2:f8:64:36:64:f2 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.103/9 brd 10.127.255.255 scope global br-ext
valid_lft forever preferred_lft forever
inet 10.0.0.120/32 scope global br-ext
valid_lft forever preferred_lft forever
inet 10.38.166.190/32 scope global br-ext
valid_lft forever preferred_lft forever
10: ext: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-ext state UNKNOWN group default qlen 1000
link/ether d2:f8:64:36:64:f2 brd ff:ff:ff:ff:ff:ff
I have set up a set of routing rules so that packets coming from outside for 51.38.x.160/27 are routed to 10.38.x.y/27:
[root@network3] ~# ip ru l | grep "lookup 103"
9997: from 10.38.x.167 lookup 103
9998: from 10.38.x.166 lookup 103
# rules to tells that each IP of the /27 need to use table 103
10301: from 51.38.166.163 lookup 103
10302: from all to 51.38.166.163 lookup 103
10307: from 51.38.166.166 lookup 103
10308: from all to 51.38.166.166 lookup 103
10309: from 51.38.166.167 lookup 103
10310: from all to 51.38.166.167 lookup 103
[root@network3] ~# ip r s table 103
default via 51.38.166.190 dev eth3
51.38.166.160/27 dev eth3 scope link
[root@network3] ~# ip r s
default via 51.38.166.190 dev eth1 onlink
10.0.0.0/9 dev br-ext proto kernel scope link src 10.0.0.103
172.16.0.0/16 dev br-manag proto kernel scope link src 172.16.0.103
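For reference, the rules and table shown above could be created with commands along these lines (a sketch; the priorities match the ip ru l output, and only the rules for the two test VMs are shown):

```shell
# Replies sourced from the VMs' private IPs use table 103
ip rule add from 10.38.166.167 lookup 103 priority 9997
ip rule add from 10.38.166.166 lookup 103 priority 9998

# Each public IP of the /27 uses table 103 in both directions
ip rule add from 51.38.166.166 lookup 103 priority 10307
ip rule add to 51.38.166.166 lookup 103 priority 10308
ip rule add from 51.38.166.167 lookup 103 priority 10309
ip rule add to 51.38.166.167 lookup 103 priority 10310

# Table 103: the /27 is on-link via eth3, everything else goes
# through the public gateway
ip route add 51.38.166.160/27 dev eth3 scope link table 103
ip route add default via 51.38.166.190 dev eth3 table 103
```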
My iptables rules look like this:
[root@network3] ~# iptables -nvL
Chain INPUT (policy ACCEPT 21334 packets, 1015K bytes)
pkts bytes target prot opt in out source destination
91877 4376K ACCEPT icmp -- * * 0.0.0.0/0 0.0.0.0/0 /* 000 accept all icmp */
18 1564 ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0 /* 001 accept all to lo interface */
0 0 REJECT all -- !lo * 0.0.0.0/0 127.0.0.0/8 /* 002 reject local traffic not on loopback interface */ reject-with icmp-port-unreachable
343K 123M ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state ESTABLISHED /* 003 accept related established rules */
243 14472 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 multiport dports 1022 /* 030 allow SSH */
481M 42G ACCEPT udp -- * * 0.0.0.0/0 0.0.0.0/0 multiport dports 3210:3213 /* 031 allow VPNtunnel */
4155 241K DROP all -- eth0 * 0.0.0.0/0 0.0.0.0/0 /* 999 drop all */
Chain FORWARD (policy ACCEPT 98325 packets, 8874K bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 964M packets, 93G bytes)
pkts bytes target prot opt in out source destination
The iptables NAT rules:
[root@network3] ~# iptables -t nat -nvL --line
Chain PREROUTING (policy ACCEPT 156K packets, 6455K bytes)
num pkts bytes target prot opt in out source destination
31 11228 771K DNAT all -- * * 0.0.0.0/0 51.38.166.166 /* 112 NAT for 10.38.166.166 */ to:10.38.166.166
32 11624 809K DNAT all -- * * 0.0.0.0/0 51.38.166.167 /* 112 NAT for 10.38.166.167 */ to:10.38.166.167
Chain INPUT (policy ACCEPT 85077 packets, 3527K bytes)
num pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 16505 packets, 1294K bytes)
num pkts bytes target prot opt in out source destination
Chain POSTROUTING (policy ACCEPT 105K packets, 4357K bytes)
num pkts bytes target prot opt in out source destination
31 17 1196 SNAT all -- * * 10.38.166.166 0.0.0.0/0 to:51.38.166.166
32 8 549 SNAT all -- * * 10.38.166.167 0.0.0.0/0 to:51.38.166.167
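The 1:1 NAT pair for one VM can be recreated with rules along these lines (a sketch matching the entries shown above for VM1):

```shell
# DNAT: traffic to the public IP is rewritten to the VM's private IP
iptables -t nat -A PREROUTING -d 51.38.166.167 \
  -m comment --comment "112 NAT for 10.38.166.167" \
  -j DNAT --to-destination 10.38.166.167

# SNAT: traffic from the VM's private IP leaves with its public IP
iptables -t nat -A POSTROUTING -s 10.38.166.167 \
  -j SNAT --to-source 51.38.166.167
```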
I also inserted a few rules into the raw table to help me trace packets:
[root@network3] ~# iptables -t raw -nvL
Chain PREROUTING (policy ACCEPT 3765 packets, 227K bytes)
pkts bytes target prot opt in out source destination
0 0 TRACE all -- * * 51.38.166.167 0.0.0.0/0
185 12988 TRACE all -- * * 0.0.0.0/0 51.38.166.167
Chain OUTPUT (policy ACCEPT 7941 packets, 837K bytes)
pkts bytes target prot opt in out source destination
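The TRACE rules above can be added as follows (note: on some kernels the netfilter log backend must be selected explicitly before TRACE output appears in kern.log):

```shell
# Trace every packet from/to VM1's public IP through the tables
iptables -t raw -A PREROUTING -s 51.38.166.167 -j TRACE
iptables -t raw -A PREROUTING -d 51.38.166.167 -j TRACE

# If nothing shows up in kern.log, select the legacy IPv4 log backend
sysctl net.netfilter.nf_log.2=nf_log_ipv4
```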
Testing from VM1:
ubuntu@test-1:~$ ip a l dev ens3
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:51:0a:0b brd ff:ff:ff:ff:ff:ff
inet 10.38.166.167/24 brd 10.38.166.255 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe51:a0b/64 scope link
valid_lft forever preferred_lft forever
ubuntu@test-1:~$ curl ifconfig.co
51.38.166.167
ubuntu@test-1:~$ ping 51.38.166.166 -c 4
PING 51.38.166.166 (51.38.166.166) 56(84) bytes of data.
--- 51.38.166.166 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3031ms
Testing from VM2:
ubuntu@test-2:~$ ip a l dev ens3
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:9d:79:ce brd ff:ff:ff:ff:ff:ff
inet 10.38.166.166/24 brd 10.38.166.255 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe9d:79ce/64 scope link
valid_lft forever preferred_lft forever
ubuntu@test-2:~$ curl ifconfig.co
51.38.166.166
ubuntu@test-2:~$ ping 51.38.166.167 -c 4
PING 51.38.166.167 (51.38.166.167) 56(84) bytes of data.
--- 51.38.166.167 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3023ms
Logs from network3:
[root@network3] ~# tail -f /var/log/kern.log | grep "SRC=10.38.166.166 DST=51.38.166.167"
Jul 5 11:58:12 network3 kernel: [79540.314496] TRACE: nat:PREROUTING:rule:32 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49094 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=57
Jul 5 11:58:13 network3 kernel: [79541.322501] TRACE: raw:PREROUTING:policy:3 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49203 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=58
Jul 5 11:58:13 network3 kernel: [79541.322543] TRACE: mangle:PREROUTING:policy:1 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49203 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=58
Jul 5 11:58:13 network3 kernel: [79541.322574] TRACE: nat:PREROUTING:rule:32 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49203 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=58
Jul 5 11:58:14 network3 kernel: [79542.330582] TRACE: raw:PREROUTING:policy:3 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49367 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=59
Jul 5 11:58:14 network3 kernel: [79542.330615] TRACE: mangle:PREROUTING:policy:1 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49367 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=59
Jul 5 11:58:14 network3 kernel: [79542.330639] TRACE: nat:PREROUTING:rule:32 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49367 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=59
^C
Since the ID for a given SEQ does not change, I can search the logs for everything related to this ID/SEQ:
[root@network3] ~# grep "ID=49367" /var/log/kern.log
Jul 5 11:58:14 network3 kernel: [79542.330582] TRACE: raw:PREROUTING:policy:3 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49367 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=59
Jul 5 11:58:14 network3 kernel: [79542.330615] TRACE: mangle:PREROUTING:policy:1 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49367 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=59
Jul 5 11:58:14 network3 kernel: [79542.330639] TRACE: nat:PREROUTING:rule:32 IN=br-ext OUT= MAC=de:01:31:2d:47:18:fa:16:3e:9d:79:ce:08:00 SRC=10.38.166.166 DST=51.38.166.167 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=49367 DF PROTO=ICMP TYPE=8 CODE=0 ID=4992 SEQ=59
If I refer to this diagram: http://inai.de/images/nf-packet-flow.png

The packet seems to get stuck at the routing decision. (I have ruled out it getting stuck at the bridging decision, because the behavior is exactly the same if I do the very same thing without any bridge involved.)

Another possibility is that it matches NAT PREROUTING rule 32 but the rule is not actually applied, though I don't see why that would happen.

Is there some clue I am missing here?
Answer 1

The most common reason for a packet being dropped at the routing decision is rp_filter.

Check the output of the command ip route get 51.38.166.167 from 10.38.166.166 iif br-ext. Normally it should return a valid route. An "invalid cross-device link" result means the packet is being dropped by rp_filter. Also check the output of nstat -az TcpExtIPReversePathFilter; it is a counter of packets dropped this way.

Check the current rp_filter mode with the ip netconf show dev br-ext command, and use sysctl to adjust this parameter.
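Putting the answer's checks together, a diagnostic sequence might look like this (a sketch; 0/1/2 are the documented rp_filter values: off, strict, loose):

```shell
# 1. Ask the kernel how it would route the hairpinned packet
ip route get 51.38.166.167 from 10.38.166.166 iif br-ext

# 2. Counter of packets already dropped by the reverse-path filter
nstat -az TcpExtIPReversePathFilter

# 3. Current rp_filter mode on the ingress interface
ip netconf show dev br-ext

# 4. Relax rp_filter to loose mode. The kernel uses the maximum of
#    the "all" and per-interface values, so "all" must be lowered too.
sysctl -w net.ipv4.conf.all.rp_filter=2
sysctl -w net.ipv4.conf.br-ext.rp_filter=2
```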