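I have a very tricky problem that I cannot solve. My goal is to separate the main routing table from the default gateway by putting the default gateway in a separate table (so that I can later have several default routes in several tables) and selecting it by policy.
For this simplified example I have only two interfaces: eth0.2 is the WAN (DSL with a public IP) and eth0.3 is the LAN: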
$ ip -4 addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
3: vrf_sonic: <NOARP,MASTER,UP,LOWER_UP> mtu 65575 qdisc noqueue state UP group default qlen 1000
inet 127.0.0.1/8 brd 127.255.255.255 scope host vrf_sonic
valid_lft forever preferred_lft forever
5: eth0.2@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master vrf_sonic state UP group default qlen 1000
inet 192.184.144.21/21 brd 192.184.151.255 scope global dynamic eth0.2
valid_lft 19730sec preferred_lft 19730sec
6: eth0.3@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
inet 10.227.79.2/24 brd 10.227.79.255 scope global eth0.3
valid_lft forever preferred_lft forever
Because the default route on eth0.2 is assigned dynamically via DHCP, I use a VRF on the interface/DHCP instance. As a result, my default route ends up in table 170:
$ sudo ip vrf
Name Table
-----------------------
vrf_sonic 170
$ sudo ip route show table 170
default nhid 38 via 192.184.144.1 dev eth0.2 proto static metric 20
broadcast 127.0.0.0 dev vrf_sonic proto kernel scope link src 127.0.0.1
127.0.0.0/8 dev vrf_sonic proto kernel scope link src 127.0.0.1
local 127.0.0.1 dev vrf_sonic proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev vrf_sonic proto kernel scope link src 127.0.0.1
broadcast 192.184.144.0 dev eth0.2 proto kernel scope link src 192.184.144.21
192.184.144.0/21 dev eth0.2 proto kernel scope link src 192.184.144.21
local 192.184.144.21 dev eth0.2 proto kernel scope host src 192.184.144.21
broadcast 192.184.151.255 dev eth0.2 proto kernel scope link src 192.184.144.21
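For reference, a VRF layout like the one shown above could be created by hand roughly as follows. This is only a sketch: the post does not say how the VRF was actually set up, and both the `setup_vrf` function and the `IP_CMD` override are hypothetical names introduced here so the commands can be printed instead of executed.

```shell
#!/bin/sh
# Sketch only: create a VRF bound to a routing table and enslave an interface.
# IP_CMD is a hypothetical override (not part of iproute2); set IP_CMD=echo
# to print the commands instead of running them.
IP_CMD="${IP_CMD:-ip}"

setup_vrf() {
    vrf="$1"; table="$2"; ifname="$3"
    # Create the VRF device and associate it with the routing table.
    $IP_CMD link add "$vrf" type vrf table "$table" &&
    # Bring the VRF device up.
    $IP_CMD link set dev "$vrf" up &&
    # Enslave the interface; its routes then live in the VRF table.
    $IP_CMD link set dev "$ifname" master "$vrf"
}
```

Running `setup_vrf vrf_sonic 170 eth0.2` as root would apply the commands; with `IP_CMD=echo` it only lists them.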
My main table will eventually hold a large number of OSPF routes, but for now it has only one entry:
$ sudo ip route show table main
10.227.79.0/24 dev eth0.3 proto kernel scope link src 10.227.79.2
My rules now look like this:
$ sudo ip rule
101: from all lookup local
102: from all lookup main
104: from all lookup vrf_sonic
1000: from all lookup [l3mdev-table]
2000: from all lookup [l3mdev-table] unreachable
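First the local table is consulted (= directly connected interfaces), then the main routing table, then vrf_sonic, which contains the default route (essentially a catch-all). Rules 1000/2000 are inserted automatically (because of the VRF) but are never reached because of the catch-all rule 104.
The firewall tables are all set to ACCEPT, the mangle table is empty, and the nat table is empty as well.
So far, so good. ICMP (ping) works: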
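One way to double-check which route the kernel picks under these rules is `ip route get`, once for an ordinary lookup and once scoped to the VRF. The following is a minimal sketch, not a definitive diagnostic; `check_lookup` and the `IP_CMD` override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch only: show the route chosen for a destination in both contexts.
# IP_CMD is a hypothetical override (not part of iproute2); set IP_CMD=echo
# to print the commands instead of running them.
IP_CMD="${IP_CMD:-ip}"

check_lookup() {
    dst="$1"; vrf="$2"
    echo "# default context"
    # Lookup as seen by a socket that is not bound to any VRF.
    $IP_CMD route get "$dst"
    echo "# VRF context: $vrf"
    # Same lookup, scoped to the VRF device.
    $IP_CMD route get "$dst" vrf "$vrf"
}
```

For the setup described here the call would be `check_lookup 142.250.189.164 vrf_sonic`; comparing the two results shows whether both contexts resolve to the same next hop and device.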
$ ping 142.250.189.164
PING 142.250.189.164 56(84) bytes of data.
64 bytes from 142.250.189.164: icmp_seq=1 ttl=113 time=4.78 ms
64 bytes from 142.250.189.164: icmp_seq=2 ttl=113 time=4.59 ms
^C
--- 142.250.189.164 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 4ms
rtt min/avg/max/mdev = 4.589/4.683/4.778/0.116 ms
$
This means that, as expected, table 170 is selected via rule 104 and the correct default route is used.
Now comes the crazy part:
$ telnet 142.250.189.164 80
Trying 142.250.189.164...
telnet: Unable to connect to remote host: Network is unreachable
$
That is WTF#1. But there is an even bigger WTF:
$ sudo ip vrf exec vrf_sonic /bin/telnet 142.250.189.164 80
Trying 142.250.189.164...
Connected to 142.250.189.164.
Escape character is '^]'.
GET / HTTP/1.0
[...]
stener("click",G)});}).call(this);</script></body></html>Connection closed by foreign host.
$
So in the context of the vrf_sonic VRF it works. Finally, I ran tcpdump in parallel with the (failing) telnet:
$ sudo tcpdump -n -i eth0.2 'host 142.250.189.164'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0.2, link-type EN10MB (Ethernet), capture size 262144 bytes
23:34:18.875675 IP 192.184.144.21.55124 > 142.250.189.164.80: Flags [S], seq 1179612779, win 64240, options [mss 1460,sackOK,TS val 333932630 ecr 0,nop,wscale 7], length 0
23:34:18.879471 IP 142.250.189.164.80 > 192.184.144.21.55124: Flags [S.], seq 3829270116, ack 1179612780, win 65535, options [mss 1412,sackOK,TS val 146008037 ecr 333932630,nop,wscale 8], length 0
23:34:18.879573 IP 192.184.144.21.55124 > 142.250.189.164.80: Flags [R], seq 1179612780, win 0, length 0
23:34:19.888499 IP 192.184.144.21.55124 > 142.250.189.164.80: Flags [S], seq 1179612779, win 64240, options [mss 1460,sackOK,TS val 333933643 ecr 0,nop,wscale 7], length 0
23:34:19.896597 IP 142.250.189.164.80 > 192.184.144.21.55124: Flags [S.], seq 3845107556, ack 1179612780, win 65535, options [mss 1412,sackOK,TS val 146009050 ecr 333933643,nop,wscale 8], length 0
23:34:19.896719 IP 192.184.144.21.55124 > 142.250.189.164.80: Flags [R], seq 1179612780, win 0, length 0
23:34:21.935744 IP 192.184.144.21.55124 > 142.250.189.164.80: Flags [S], seq 1179612779, win 64240, options [mss 1460,sackOK,TS val 333935690 ecr 0,nop,wscale 7], length 0
23:34:21.947596 IP 142.250.189.164.80 > 192.184.144.21.55124: Flags [S.], seq 3[...], ack 1179612780, win 65535, options [mss 1412,sackOK,TS val 146011097 ecr 333935690,nop,wscale 8], length 0
23:34:21.947717 IP 192.184.144.21.55124 > 142.250.189.164.80: Flags [R], seq 1179612780, win 0, length 0
^C
9 packets captured
9 packets received by filter
0 packets dropped by kernel
$
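Our host sends a SYN, the Google host replies with a SYN-ACK, and then our host sends... an RST?! I had to rub my eyes a few times to be sure I was seeing this correctly. WTF#3??
This is unbelievable. How can this be? What is going on here?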
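To make the SYN / SYN-ACK / RST pattern in the capture above easier to spot, the tcpdump output can be reduced to one direction and flag set per packet. This is a rough sketch that assumes tcpdump's default one-line IPv4 text format as shown above; `flag_sequence` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: reduce tcpdump one-line output to "src > dst flags".
# Assumes lines of the form: TIME IP SRC > DST: Flags [X], ...
flag_sequence() {
    awk '$2 == "IP" && $6 == "Flags" {
        dst = $5
        sub(/:$/, "", dst)          # strip the trailing colon from the destination
        flags = $7
        gsub(/[][,]/, "", flags)    # drop brackets and comma, leaving e.g. S. or R
        print $3, ">", dst, flags
    }'
}
```

It can be fed from a saved capture or from a live one, for example `sudo tcpdump -n -l -i eth0.2 'host 142.250.189.164' | flag_sequence`.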