Using Linux as a router with nftables - masquerade not forwarding replies

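I have a Linux machine that I use as a router. It has 5 network interfaces: three separate LANs that it routes between, and two WANs. At the moment only one WAN is in use, as the default route, and the other WAN does essentially nothing; I have been trying for years to get both WANs working using iptables and ip rules, without success.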

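The problem I'm having is this: when I try to route a ping via WAN 2 (which requires NAT), the ping goes from my client host to the Linux machine, the Linux machine forwards it correctly out via WAN 2, and it sees the reply come back, but it then does not forward the reply back to my client machine. Despite a lot of searching and reading related questions, I have not been able to work out why the reply is not forwarded back. (WAN 1 doesn't need NAT, because that is done on an external router.)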

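A few days ago I switched from iptables to nftables because it a) makes the configuration easier to read, and b) actually lets me trace rule evaluation so I can see what is happening. With that, I now feel I have enough to go on to ask this question.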


table ip filter {
    chain INPUT {
            type filter hook input priority 0; policy accept;

            ip protocol icmp counter meta nftrace set 1

            # allow loopback
            iifname "lo" accept

            # allow established/related connections
            ct state {established, related} accept

            # allow ping
            ip protocol icmp accept

            # accept anything from local networks
            ip saddr {
          , # lan1
          , # routed through lan1
          , # routed through lan1
          , # lan2
          , # lan3
            } accept

            # ntp exploit protection
            udp sport ntp ct state {invalid, related, new, untracked} counter drop

            # accept SSH from anyone else
            ct state new tcp dport ssh accept

            # drop all other packets
            counter drop
    }

    chain FORWARD {
            type filter hook forward priority 0; policy accept;

            ip protocol icmp counter meta nftrace set 1

            # drop anything to old local network
            ip daddr counter drop

            # accept all other packets
            counter accept
    }

    chain OUTPUT {
            type filter hook output priority 0; policy accept;

            # ntp exploit protection
            udp dport ntp ct state {invalid, related, untracked} counter drop
    }
}

table ip mangle {
    chain FORWARD {
            type filter hook forward priority -150; policy accept;

            ip protocol icmp counter meta nftrace set 1
    }

    chain OUTPUT {
            type filter hook output priority -150; policy accept;

            # send replies to WAN->HERE connections via the same route as where they were initiated from
            ct state related,established meta mark set ct mark
    }

    chain PREROUTING {
            type filter hook prerouting priority -150; policy accept;

            # trace ALL packets coming from enp6s0 (WAN 2)
            iifname enp6s0 counter meta nftrace set 1

            # send subsequent packets on forwarded connections via the same route as when they were initiated
            ct state related,established meta mark set ct mark

            # trace all packets with a packet mark
            meta mark != 0x0 counter meta nftrace set 1

            # all further processing is for new connections only - so everything else returns here
            ct state != new return

            # any new WAN->LAN connections from enp6s0 (WAN 2) go into route 3, for the initial and subsequent packets
            # the return on the end ensures we don't do any further processing, which checks outbound protocols
            iifname enp6s0 ct mark set 0x3 meta mark set 0x3 return

            # any new WAN->LAN connections from enp4s0 (WAN 1) shouldn't do further processing either
            iifname enp4s0 return

            # everything from this point onwards is for new outgoing LAN->WAN connections only

            # for testing - route specific protocols through WAN 2
            #tcp dport 443 ct mark set 0x3 meta mark set 0x3
            #tcp dport 80 ct mark set 0x3 meta mark set 0x3
            ip protocol icmp ct mark set 0x3 meta mark set 0x3 counter meta nftrace set 1
    }
}

table ip nat {
    chain POSTROUTING {
            type nat hook postrouting priority 100; policy accept;

            oifname enp6s0 counter meta nftrace set 1 masquerade
    }
}

ip -4 addr: (enp4s0 is WAN 1, enp6s0 is WAN 2, the others are LANs)

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet scope host lo
       valid_lft forever preferred_lft forever
2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet brd scope global enp4s0
       valid_lft forever preferred_lft forever
3: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet brd scope global enp5s0
       valid_lft forever preferred_lft forever
4: enp6s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet (redacted).117/22 brd scope global enp6s0
       valid_lft forever preferred_lft forever
5: enp7s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    inet brd scope global enp7s0
       valid_lft forever preferred_lft forever
6: enp8s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    inet brd scope global enp8s0
       valid_lft forever preferred_lft forever

ip route

default via dev enp4s0
(redacted).0/22 dev enp6s0 proto kernel scope link src (redacted).117 metric 204 mtu 1500
dev enp8s0 proto kernel scope link src
via dev enp8s0
dev enp5s0 proto kernel scope link src
dev enp7s0 proto kernel scope link src linkdown
dev enp4s0 proto kernel scope link src

ip route show table 3

default via (redacted).1 dev enp6s0
(redacted).1 dev enp6s0 scope link src (redacted).117
dev enp8s0 proto kernel scope link src
via dev enp8s0
dev enp5s0 proto kernel scope link src
dev enp7s0 proto kernel scope link src linkdown
dev enp4s0 proto kernel scope link src

ip rule

0:      from all lookup local
32764:  from all fwmark 0x3 lookup 3
32765:  from (redacted).117 lookup 3
32766:  from all lookup main
32767:  from all lookup default

Now for the interesting part. Here is the output of nft monitor trace when I ping from my client (Windows) PC:

trace id 8e85e085 ip mangle PREROUTING packet: iif "enp8s0" ether saddr dc:9f:db:16:42:b5 ether daddr 38:ea:a7:ab:f8:bc ip saddr ip daddr ip dscp cs0 ip ecn not-ect ip ttl 127 ip id 4170 ip length 60 icmp type echo-request icmp code 0 icmp id 1 icmp sequence 779
trace id 8e85e085 ip mangle PREROUTING rule ip protocol icmp ct mark set 0x00000003 mark set 0x00000003 counter packets 0 bytes 0 nftrace set 1 (verdict continue)
trace id 8e85e085 ip mangle PREROUTING verdict continue mark 0x00000003
trace id 8e85e085 ip mangle PREROUTING mark 0x00000003
trace id 8e85e085 ip mangle FORWARD packet: iif "enp8s0" oif "enp6s0" ether saddr dc:9f:db:16:42:b5 ether daddr 38:ea:a7:ab:f8:bc ip saddr ip daddr ip dscp cs0 ip ecn not-ect ip ttl 126 ip id 4170 ip length 60 icmp type echo-request icmp code 0 icmp id 1 icmp sequence 779
trace id 8e85e085 ip mangle FORWARD rule ip protocol icmp counter packets 0 bytes 0 nftrace set 1 (verdict continue)
trace id 8e85e085 ip mangle FORWARD verdict continue mark 0x00000003
trace id 8e85e085 ip mangle FORWARD mark 0x00000003
trace id 8e85e085 ip filter FORWARD packet: iif "enp8s0" oif "enp6s0" ether saddr dc:9f:db:16:42:b5 ether daddr 38:ea:a7:ab:f8:bc ip saddr ip daddr ip dscp cs0 ip ecn not-ect ip ttl 126 ip id 4170 ip length 60 icmp type echo-request icmp code 0 icmp id 1 icmp sequence 779
trace id 8e85e085 ip filter FORWARD rule ip protocol icmp counter packets 0 bytes 0 nftrace set 1 (verdict continue)
trace id 8e85e085 ip filter FORWARD rule counter packets 8 bytes 452 accept (verdict accept)
trace id 8e85e085 ip nat POSTROUTING packet: oif "enp6s0" ip saddr ip daddr ip dscp cs0 ip ecn not-ect ip ttl 126 ip id 4170 ip length 60 icmp type echo-request icmp code 0 icmp id 1 icmp sequence 779
trace id 8e85e085 ip nat POSTROUTING rule oifname "enp6s0" counter packets 0 bytes 0 nftrace set 1 masquerade (verdict accept)
trace id eae785df ip mangle PREROUTING packet: iif "enp6s0" ether saddr 00:01:5c:86:1a:47 ether daddr 00:e0:4c:68:12:d9 ip saddr ip daddr (redacted).117 ip dscp cs0 ip ecn not-ect ip ttl 56 ip id 39719 ip length 60 icmp type echo-reply icmp code 0 icmp id 1 icmp sequence 779
trace id eae785df ip mangle PREROUTING rule iifname "enp6s0" counter packets 0 bytes 0 nftrace set 1 (verdict continue)
trace id eae785df ip mangle PREROUTING rule ct state established,related mark set ct mark (verdict continue)
trace id eae785df ip mangle PREROUTING rule mark != 0x00000000 counter packets 0 bytes 0 nftrace set 1 (verdict continue)
trace id eae785df ip mangle PREROUTING verdict return mark 0x00000003
trace id eae785df ip mangle PREROUTING mark 0x00000003
trace id eae785df ip filter INPUT packet: iif "enp6s0" ether saddr 00:01:5c:86:1a:47 ether daddr 00:e0:4c:68:12:d9 ip saddr ip daddr (redacted).117 ip dscp cs0 ip ecn not-ect ip ttl 56 ip id 39719 ip length 60 icmp type echo-reply icmp code 0 icmp id 1 icmp sequence 779
trace id eae785df ip filter INPUT rule ip protocol icmp counter packets 0 bytes 0 nftrace set 1 (verdict continue)
trace id eae785df ip filter INPUT rule ct state { } accept (verdict accept)

Here is the relevant line from the output of conntrack -L:

icmp     1 15 src= dst= type=8 code=0 id=1 src= dst=(redacted).117 type=0 code=0 id=1 mark=3 use=1

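The outbound part has my client's local IP as the source and the external server I'm pinging as the destination, but the inbound part has the external IP of the machine doing the forwarding as its destination, rather than my client's local IP. (I'm not sure whether this indicates a problem.)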

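As you can see, the echo-request packet correctly gets its packet mark and conntrack mark set to 3. The correct output interface is then chosen according to the ip rules and routing table 3, the packet is masqueraded correctly, and it clearly makes it out to the internet, since I get an echo-reply back. The echo-reply packet correctly has the conntrack mark (still 3) copied to the packet mark... but as you can see, the NAT that was originally performed is not reversed, so the packet goes into the INPUT chain instead of being forwarded back to my client PC.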
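As a rough aid for reading traces like the one above, the chains each packet passed through can be condensed into one line per trace id. This is only a sketch: trace_path is a made-up name, and it assumes every trace line starts with "trace id <id> <family> <table> <chain>", as in the output shown here.

```shell
#!/bin/sh
# Hypothetical helper (not part of nftables): reads `nft monitor trace` output
# on stdin and prints, for each trace id, the table/chain sequence it went through.
trace_path() {
    awk '$1 == "trace" && $2 == "id" {
        id = $3
        hop = $5 "/" $6                 # table/chain, e.g. mangle/PREROUTING
        if (!(id in path)) {            # first line seen for this trace id
            order[++n] = id
            path[id] = hop
            last[id] = hop
        } else if (last[id] != hop) {   # packet moved on to a different chain
            path[id] = path[id] " -> " hop
            last[id] = hop
        }
    }
    END {
        for (i = 1; i <= n; i++) print order[i] ": " path[order[i]]
    }'
}

# Usage (as root): nft monitor trace | trace_path
# Output only appears once the input ends, e.g. after Ctrl-C or when reading a saved file.
```

Fed the trace above, this reports 8e85e085 as mangle/PREROUTING -> mangle/FORWARD -> filter/FORWARD -> nat/POSTROUTING and eae785df as mangle/PREROUTING -> filter/INPUT, which makes it easy to see that the reply never reaches a FORWARD chain.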

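I'm sure I'm missing something. I feel like there must be a rule that tells it to reverse the NAT operation, but every page I've looked at that explains how to NAT from LAN->WAN says that the only rule you need is a masquerade rule in postrouting for the initial outbound packet (many guides give additional rules, such as port forwarding for inbound connections, but those aren't relevant to simple outbound connections).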



The nftables wiki states:

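"[...] you have to register the prerouting/postrouting chains even if you have no rules there, since these chains will invoke the NAT engine for the packets coming in the reply direction." (from https://wiki.nftables.org/wiki-nftables/index.php/Performing_Network_Address_Translation_(NAT))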

You appear to have a prerouting chain of type filter, but not one of type nat. Try adding chain PREROUTING { type nat hook prerouting priority -150 ; } to the table ip nat { [...] } section of your /etc/nftable.conf file.
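A minimal sketch of that suggestion, assuming the nat table is named "ip nat" as in the question. The function name nat_prerouting_fragment is made up for illustration. The chain is deliberately empty, and the priority -150 is simply the value suggested above (-100 is the value conventionally used for destination NAT).

```shell
#!/bin/sh
# Hypothetical helper: prints an nftables fragment that declares an empty
# nat-type prerouting chain inside the existing "ip nat" table.
nat_prerouting_fragment() {
    cat <<'EOF'
table ip nat {
    chain PREROUTING {
        type nat hook prerouting priority -150; policy accept;
    }
}
EOF
}

# Usage (as root): check the syntax first, then load it.
#   nat_prerouting_fragment | nft -c -f -
#   nat_prerouting_fragment | nft -f -
```

Loading the fragment this way only adds the chain to the running ruleset; to make it permanent, the same chain block would still need to go into the configuration file.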


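I think the problem is that your nat postrouting chain has priority -100. According to the nftables wiki, DNAT in iptables runs at priority -100, but I think you want SNAT, which in iptables corresponds to priority (+)100. I hope this helps.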
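One way to check which nat hooks are actually registered, and at what priority, is to filter the live ruleset. This is a sketch: nat_hooks is a made-up name, and it assumes each chain declaration is printed on a single line in the form "type nat hook <hook> priority <value>;".

```shell
#!/bin/sh
# Hypothetical helper: reads a ruleset listing on stdin and prints
# "<hook> <priority>" for every chain declared with "type nat".
nat_hooks() {
    sed -n 's/^[[:space:]]*type nat hook \([a-z]*\) priority \([^;]*\);.*$/\1 \2/p'
}

# Usage (as root): nft list ruleset | nat_hooks
```

For source NAT you would expect a postrouting line with priority 100 (some nft versions may print a name such as srcnat instead of the number). If no prerouting line appears, no nat-type prerouting chain is registered.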
