L3-DSR real server iptables/conntrack configuration

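I'm working on an L3-DSR setup with a custom LB. The custom LB does DNAT based on DSCP settings, though the DSCP part is not relevant to this question. I created a lab with a few network namespaces: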

VIP: 10.10.10.1
Real-IP: 10.10.2.2 10.10.2.3

client netns ---> router netns ---> server netns
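
For anyone who wants to reproduce the topology, here is a rough sketch of the three namespaces. The client and server addresses are taken from the captures in this post; the router addresses (10.10.1.1, 10.10.2.1) and every interface name except the server's veth0 are assumptions, and the custom LB itself is not reproduced. The hypothetical `run` wrapper only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the client/router/server lab. Prints the commands by default;
# set DRY_RUN=0 and run as root to actually execute them.
# Assumed (not in the original post): router addresses 10.10.1.1 and
# 10.10.2.1, and all interface names other than the server's veth0.

run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

build_lab() {
    for ns in client router server; do
        run ip netns add "$ns"
    done

    # client <-> router and router <-> server veth pairs
    run ip link add veth0 netns client type veth peer name veth-c netns router
    run ip link add veth0 netns server type veth peer name veth-s netns router

    # client and server addresses come from the captures in the post
    run ip -n client addr add 10.10.1.2/24 dev veth0
    run ip -n router addr add 10.10.1.1/24 dev veth-c   # assumed
    run ip -n router addr add 10.10.2.1/24 dev veth-s   # assumed
    run ip -n server addr add 10.10.2.2/24 dev veth0
    run ip -n server addr add 10.10.2.3/32 dev veth0
    run ip -n server addr add 10.10.10.1/32 dev veth0   # VIP

    run ip -n client link set veth0 up
    run ip -n router link set veth-c up
    run ip -n router link set veth-s up
    run ip -n server link set veth0 up

    run ip -n client route add default via 10.10.1.1
    run ip -n server route add default via 10.10.2.1
    run ip netns exec router sysctl -w net.ipv4.ip_forward=1
}

build_lab
```

Running it as-is lists the commands for review; the DNAT step performed by the custom LB on the router still has to be added separately.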

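I want to set up the real server so that it replies with the VIP as the source address.

  • When I ping the VIP from client,
  • router performs DNAT and server receives the DNATed packet.
  • However, the SNAT rules on server are never processed, and the client's ping gets its replies from a different address. iptables shows that the rules are never matched.
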
# tcpdump -i veth0 -n (on server)
listening on veth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
12:08:48.353964 IP 10.10.1.2 > 10.10.2.3: ICMP echo request, id 48129, seq 1870, length 64
12:08:48.354002 IP 10.10.2.3 > 10.10.1.2: ICMP echo reply, id 48129, seq 1870, length 64

# ping 10.10.10.1 (from client)
64 bytes from 10.10.2.3: icmp_seq=1818 ttl=63 time=12.3 ms (DIFFERENT ADDRESS!)
64 bytes from 10.10.2.3: icmp_seq=1819 ttl=63 time=2.87 ms (DIFFERENT ADDRESS!)

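Using the iptables TRACE target, I can't see any packet entering the nat table. Does conntrack handle the return direction and keep the packet from entering the nat POSTROUTING chain?

  • As far as I know, no table other than nat can perform NAT.
  • So, for this DSR case, how can I SNAT only the reply packets?

Any comments would be appreciated.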

# ip netns exec server iptables -t raw -nvL
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
 2346  197K TRACE      all  --  *      *       0.0.0.0/0            0.0.0.0/0

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
 1010 84840 TRACE      all  --  *      *       0.0.0.0/0            0.0.0.0/0

# ip netns exec server iptables -t nat -nvL
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 SNAT       all  --  *      *       10.10.2.2            0.0.0.0/0            to:10.10.10.1
    0     0 SNAT       all  --  *      *       10.10.2.3            0.0.0.0/0            to:10.10.10.1
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
15: veth0@if14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 56:7e:d2:fd:07:b6 brd ff:ff:ff:ff:ff:ff link-netns router
    inet 10.10.2.2/24 scope global veth0
       valid_lft forever preferred_lft forever
    inet 10.10.2.3/32 scope global veth0
       valid_lft forever preferred_lft forever
    inet 10.10.10.1/32 scope global veth0
       valid_lft forever preferred_lft forever
    inet6 fe80::547e:d2ff:fefd:7b6/64 scope link
       valid_lft forever preferred_lft forever
# ip netns exec server xtables-monitor --trace
PACKET: 2 549621b3 IN=veth0 MACSRC=2:fe:78:69:f4:cd MACDST=56:7e:d2:fd:7:b6 MACPROTO=0800 SRC=10.10.1.2 DST=10.10.2.3 LEN=84 TOS=0x4 TTL=63 ID=22449DF
 TRACE: 2 549621b3 raw:PREROUTING:rule:0x5:CONTINUE  -4 -t raw -A PREROUTING -j TRACE
 TRACE: 2 549621b3 raw:PREROUTING:return:
 TRACE: 2 549621b3 raw:PREROUTING:policy:ACCEPT
 TRACE: 2 549621b3 filter:INPUT:return:
 TRACE: 2 549621b3 filter:INPUT:policy:ACCEPT
PACKET: 2 dc162502 OUT=veth0 SRC=10.10.2.3 DST=10.10.1.2 LEN=84 TOS=0x4 TTL=64 ID=2152
 TRACE: 2 dc162502 raw:OUTPUT:rule:0x7:CONTINUE  -4 -t raw -A OUTPUT -j TRACE
 TRACE: 2 dc162502 raw:OUTPUT:return:
 TRACE: 2 dc162502 raw:OUTPUT:policy:ACCEPT
 TRACE: 2 dc162502 filter:OUTPUT:return:
 TRACE: 2 dc162502 filter:OUTPUT:policy:ACCEPT

Answer 1

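Because the nat table is only consulted for packets in the conntrack NEW state, the DSR use case, where the outgoing replies have to be SNATed, cannot be satisfied by the iptables framework alone. The echo reply belongs to the connection that the incoming request already created, so it never reaches the nat POSTROUTING chain.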
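
The trace in the question is consistent with this. As a rough sketch, a saved `xtables-monitor --trace` log can be summarized per packet to see which tables were visited; `tables_per_packet` is a hypothetical helper name, and it assumes the line layout shown in the question (`TRACE: <family> <id> <table>:<chain>:...`).

```shell
#!/bin/sh
# Summarize saved `xtables-monitor --trace` output read from stdin:
# print, for each traced packet id, the tables it traversed in order.
# Assumes the line layout shown in the question.
tables_per_packet() {
    awk '
        $1 == "TRACE:" {
            id = $3                    # packet id, e.g. 549621b3
            split($4, hook, ":")       # "raw:PREROUTING:..." -> hook[1] = "raw"
            key = id SUBSEP hook[1]
            if (key in seen) next      # table already recorded for this packet
            seen[key] = 1
            if (id in tables) {
                tables[id] = tables[id] " " hook[1]
            } else {
                order[++n] = id
                tables[id] = hook[1]
            }
        }
        END {
            for (i = 1; i <= n; i++) print order[i] ": " tables[order[i]]
        }
    '
}
```

For example, `tables_per_packet < trace.log` (with `trace.log` being a saved copy of the trace) prints one line per packet; a packet that only lists raw and filter never entered nat.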

In this case we need to use tc to do stateless NAT. Consult man tc-nat.

tc qdisc add dev veth0 root handle 10: htb
tc filter add dev veth0 parent 10: protocol ip prio 10 u32 match ip src 10.10.2.2 action nat egress 10.10.2.2 10.10.10.1
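
Note that the filter above only rewrites replies sourced from 10.10.2.2, while the capture in the question shows replies leaving from 10.10.2.3. A minimal sketch, assuming one u32 filter per real IP is wanted: the hypothetical helper `gen_dsr_filters` only prints the tc commands, following the same pattern as the filter above, so they can be reviewed before being run inside the server namespace.

```shell
#!/bin/sh
# Print one stateless-NAT filter per real IP so that every reply source
# is rewritten to the VIP. Nothing is executed; the commands are only printed.
# Usage: gen_dsr_filters DEV VIP REAL_IP...
gen_dsr_filters() {
    dev=$1
    vip=$2
    shift 2
    for rip in "$@"; do
        echo "tc filter add dev $dev parent 10: protocol ip prio 10 u32" \
             "match ip src $rip action nat egress $rip $vip"
    done
}

gen_dsr_filters veth0 10.10.10.1 10.10.2.2 10.10.2.3
```

The qdisc with handle 10: from the first command still has to exist before these filters are added.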

Or, marking the DSR connections by DSCP and matching on the firewall mark:

iptables -t mangle -A PREROUTING -m dscp --dscp 1 -j CONNMARK --set-mark 1
iptables -t mangle -A POSTROUTING -j CONNMARK --restore-mark
tc qdisc add dev veth0 root handle 10: htb
tc filter add dev veth0 parent 10: protocol ip prio 10 handle 1 fw flowid 1:1 action nat egress 10.10.2.2 10.10.10.1
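
The `--dscp 1` match lines up with the `TOS=0x4` seen in the question's trace, because DSCP is the upper six bits of the TOS byte. A small sketch for converting between the two when adapting the rule to another marking; `tos_to_dscp` is a hypothetical helper name.

```shell
#!/bin/sh
# Convert a TOS byte (decimal or 0x-prefixed hex) to its DSCP value.
# DSCP occupies the upper six bits of the TOS byte, so shift right by two.
tos_to_dscp() {
    echo $(( $1 >> 2 ))
}

tos_to_dscp 0x4
```

With the value from the trace, the helper yields the number to pass to `--dscp`.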
