Unable to filter packets with iptables when using ipvlan l3

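My overall intent is to run hypervisor instances in separate network namespaces. I have something that looked promising, but when I dug in to get things working with iptables, I found some odd behavior that prevents me from doing normal stateful firewalling.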

The host environment is set up roughly like this:

# Set up packet forwarding on physical interface.
sysctl -w net.ipv6.conf.enp0s31f6.forwarding=1

# Set up ipvlan in namespace
ip netns add vm0
ip link add ipvlan0 link enp0s31f6 type ipvlan mode l3
ip link set dev ipvlan0 netns vm0
ip netns exec vm0 ip link set ipvlan0 up
ip netns exec vm0 ip link set lo up
# note: physical interface has addr 2a01:4f9:2b:35a::2/64
ip netns exec vm0 ip addr add dev ipvlan0 2a01:4f9:2b:35a::3/128

# Set up tap
ip netns exec vm0 ip tuntap add dev tap0 mode tap user fdr
# set MAC just for convenience to keep link local addresses the same between attempts
ip netns exec vm0 ip link set address ca:9d:46:13:9b:64 dev tap0
ip netns exec vm0 ip link set dev tap0 up

Start a virtual machine in vm0 attached to the tap. Inside the guest, run these commands to set up its address and routing:

export HOST_TAP_LLADDR=fe80::c89d:46ff:fe13:9b64
sudo ip addr add 2a01:4f9:2b:35a::3/128 dev ens4
sudo ip link set ens4 up
sudo ip route add default via $HOST_TAP_LLADDR dev ens4
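The value of HOST_TAP_LLADDR above is consistent with the MAC pinned on tap0 (ca:9d:46:13:9b:64), assuming the kernel derives the link-local address from the MAC using EUI-64 rather than a stable-privacy scheme. A minimal sketch of that derivation, where mac_to_lladdr is a hypothetical helper name and not an existing tool:

```shell
# Derive an EUI-64 style IPv6 link-local address from a MAC address.
# Assumption: the interface uses EUI-64 address generation.
mac_to_lladdr() {
  # Split the MAC on ':' into the positional parameters $1..$6.
  old_ifs=$IFS; IFS=:
  set -- $1
  IFS=$old_ifs
  # Flip the universal/local bit (0x02) of the first octet.
  first=$(( 0x$1 ^ 0x02 ))
  # Insert ff:fe between the third and fourth octets and print the
  # result as four 16-bit groups after the fe80:: prefix.
  printf 'fe80::%x:%x:%x:%x\n' \
    $(( (first << 8) | 0x$2 )) $(( (0x$3 << 8) | 0xff )) \
    $(( 0xfe00 | 0x$4 )) $(( (0x$5 << 8) | 0x$6 ))
}

mac_to_lladdr ca:9d:46:13:9b:64   # prints fe80::c89d:46ff:fe13:9b64
```

If the address shown by ip addr on tap0 differs from this, the interface is probably using another address generation mode and the value has to be read from the interface instead.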

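OK. Now, if you start pinging from the VM and attach tcpdump -lenvi tap0, you can watch the ICMP on tap0, but it goes no further. That is expected, because inside the namespace we have done nothing yet to get traffic from tap0 to ipvlan0: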

# tcpdump -lenvi tap0
tcpdump: listening on tap0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
20:32:02.332080 IP6 (flowlabel 0x60b7e, hlim 64, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 2, seq 527
20:32:03.356084 IP6 (flowlabel 0x60b7e, hlim 64, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 2, seq 528
20:32:04.380079 IP6 (flowlabel 0x60b7e, hlim 64, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 2, seq 529
20:32:05.404096 IP6 (flowlabel 0x60b7e, hlim 64, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 2, seq 530
^C
#  tcpdump -lenvi ipvlan0
tcpdump: listening on ipvlan0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
<nothing>
^C

Well, let's do something about that. Taking packets from tap0 and putting them on ipvlan0 is the easier direction; you can use routing:

# Note: I can't seem to get routing to work by enabling forwarding on
# individual interfaces in the namespace, which is weird.
sysctl -w net.ipv6.conf.all.forwarding=1
ip route add default via fe80::1 dev ipvlan0 proto static

We can confirm this with tcpdump: tap0 only shows the ICMP requests being sent, while ipvlan0 both sends those packets and receives the ICMP replies:

root@Ubuntu-2204-jammy-amd64-base ~ #  tcpdump -lenvi ipvlan0
tcpdump: listening on ipvlan0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
20:42:08.542989 4c:52:62:0e:05:d3 > 4c:52:62:0e:05:d3, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 63, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 2, seq 1119
20:42:08.549416 d0:07:ca:8d:19:31 > 4c:52:62:0e:05:d3, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 119, next-header ICMPv6 (58) payload length: 64) 2a00:1450:400f:80c::200e > 2a01:4f9:2b:35a::3: [icmp6 sum ok] ICMP6, echo reply, id 2, seq 1119
20:42:09.567009 4c:52:62:0e:05:d3 > 4c:52:62:0e:05:d3, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 63, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 2, seq 1120
20:42:09.573468 d0:07:ca:8d:19:31 > 4c:52:62:0e:05:d3, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 119, next-header ICMPv6 (58) payload length: 64) 2a00:1450:400f:80c::200e > 2a01:4f9:2b:35a::3: [icmp6 sum ok] ICMP6, echo reply, id 2, seq 1120
20:42:10.590995 4c:52:62:0e:05:d3 > 4c:52:62:0e:05:d3, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 63, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 2, seq 1121
20:42:10.597439 d0:07:ca:8d:19:31 > 4c:52:62:0e:05:d3, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 119, next-header ICMPv6 (58) payload length: 64) 2a00:1450:400f:80c::200e > 2a01:4f9:2b:35a::3: [icmp6 sum ok] ICMP6, echo reply, id 2, seq 1121
^C

root@Ubuntu-2204-jammy-amd64-base ~ # tcpdump -lenvi tap0
tcpdump: listening on tap0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
20:42:14.686993 2e:db:57:5c:90:28 > ca:9d:46:13:9b:64, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 64, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 2, seq 1125
20:42:15.710995 2e:db:57:5c:90:28 > ca:9d:46:13:9b:64, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 64, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 2, seq 1126
20:42:16.735027 2e:db:57:5c:90:28 > ca:9d:46:13:9b:64, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 64, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 2, seq 1127
^C

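But now we're stuck: how do we get packets from ipvlan0 to tap0? I had success using tc to redirect the packets with mirred and rewrite their destination MAC address: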

# Move frames between tap and ipvlan, changing MAC addresses so the
# receiving side knows the frame is for them, and filtering on IP
# addresses in event of oddly addressed traffic coming in or going
# out.
tc qdisc add dev ipvlan0 handle ffff: ingress
export VM_MAC=2e:db:57:5c:90:28
tc filter add dev ipvlan0 parent ffff: protocol ipv6 \
  u32 match ip6 dst 2a01:4f9:2b:35a::3 \
  action skbmod set dmac $VM_MAC \
  action mirred egress redirect dev tap0

Finally our work is rewarded: ping gets replies, and we can see them in tcpdump on tap0:

tcpdump -lenvi tap0
tcpdump: listening on tap0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
20:46:09.471859 2e:db:57:5c:90:28 > ca:9d:46:13:9b:64, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 64, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 3, seq 5
20:46:09.478344 d0:07:ca:8d:19:31 > 2e:db:57:5c:90:28, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 119, next-header ICMPv6 (58) payload length: 64) 2a00:1450:400f:80c::200e > 2a01:4f9:2b:35a::3: [icmp6 sum ok] ICMP6, echo reply, id 3, seq 5
20:46:10.473455 2e:db:57:5c:90:28 > ca:9d:46:13:9b:64, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 64, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 3, seq 6
20:46:10.479936 d0:07:ca:8d:19:31 > 2e:db:57:5c:90:28, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 119, next-header ICMPv6 (58) payload length: 64) 2a00:1450:400f:80c::200e > 2a01:4f9:2b:35a::3: [icmp6 sum ok] ICMP6, echo reply, id 3, seq 6

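Now that tc is in the picture, we can simplify the whole thing by getting rid of the routing and doing the same thing in the other direction (from tap0 to ipvlan0):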

# Undo routing stuff
sysctl -w net.ipv6.conf.all.forwarding=0
ip route del default via fe80::1 dev ipvlan0

# Use tc much the same, but in reverse:
tc qdisc add dev tap0 handle ffff: ingress
# ipvlan shares host MAC with the physical interface
tc filter add dev tap0 parent ffff: protocol ipv6 \
  u32 match ip6 src 2a01:4f9:2b:35a::3 \
  action skbmod set dmac 4c:52:62:0e:05:d3 \
  action mirred egress redirect dev ipvlan0
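The two tc setups differ only in the device pair, the match direction (src or dst), and the MAC written into the frame. The sketch below shows that symmetry with a hypothetical helper, print_redirect, which only prints the commands and runs nothing; the tc syntax is copied from the commands above.

```shell
# Print (not run) the tc commands that redirect IPv6 frames from one
# device to another while rewriting the destination MAC.
# Arguments: from_dev, match direction (src|dst), address, dmac, to_dev.
print_redirect() {
  from_dev=$1 direction=$2 addr=$3 dmac=$4 to_dev=$5
  printf 'tc qdisc add dev %s handle ffff: ingress\n' "$from_dev"
  printf 'tc filter add dev %s parent ffff: protocol ipv6 u32 match ip6 %s %s action skbmod set dmac %s action mirred egress redirect dev %s\n' \
    "$from_dev" "$direction" "$addr" "$dmac" "$to_dev"
}

# Inbound: frames arriving on ipvlan0 for the VM address go to tap0.
print_redirect ipvlan0 dst 2a01:4f9:2b:35a::3 2e:db:57:5c:90:28 tap0
# Outbound: frames the VM sends on tap0 go to ipvlan0.
print_redirect tap0 src 2a01:4f9:2b:35a::3 4c:52:62:0e:05:d3 ipvlan0
```

Printing first makes it easy to review the four commands before applying them, for example by piping the output to a shell running inside vm0.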

Packets flow again:

# tcpdump -lenvi tap0
tcpdump: listening on tap0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
20:52:25.641314 2e:db:57:5c:90:28 > ca:9d:46:13:9b:64, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 64, next-header ICMPv6 (58) payload length: 64) 2a01:4f9:2b:35a::3 > 2a00:1450:400f:80c::200e: [icmp6 sum ok] ICMP6, echo request, id 3, seq 377
20:52:25.647845 d0:07:ca:8d:19:31 > 2e:db:57:5c:90:28, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x60b7e, hlim 119, next-header ICMPv6 (58) payload length: 64) 2a00:1450:400f:80c::200e > 2a01:4f9:2b:35a::3: [icmp6 sum ok] ICMP6, echo reply, id 3, seq 377
^C

At this point I thought everything was working, only to find that iptables behaves very strangely. For example, enable logging:

ip6tables -I INPUT 1 -j LOG
ip6tables -I FORWARD 1 -j LOG
ip6tables -I OUTPUT 1 -j LOG

If I ping directly from the host namespace, I get an OUT record followed by an IN record, as expected:

Feb 09 20:54:37 Ubuntu-2204-jammy-amd64-base kernel: IN= OUT=enp0s31f6 SRC=2a01:04f9:002b:035a:0000:0000:0000:0002 DST=2a00:1450:400f:080c:0000:0000:0000:200e LEN=104 TC=0 HOPLIMIT=64 FLOWLBL=471262 PROTO=ICMPv6 TYPE=128 CODE=0 ID=3 SEQ=1 
Feb 09 20:54:37 Ubuntu-2204-jammy-amd64-base kernel: IN=enp0s31f6 OUT= MAC=4c:52:62:0e:05:d3:d0:07:ca:8d:19:31:86:dd SRC=2a00:1450:400f:080c:0000:0000:0000:200e DST=2a01:04f9:002b:035a:0000:0000:0000:0002 LEN=104 TC=0 HOPLIMIT=60 FLOWLBL=471262 PROTO=ICMPv6 TYPE=129 CODE=0 ID=3 SEQ=1 

However, for VM traffic I only get OUT records, even though we know from tcpdump that the replies are being received and the ping succeeds:

Feb 09 20:58:53 Ubuntu-2204-jammy-amd64-base kernel: IN= OUT=enp0s31f6 SRC=2a01:04f9:002b:035a:0000:0000:0000:0003 DST=2a00:1450:400f:080c:0000:0000:0000:200e LEN=104 TC=0 HOPLIMIT=64 FLOWLBL=396158 PROTO=ICMPv6 TYPE=128 CODE=0 ID=3 SEQ=764 
Feb 09 20:58:54 Ubuntu-2204-jammy-amd64-base kernel: IN= OUT=enp0s31f6 SRC=2a01:04f9:002b:035a:0000:0000:0000:0003 DST=2a00:1450:400f:080c:0000:0000:0000:200e LEN=104 TC=0 HOPLIMIT=64 FLOWLBL=396158 PROTO=ICMPv6 TYPE=128 CODE=0 ID=3 SEQ=765 
Feb 09 20:58:55 Ubuntu-2204-jammy-amd64-base kernel: IN= OUT=enp0s31f6 SRC=2a01:04f9:002b:035a:0000:0000:0000:0003 DST=2a00:1450:400f:080c:0000:0000:0000:200e LEN=104 TC=0 HOPLIMIT=64 FLOWLBL=396158 PROTO=ICMPv6 TYPE=128 CODE=0 ID=3 SEQ=766 
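To make the asymmetry easier to see than by reading raw log lines, the records can be tallied per direction for a given address. This is a rough sketch: count_directions is a hypothetical helper, and it assumes the LOG format shown above, where an empty IN= marks an outgoing record and an empty OUT= marks an incoming one. The address has to be given in the fully expanded form the kernel prints.

```shell
# Count netfilter LOG records per direction for one address.
# Reads kernel log lines on stdin, e.g.:
#   journalctl -k | count_directions 2a01:04f9:002b:035a:0000:0000:0000:0003
count_directions() {
  awk -v addr="$1" '
    # Only consider records whose SRC or DST is the requested address.
    index($0, "SRC=" addr " ") || index($0, "DST=" addr " ") {
      if ($0 ~ / IN= /)       n_out++   # empty IN=: locally generated, outgoing
      else if ($0 ~ / OUT= /) n_in++    # empty OUT=: delivered locally, incoming
      else                    n_fwd++   # both set: forwarded
    }
    END { printf "in=%d out=%d forward=%d\n", n_in, n_out, n_fwd }
  '
}
```

For the host address ending in ::2 this should report both in and out records, while for the VM address ending in ::3 the in count stays at zero, matching what the logs above show.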

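Furthermore, I tried to get the vm0 namespace to log anything via iptables, with no luck: only the host generates these log entries, and the IN entries are missing. This means I seemingly have no ability to do stateful firewalling.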
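One thing that may be worth ruling out for the empty log inside vm0, separately from the ipvlan behavior: on kernels that provide the net.netfilter.nf_log_all_netns sysctl, the LOG target writes to the kernel log only from the initial network namespace unless that sysctl is set to 1. Whether that played a role here is not established by anything above; the sketch below merely reports the current setting. netns_logging_state is a hypothetical helper, and its optional path argument exists only so it can be pointed at another file.

```shell
# Report whether LOG output from non-initial network namespaces is enabled.
# Prints one of: enabled, disabled, unknown.
netns_logging_state() {
  f=${1:-/proc/sys/net/netfilter/nf_log_all_netns}
  if [ ! -r "$f" ]; then
    echo unknown     # sysctl file not present or not readable on this system
  elif [ "$(cat "$f")" = 1 ]; then
    echo enabled
  else
    echo disabled
  fi
}

netns_logging_state
```

If this prints disabled, setting the sysctl to 1 on the host and repeating the logging test inside vm0 would show whether the namespace rules are being hit at all.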

So, what's going on here? Am I barking up the wrong tree?
