I have an EC2 instance running in a k8s cluster. The instance has three network interfaces.
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
link/ether 02:9c:bb:70:ee:be brd ff:ff:ff:ff:ff:ff
inet 10.0.2.206/24 brd 10.0.2.255 scope global dynamic eth0
valid_lft 2821sec preferred_lft 2821sec
---
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 02:7a:d4:ac:64:b4 brd ff:ff:ff:ff:ff:ff
inet6 fe80::7a:d4ff:feac:64b4/64 scope link
valid_lft forever preferred_lft forever
9: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
link/ether 02:54:4e:b4:b4:e8 brd ff:ff:ff:ff:ff:ff
inet 10.0.6.4/24 brd 10.0.6.255 scope global dynamic eth2
valid_lft 2863sec preferred_lft 2863sec
inet6 fe80::54:4eff:feb4:b4e8/64 scope link
valid_lft forever preferred_lft forever
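Please ignore eth1, since it is not configured properly yet.
When I ping the internet from eth2 (ping -I 10.0.6.4 www.google.com), I noticed that Linux uses the primary interface's IP as the source address.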
[ec2-user@ip-10-0-2-206 ~]$ sudo tcpdump -i eth2 -n
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth2, link-type EN10MB (Ethernet), capture size 262144 bytes
14:19:08.968772 IP 10.0.2.206 > 142.251.42.36: ICMP echo request, id 17679, seq 9, length 64
14:19:09.992871 IP 10.0.2.206 > 142.251.42.36: ICMP echo request, id 17679, seq 10, length 64
14:19:11.016767 IP 10.0.2.206 > 142.251.42.36: ICMP echo request, id 17679, seq 11, length 64
As you can see, the source IP is incorrectly chosen as 10.0.2.206 instead of 10.0.6.4.
[ec2-user@ip-10-0-2-206 ~]$ ip rule
0: from all lookup local
511: from 10.0.6.4 lookup 10002
512: from all to 10.0.2.4 lookup main
512: from all to 10.0.2.185 lookup main
512: from all to 10.0.2.233 lookup main
1024: from all fwmark 0x80/0x80 lookup main
32766: from all lookup main
32767: from all lookup default
[ec2-user@ip-10-0-2-206 ~]$ sudo ip route show table 10002
default via 10.0.6.1 dev eth2 src 10.0.6.4
10.0.6.0/24 dev eth2 proto kernel scope link src 10.0.6.4
[ec2-user@ip-10-0-2-206 ~]$ sudo ip route show table main
default via 10.0.2.1 dev eth0
default via 10.0.6.1 dev eth2 metric 10002
10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.206
10.0.2.4 dev eni379b345d249 scope link
10.0.2.185 dev eni0231b3becdb scope link
10.0.2.233 dev eni5be9f7773c7 scope link
10.0.6.0/24 dev eth2 proto kernel scope link src 10.0.6.4
169.254.169.254 dev eth0
[ec2-user@ip-10-0-2-206 ~]$ ip route get 172.217.174.68 from 10.0.6.4
172.217.174.68 from 10.0.6.4 via 10.0.6.1 dev eth2 table 10002 uid 1000
cache
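All of the configuration seems correct, but the source address is still taken from the primary interface. What am I missing?
Answer 1
The only place I had not looked at carefully was iptables. There I found the following entries.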
Chain AWS-SNAT-CHAIN-0 (1 references)
target prot opt source destination
AWS-SNAT-CHAIN-1 all -- anywhere !ip-10-0-0-0.xxxx.compute.internal/16 /* AWS SNAT CHAIN */
Chain AWS-SNAT-CHAIN-1 (1 references)
target prot opt source destination
SNAT all -- anywhere anywhere /* AWS, SNAT */ ADDRTYPE match dst-type !LOCAL to:10.0.2.206 random-fully
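This rule accounts for the capture above. SNAT is applied in the nat table's POSTROUTING hook, after the routing decision, so the packet still leaves through eth2 via table 10002 but its source has already been rewritten to 10.0.2.206. Below is a minimal sketch for spotting such rules on a node, assuming the rules are dumped in rule-spec form with iptables -t nat -S. The helper name snat_sources is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `iptables -t nat -S` style output from stdin
# and print the address that follows every --to-source option.
snat_sources() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "--to-source") print $(i + 1)
    }'
}

# Typical use on the node (requires root):
#   sudo iptables -t nat -S | snat_sources
```

If this prints the primary interface address, traffic leaving the VPC from any secondary interface gets that address as its source.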
Searching further, I found the AWS documentation on this topic: https://docs.aws.amazon.com/eks/latest/userguide/external-snat.html
Following that documentation, I ran kubectl set env daemonset -n kube-system aws-node AWS_VPC_K8S_CNI_EXTERNALSNAT=true, and these SNAT entries were removed from the VM.
After that, pings from eth2 show the correct source IP, 10.0.6.4.
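To confirm the result without reading the capture by eye, something like the following could be used. It is only a sketch: check_src is a hypothetical helper, and it assumes tcpdump -n prints the source address as the third whitespace-separated field, as in the capture in the question.

```shell
#!/bin/sh
# Hypothetical helper: read `tcpdump -n` lines from stdin and print every
# ICMP echo request whose source address differs from the expected one ($1).
# Exit status is 0 only when no mismatching echo request was seen.
check_src() {
    awk -v want="$1" '
        /ICMP echo request/ {
            # Field 3 is the source address in lines such as:
            # 14:19:08.968772 IP 10.0.2.206 > 142.251.42.36: ICMP echo request, ...
            if ($3 != want) { print; bad = 1 }
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Typical use while the ping is running (requires root):
#   sudo tcpdump -i eth2 -n -l -c 5 icmp | check_src 10.0.6.4
```

No output and a zero exit status mean every captured echo request left eth2 with 10.0.6.4 as its source.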