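I have a VPC, 10.0.0.0/16, with an Internet gateway and two subnets, 10.0.100.0/24 and 10.0.200.0/24, in the same availability zone. A security group allows tcp/22 inbound from 0.0.0.0/0 and all outbound traffic. I also have two network interfaces attached to that security group, one in each subnet. Each network interface has its own Elastic IP address. Each subnet has a route table that points 0.0.0.0/0 at the Internet gateway.
Here is the problem I'm facing: I have an EC2 instance with both network interfaces attached, but I can only SSH in from the Internet through the interface attached as eth0.
Here is the configuration: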
$ uname -a
Linux ip-10-0-100-70 5.4.0-1045-aws #47-Ubuntu SMP Tue Apr 13 07:02:25 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc fq_codel state UP group default qlen 1000
link/ether 02:88:59:6e:78:c0 brd ff:ff:ff:ff:ff:ff
inet 10.0.100.70/24 brd 10.0.100.255 scope global dynamic eth0
valid_lft 1806sec preferred_lft 1806sec
inet6 fe80::88:59ff:fe6e:78c0/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc fq_codel state UP group default qlen 1000
link/ether 02:7d:b2:d4:b5:ea brd ff:ff:ff:ff:ff:ff
inet 10.0.200.118/24 brd 10.0.200.255 scope global dynamic eth1
valid_lft 1807sec preferred_lft 1807sec
inet6 fe80::7d:b2ff:fed4:b5ea/64 scope link
valid_lft forever preferred_lft forever
$ ip route
default via 10.0.100.1 dev eth0 proto dhcp src 10.0.100.70 metric 100
default via 10.0.200.1 dev eth1 proto dhcp src 10.0.200.118 metric 200
10.0.200.0/24 dev eth1 proto kernel scope link src 10.0.200.118
10.0.200.1 dev eth1 proto dhcp scope link src 10.0.200.118 metric 200
10.0.100.0/24 dev eth0 proto kernel scope link src 10.0.100.70
10.0.100.1 dev eth0 proto dhcp scope link src 10.0.100.70 metric 100
$ ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
$ sudo ufw status verbose
Status: inactive
$ ss -nlput
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
udp UNCONN 0 0 127.0.0.53%lo:53 0.0.0.0:*
udp UNCONN 0 0 10.0.200.118%eth1:68 0.0.0.0:*
udp UNCONN 0 0 10.0.100.70%eth0:68 0.0.0.0:*
tcp LISTEN 0 4096 127.0.0.53%lo:53 0.0.0.0:*
tcp LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
tcp LISTEN 0 128 [::]:22 [::]:*
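So the network interfaces appear to be configured correctly, and the SSH daemon is listening on all interfaces. The SSH daemon itself is working, since I am already connected over SSH via eth0. Both interfaces also seem able to send traffic to the Internet just fine: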
$ ping -I eth0 -c 5 1.1.1.1
PING 1.1.1.1 (1.1.1.1) from 10.0.100.70 eth0: 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=38 time=11.5 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=38 time=11.5 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=38 time=11.5 ms
64 bytes from 1.1.1.1: icmp_seq=4 ttl=38 time=11.6 ms
64 bytes from 1.1.1.1: icmp_seq=5 ttl=38 time=11.5 ms
--- 1.1.1.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4007ms
rtt min/avg/max/mdev = 11.470/11.515/11.567/0.031 ms
$ ping -I eth1 -c 5 1.1.1.1
PING 1.1.1.1 (1.1.1.1) from 10.0.200.118 eth1: 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=38 time=11.7 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=38 time=11.8 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=38 time=11.7 ms
64 bytes from 1.1.1.1: icmp_seq=4 ttl=38 time=11.8 ms
64 bytes from 1.1.1.1: icmp_seq=5 ttl=38 time=11.8 ms
--- 1.1.1.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4008ms
rtt min/avg/max/mdev = 11.705/11.755/11.829/0.042 ms
But I cannot connect through the EIP for eth1:
$ ssh -i ~/.ssh/cert.pem ubuntu@3.<redacted> # EIP for eth0
Welcome to Ubuntu 20.04.2 LTS (GNU/Linux 5.4.0-1045-aws x86_64)
...
$ ssh -i ~/.ssh/cert.pem ubuntu@52.<redacted> # EIP for eth1
ssh: connect to host 52.<redacted> port 22: Operation timed out
I set up Netflow on the eth1 network adapter in AWS, and I can see the traffic arriving:
2 840416055907 eni-07cc18b6f1b89378e <redacted> 10.0.200.118 54268 22 6 9 576 1623280724 1623280761 ACCEPT OK
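So it seems to me that the operating system is dropping the traffic. But no firewall is configured, and the SSH daemon is listening on all interfaces. I also tried tailing /var/log/{syslog,auth.log,kern.log} and dmesg, but nothing shows up in any of them when I try to connect.
I'm hoping I'm missing something simple, because I'm at a bit of a loss right now. Any help would be greatly appreciated!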
Answer 1
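Traffic sourced from eth1 needs its own routing configuration so that it is routed correctly. With only the main table, replies sent from 10.0.200.118 follow the lowest-metric default route and leave via eth0 rather than eth1. A source-based rule with its own default route fixes that: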
$ sudo ip rule add from 10.0.200.118 table default
$ sudo ip route add default via 10.0.200.1 dev eth1 table default
$ sudo ip route flush cache
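To illustrate why the rule is needed, here is a minimal sketch that reads `ip route` output and reports which interface the lowest-metric default route uses. The function name `preferred_default_dev` is hypothetical, and the sketch only models metric comparison between default routes, not the kernel's full route selection.

```shell
#!/bin/bash
# Hypothetical helper (not part of iproute2): read `ip route` output on stdin
# and print the device of the default route with the lowest metric.
preferred_default_dev() {
  awk '
    $1 == "default" {
      dev = ""; metric = 0               # a route with no metric counts as 0
      for (i = 2; i < NF; i++) {
        if ($i == "dev")    dev = $(i + 1)
        if ($i == "metric") metric = $(i + 1) + 0
      }
      if (best == "" || metric < best_metric) {
        best = dev; best_metric = metric
      }
    }
    END { if (best != "") print best }
  '
}
```

Running `ip route | preferred_default_dev` on the instance from the question should print eth0, since its default route has metric 100 against 200 for eth1. On a live system, `ip route get 1.1.1.1 from 10.0.200.118` asks the kernel directly which route it would pick for that source address.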
Since we don't want to SSH into the instance and type these commands by hand, I created a script that a systemd service runs at boot.
/home/ubuntu/dual-home.sh
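#!/bin/bash
# IPv4 address assigned to eth1
ADDR=$(ip -f inet addr show eth1 | sed -En -e 's/.*inet ([0-9.]+).*/\1/p')
# Gateway = first three octets of ADDR followed by .1 (assumes a /24 mask)
GATEWAY=$(echo "$ADDR" | sed -En -e 's/(([0-9]+\.){3}).*/\11/p')
# Look up traffic sourced from eth1's address in the "default" table
sudo ip rule add from "$ADDR" table default
# Give that table a default route out of eth1
sudo ip route add default via "$GATEWAY" dev eth1 table default
sudo ip route flush cache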
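The script above hardcodes the /24 assumption when it derives the gateway. If the subnet used a different prefix length, one option would be to compute the gateway from the address and prefix instead. This is a minimal sketch, relying on the fact that AWS reserves the subnet's network address plus one for the VPC router; the function name `subnet_gateway` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: given ADDRESS/PREFIX (e.g. 10.0.200.118/24), print the
# subnet's network address plus one, which AWS reserves for the VPC router.
subnet_gateway() {
  local addr prefix ip mask gw
  addr=${1%/*}      # part before the slash
  prefix=${1#*/}    # part after the slash
  local IFS=.
  set -- $addr      # split the address into four octets: $1 $2 $3 $4
  ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
  mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
  gw=$(( (ip & mask) + 1 ))
  printf '%d.%d.%d.%d\n' \
    "$(( (gw >> 24) & 255 ))" "$(( (gw >> 16) & 255 ))" \
    "$(( (gw >> 8) & 255 ))" "$(( gw & 255 ))"
}
```

In the script, the address with its prefix could be read with `ip -o -f inet addr show eth1 | awk '{print $4}'` and passed to the function, for example `GATEWAY=$(subnet_gateway "$CIDR")`, where `CIDR` is an assumed variable holding that value.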
/etc/systemd/system/dual-home.service
[Unit]
Description=Configure eth1 routing
After=network.target
After=cloud-final.service
[Service]
Type=simple
ExecStart=/bin/bash /home/ubuntu/dual-home.sh
[Install]
WantedBy=cloud-init.target
Then enable the service:
$ sudo systemctl enable dual-home
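Note that `systemctl enable` only arranges for the unit to run on later boots; it does not run it immediately. After a reboot, or after `sudo systemctl start dual-home`, a quick check like the following sketch can confirm that the source rule is in place. The function name `has_source_rule` is hypothetical, and it only inspects text in the format printed by `ip rule`.

```shell
#!/bin/bash
# Hypothetical helper: read `ip rule` output on stdin and succeed (exit 0) if
# it contains "from <address> lookup <table>" for the given arguments.
#   $1 = source address, $2 = routing table name
has_source_rule() {
  awk -v addr="$1" -v table="$2" '
    {
      for (i = 1; i + 3 <= NF; i++)
        if ($i == "from" && $(i + 1) == addr &&
            $(i + 2) == "lookup" && $(i + 3) == table)
          found = 1
    }
    END { exit !found }
  '
}
```

For the instance in the question, `ip rule | has_source_rule 10.0.200.118 default && echo ok` should print ok once the service has run, and `ip route show table default` should then list the default route via 10.0.200.1.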