Configuration/topology:
There are 3 machines:
hadoop2                        | hadoop                   | driver
eth0 10.10.15.3                | eth0 10.10.15.2          | tap0 192.168.0.199
route default to 10.10.15.1    | tap0 192.168.0.195       | route 10.10.15.0/24 to 192.168.0.195
route 192.168.0.0/24 to hadoop | route default 10.10.15.1 |
no iptables rules              | route 192.168.0.0 tap0   |
                               | no iptables rules        |
                               | ip_forward = 1           |
Routes on hadoop2:
root@hadoop2:~# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 10.10.15.1 0.0.0.0 UG 0 0 0 eth0
10.10.15.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
192.168.0.0 10.10.15.2 255.255.255.0 UG 0 0 0 eth0
Routes on hadoop:
root@hadoop:~# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 10.10.15.1 0.0.0.0 UG 0 0 0 eth0
10.10.15.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 tap0
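As a cross-check of the two routing tables above, here is a minimal sketch of longest-prefix-match route selection. The `ROUTES` dictionary and the `lookup` helper are illustrative names, not part of any tool used in this thread, and the sketch ignores metrics and policy routing, which the kernel also takes into account.

```python
import ipaddress

# (destination, gateway or None if directly connected, interface),
# transcribed from the two netstat -rn listings above.
ROUTES = {
    "hadoop2": [
        ("0.0.0.0/0", "10.10.15.1", "eth0"),
        ("10.10.15.0/24", None, "eth0"),
        ("192.168.0.0/24", "10.10.15.2", "eth0"),
    ],
    "hadoop": [
        ("0.0.0.0/0", "10.10.15.1", "eth0"),
        ("10.10.15.0/24", None, "eth0"),
        ("192.168.0.0/24", None, "tap0"),
    ],
}

def lookup(host, dst):
    """Return (gateway, interface) of the most specific route to dst."""
    addr = ipaddress.ip_address(dst)
    candidates = [
        (ipaddress.ip_network(net), gw, iface)
        for net, gw, iface in ROUTES[host]
        if addr in ipaddress.ip_network(net)
    ]
    if not candidates:
        raise LookupError(f"{host}: no route to {dst}")
    # The longest prefix (most specific network) wins.
    _, gw, iface = max(candidates, key=lambda c: c[0].prefixlen)
    return gw, iface

print(lookup("hadoop2", "192.168.0.199"))  # ('10.10.15.2', 'eth0')
print(lookup("hadoop", "192.168.0.199"))   # (None, 'tap0')
print(lookup("hadoop", "10.10.15.3"))      # (None, 'eth0')
```

Under these assumptions, hadoop2 hands packets for 192.168.0.199 to hadoop (10.10.15.2), and hadoop delivers them directly on tap0, so neither routing table sends this traffic through 10.10.15.1.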
Problem
Ping from 192.168.0.199 to 10.10.15.3 works fine:
PING 10.10.15.3 (10.10.15.3) 56(84) bytes of data.
64 bytes from 10.10.15.3: icmp_req=1 ttl=63 time=55.9 ms
64 bytes from 10.10.15.3: icmp_req=2 ttl=63 time=55.5 ms
64 bytes from 10.10.15.3: icmp_req=3 ttl=63 time=57.8 ms
Tcpdump on the router (hadoop):
root@hadoop:~# tcpdump -n icmp -i eth0
08:53:11.899079 IP 192.168.0.199 > 10.10.15.3: ICMP echo request, id 20880, seq 1, length 64
08:53:11.899789 IP 10.10.15.3 > 192.168.0.199: ICMP echo reply, id 20880, seq 1, length 64
08:53:12.900885 IP 192.168.0.199 > 10.10.15.3: ICMP echo request, id 20880, seq 2, length 64
08:53:12.901497 IP 10.10.15.3 > 192.168.0.199: ICMP echo reply, id 20880, seq 2, length 64
08:53:13.903734 IP 192.168.0.199 > 10.10.15.3: ICMP echo request, id 20880, seq 3, length 64
08:53:13.904351 IP 10.10.15.3 > 192.168.0.199: ICMP echo reply, id 20880, seq 3, length 64
But in the other direction (from 10.10.15.3 to 192.168.0.199, or even to the router's address) it does not work, because the source address gets changed. Tcpdump on hadoop2:
root@hadoop2:~# tcpdump icmp -ne -i eth0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
10:19:48.778020 52:54:00:f2:e6:4b > 52:54:00:82:d4:ac, ethertype IPv4 (0x0800), length 98: 10.10.15.3 > 192.168.0.199: ICMP echo request, id 3409, seq 1, length 64
10:19:49.786993 52:54:00:f2:e6:4b > 52:54:00:82:d4:ac, ethertype IPv4 (0x0800), length 98: 10.10.15.3 > 192.168.0.199: ICMP echo request, id 3409, seq 2, length 64
10:19:50.794744 52:54:00:f2:e6:4b > 52:54:00:82:d4:ac, ethertype IPv4 (0x0800), length 98: 10.10.15.3 > 192.168.0.199: ICMP echo request, id 3409, seq 3, length 64
Looks fine, doesn't it? But on the router (hadoop):
root@hadoop:~# tcpdump -n icmp -i eth0
08:55:37.688153 IP 10.10.15.1 > 192.168.0.199: ICMP echo request, id 3382, seq 81, length 64
08:55:37.742960 IP 192.168.0.199 > 10.10.15.1: ICMP echo reply, id 3382, seq 81, length 64
08:55:38.696155 IP 10.10.15.1 > 192.168.0.199: ICMP echo request, id 3382, seq 82, length 64
08:55:38.751218 IP 192.168.0.199 > 10.10.15.1: ICMP echo reply, id 3382, seq 82, length 64
Edit: an additional capture (with link-layer headers) to prove that the packets come from 10.10.15.3 and are not sent by 10.10.15.1:
root@hadoop:~# tcpdump -i eth0 -ne icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
09:55:43.912159 52:54:00:f2:e6:4b > 52:54:00:82:d4:ac, ethertype IPv4 (0x0800), length 98: 10.10.15.1 > 192.168.0.199: ICMP echo request, id 3397, seq 1, length 64
09:55:44.033807 52:54:00:82:d4:ac > 52:54:00:80:a5:aa, ethertype IPv4 (0x0800), length 98: 192.168.0.199 > 10.10.15.1: ICMP echo reply, id 3397, seq 1, length 64
09:55:44.920389 52:54:00:f2:e6:4b > 52:54:00:82:d4:ac, ethertype IPv4 (0x0800), length 98: 10.10.15.1 > 192.168.0.199: ICMP echo request, id 3397, seq 2, length 64
09:55:44.975593 52:54:00:82:d4:ac > 52:54:00:80:a5:aa, ethertype IPv4 (0x0800), length 98: 192.168.0.199 > 10.10.15.1: ICMP echo reply, id 3397, seq 2, length 64
And ifconfig:
root@hadoop2:~# ifconfig
eth0 Link encap:Ethernet HWaddr 52:54:00:f2:e6:4b
inet addr:10.10.15.3 Bcast:10.10.15.255 Mask:255.255.255.0
inet6 addr: fe80::5054:ff:fef2:e64b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:16778 errors:0 dropped:0 overruns:0 frame:0
TX packets:7877 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:14829038 (14.1 MiB) TX bytes:835235 (815.6 KiB)
Interrupt:11 Base address:0x4000
arp -a -n:
root@hadoop:~# arp -a -n
? (10.10.15.3) at 52:54:00:f2:e6:4b [ether] on eth0
? (192.168.0.199) at 32:a6:ed:93:e6:46 [ether] on tap0
? (10.10.15.1) at 52:54:00:80:a5:aa [ether] on eth0
root@hadoop2:~# arp -a -n
? (10.10.15.2) at 52:54:00:82:d4:ac [ether] on eth0
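The captures and ARP tables above can be cross-checked mechanically. The sketch below is only an illustration: the `ARP` mapping is copied from the `arp -a -n` and `ifconfig` output, and `source_rewritten` is a hypothetical helper name. The check is only meaningful for frames seen on the same layer-2 segment as the sender, because a routed hop would replace the source MAC with the router's.

```python
# MAC address -> IP address that owns it, from the arp/ifconfig output above.
ARP = {
    "52:54:00:f2:e6:4b": "10.10.15.3",  # hadoop2 eth0
    "52:54:00:82:d4:ac": "10.10.15.2",  # hadoop eth0
    "52:54:00:80:a5:aa": "10.10.15.1",  # default gateway
}

def source_rewritten(src_mac, src_ip):
    """True if the frame's source MAC belongs to a different IP than the packet claims."""
    owner = ARP.get(src_mac)
    return owner is not None and owner != src_ip

# Frame as captured on hadoop2: MAC and IP agree.
print(source_rewritten("52:54:00:f2:e6:4b", "10.10.15.3"))  # False
# Frame as captured on hadoop: hadoop2's MAC, but 10.10.15.1 as source IP.
print(source_rewritten("52:54:00:f2:e6:4b", "10.10.15.1"))  # True
```

The second case is the interesting one: the frame still carries hadoop2's MAC address, so the source IP appears to have been rewritten somewhere between the two capture points rather than the packet having been originated by 10.10.15.1.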
So the address is changed. ip route get 192.168.0.199 (on hadoop2):
192.168.0.199 via 10.10.15.2 dev eth0 src 10.10.15.3
cache
So routing is definitely fine. Let's look at iptables. Maybe there is some masquerading?
On hadoop2:
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
No. What about hadoop?
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
What else could cause the address change, or what else could be the problem? Where could NAT be happening?
Virtualization host configuration:
ifconfig:
root@s5 ~ # ifconfig -a
eth0 Link encap:Ethernet HWaddr 6c:62:6d:a0:77:54
inet addr:46.4.56.15 Bcast:46.4.56.63 Mask:255.255.255.192
inet6 addr: 2a01:4f8:140:140e::2/64 Scope:Global
inet6 addr: fe80::6e62:6dff:fea0:7754/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:14629035 errors:0 dropped:0 overruns:0 frame:0
TX packets:13602067 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3739676186 (3.7 GB) TX bytes:1918243832 (1.9 GB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:385337 errors:0 dropped:0 overruns:0 frame:0
TX packets:385337 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:40556871 (40.5 MB) TX bytes:40556871 (40.5 MB)
tap0 Link encap:Ethernet HWaddr b2:bd:05:99:4e:02
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
virbr1 Link encap:Ethernet HWaddr 52:54:00:80:a5:aa
inet addr:10.10.15.1 Bcast:10.10.15.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:8297624 errors:0 dropped:0 overruns:0 frame:0
TX packets:8633037 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:494260090 (494.2 MB) TX bytes:2661285270 (2.6 GB)
virbr1-nic Link encap:Ethernet HWaddr 52:54:00:80:a5:aa
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
vnet0 Link encap:Ethernet HWaddr fe:54:00:82:d4:ac
inet6 addr: fe80::fc54:ff:fe82:d4ac/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6365724 errors:0 dropped:0 overruns:0 frame:0
TX packets:7812413 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:450656918 (450.6 MB) TX bytes:1363588305 (1.3 GB)
vnet1 Link encap:Ethernet HWaddr fe:54:00:f2:e6:4b
inet6 addr: fe80::fc54:ff:fef2:e64b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2007348 errors:0 dropped:0 overruns:0 frame:0
TX packets:3291986 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:194231862 (194.2 MB) TX bytes:192280276 (192.2 MB)
netstat:
root@s5 ~ # netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 46.4.56.1 0.0.0.0 UG 0 0 0 eth0
10.10.15.0 0.0.0.0 255.255.255.0 U 0 0 0 virbr1
46.4.56.0 46.4.56.1 255.255.255.192 UG 0 0 0 eth0
46.4.56.0 0.0.0.0 255.255.255.192 U 0 0 0 eth0
iptables:
root@s5 ~ # iptables -L -n -v
Chain INPUT (policy ACCEPT 3050 packets, 332K bytes)
pkts bytes target prot opt in out source destination
1493K 159M fail2ban-ssh tcp -- * * 0.0.0.0/0 0.0.0.0/0 multiport dports 22
0 0 ACCEPT udp -- virbr1 * 0.0.0.0/0 0.0.0.0/0 udp dpt:53
0 0 ACCEPT tcp -- virbr1 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:53
0 0 ACCEPT udp -- virbr1 * 0.0.0.0/0 0.0.0.0/0 udp dpt:67
0 0 ACCEPT tcp -- virbr1 * 0.0.0.0/0 0.0.0.0/0 tcp dpt:67
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
104 5112 ACCEPT tcp -- * * 0.0.0.0/0 10.10.15.2 state NEW tcp dpt:22
10M 2673M ACCEPT all -- * virbr1 0.0.0.0/0 10.10.15.0/24 ctstate RELATED,ESTABLISHED
9909K 596M ACCEPT all -- virbr1 * 10.10.15.0/24 0.0.0.0/0
124 8000 ACCEPT all -- virbr1 virbr1 0.0.0.0/0 0.0.0.0/0
0 0 REJECT all -- * virbr1 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable
0 0 REJECT all -- virbr1 * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable
0 0 ACCEPT tcp -- * * 0.0.0.0/0 10.10.15.2 state NEW tcp dpt:22
Chain OUTPUT (policy ACCEPT 2835 packets, 625K bytes)
pkts bytes target prot opt in out source destination
0 0 ACCEPT udp -- * virbr1 0.0.0.0/0 0.0.0.0/0 udp dpt:68
Chain fail2ban-ssh (1 references)
pkts bytes target prot opt in out source destination
17 1680 REJECT all -- * * 221.229.166.28 0.0.0.0/0 reject-with icmp-port-unreachable
22 2280 REJECT all -- * * 222.186.21.133 0.0.0.0/0 reject-with icmp-port-unreachable
21 2164 REJECT all -- * * 222.186.160.51 0.0.0.0/0 reject-with icmp-port-unreachable
34 2040 REJECT all -- * * 108.31.71.51 0.0.0.0/0 reject-with icmp-port-unreachable
1300K 143M RETURN all -- * * 0.0.0.0/0 0.0.0.0/0
root@s5 ~ # iptables -t nat -L -n -v
Chain PREROUTING (policy ACCEPT 2622K packets, 156M bytes)
pkts bytes target prot opt in out source destination
91 4332 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:2022 to:10.10.15.2:22
0 0 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:2022 to:10.10.15.2:22
Chain INPUT (policy ACCEPT 766K packets, 45M bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 45990 packets, 3419K bytes)
pkts bytes target prot opt in out source destination
Chain POSTROUTING (policy ACCEPT 1753K packets, 105M bytes)
pkts bytes target prot opt in out source destination
0 0 RETURN all -- * * 10.10.15.0/24 224.0.0.0/24
0 0 RETURN all -- * * 10.10.15.0/24 255.255.255.255
1203 72564 MASQUERADE tcp -- * * 10.10.15.0/24 !10.10.15.0/24 masq ports: 1024-65535
5606 310K MASQUERADE udp -- * * 10.10.15.0/24 !10.10.15.0/24 masq ports: 1024-65535
76 6384 MASQUERADE all -- * * 10.10.15.0/24 !10.10.15.0/24
45956 3416K MASQUERADE all -- * eth0 0.0.0.0/0 0.0.0.0/0
0 0 MASQUERADE all -- * eth0 0.0.0.0/0 0.0.0.0/0
brctl show:
root@s5 ~ # brctl show
bridge name bridge id STP enabled interfaces
virbr1 8000.52540080a5aa yes virbr1-nic
vnet0
vnet1
Answer 1
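I strongly suspect the problem is this line from iptables -t nat -L -n -v on the qemu host:
76 6384 MASQUERADE all -- * * 10.10.15.0/24 !10.10.15.0/24
It causes the originating traffic (i.e. not the return half) from hadoop2 to driver to be NATted to 10.10.15.1.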
You can test this hypothesis by exempting the traffic we are interested in from NAT:
qemu-host# iptables -t nat -I POSTROUTING 1 -s 10.10.15.3 -d 192.168.0.199 -j ACCEPT
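If hadoop then sees the packets with the correct source address, we have found the problem. The fix is more involved: it depends on what else your qemu host is running, and you will have to work with your administrator to sort it out. But at least we will have explained the previously unexplained NAT that is currently happening.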
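To make the reasoning concrete, here is a rough sketch of first-match rule evaluation applied to the host's nat POSTROUTING chain from the question. It is a simplified model, not how iptables is implemented: the `Rule` class and `first_match` function are hypothetical names, only protocol, outgoing interface, source and destination are modelled, and it assumes the ping from hadoop2 traverses this chain at all with virbr1 as the outgoing interface.

```python
import ipaddress
from dataclasses import dataclass

@dataclass
class Rule:
    target: str
    proto: str = "all"       # "all" matches any protocol
    out_iface: str = "*"     # "*" matches any interface
    src: str = "0.0.0.0/0"
    dst: str = "0.0.0.0/0"   # a leading "!" negates the match

    def matches(self, proto, out_iface, src, dst):
        if self.proto not in ("all", proto):
            return False
        if self.out_iface not in ("*", out_iface):
            return False
        if ipaddress.ip_address(src) not in ipaddress.ip_network(self.src):
            return False
        negate = self.dst.startswith("!")
        in_dst = ipaddress.ip_address(dst) in ipaddress.ip_network(self.dst.lstrip("!"))
        return in_dst != negate

# nat POSTROUTING on the qemu host, transcribed from the question.
POSTROUTING = [
    Rule("RETURN", src="10.10.15.0/24", dst="224.0.0.0/24"),
    Rule("RETURN", src="10.10.15.0/24", dst="255.255.255.255"),
    Rule("MASQUERADE", proto="tcp", src="10.10.15.0/24", dst="!10.10.15.0/24"),
    Rule("MASQUERADE", proto="udp", src="10.10.15.0/24", dst="!10.10.15.0/24"),
    Rule("MASQUERADE", src="10.10.15.0/24", dst="!10.10.15.0/24"),
    Rule("MASQUERADE", out_iface="eth0"),
    Rule("MASQUERADE", out_iface="eth0"),
]

def first_match(chain, **pkt):
    """Return the target of the first rule matching the packet, else the policy."""
    for rule in chain:
        if rule.matches(**pkt):
            return rule.target
    return "ACCEPT (policy)"

# The ICMP echo request from hadoop2 to driver.
ping = dict(proto="icmp", out_iface="virbr1", src="10.10.15.3", dst="192.168.0.199")

before = first_match(POSTROUTING, **ping)
# The exemption suggested above, inserted at position 1.
exempt = Rule("ACCEPT", src="10.10.15.3", dst="192.168.0.199")
after = first_match([exempt] + POSTROUTING, **ping)

print(before)  # MASQUERADE
print(after)   # ACCEPT
```

In this model the ping skips the RETURN rules and the tcp/udp rules, then hits the protocol-independent MASQUERADE rule because 192.168.0.199 lies outside 10.10.15.0/24. With the exemption placed first, the packet is accepted before any MASQUERADE rule is reached, which is why the test distinguishes the two cases.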