Why does arp_ignore=1 break ARP on a POINTOPOINT interface? (KVM guest)

Answer 1

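I have verified this behavior on Ubuntu 18.04 without any bridge, KVM, or Hetzner involved, and I believe it is actually a kernel bug in how arp_ignore is handled on point-to-point Ethernet interfaces. Steps to reproduce:

1. Make sure netplan is completely disabled so that it does not interfere.
2. Set up two systems with their Ethernet interfaces connected to each other and assign IP addresses as follows:

   Server A:

       ip addr add 192.168.100.1/32 peer 192.168.100.2 dev ens33
       ip link set ens33 up

   Server B:

       ip addr add 192.168.100.2/32 peer 192.168.100.1 dev ens33
       ip link set ens33 up

3. Observe that ping from server A to server B works normally and that the output of ip addr show contains the following line:

       inet 192.168.100.1 peer 192.168.100.2/32 scope global ens33

4. Enable arp_ignore=1 on server A with sysctl net.ipv4.conf.all.arp_ignore=1 and observe that ping stops once the ARP entry times out. After a short while ping resumes for some time and then dies again. This cycle repeats indefinitely.
5. Run tcpdump on server B and observe the inbound pings and the unanswered outbound who-has requests from server B for server A's IP address. Occasionally server A issues its own who-has request for server B's IP and gets a reply; server B then temporarily caches the source MAC address it learned from that ARP request, and ping resumes until the entry expires.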

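Now, here is the thing. Point-to-point and standard broadcast interfaces differ in how iproute2 (and similar tools) populate the kernel's in_ifaddr structure, defined in include/linux/inetdevice.h.

For ordinary interfaces, the ifa_address field holds the local interface address. For point-to-point interfaces, ifa_address holds the remote peer address and ifa_local holds the local interface address.

This is normal and expected behavior: historically, a point-to-point interface was treated as the connected remote device itself, and a corresponding local endpoint address might not even exist. So there is nothing wrong here.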

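What is actually problematic is that the arp_ignore() handler defined in net/ipv4/arp.c indirectly uses the confirm_addr_indev() function defined in net/ipv4/devinet.c. That function iterates over all IP address records configured on the interface and tries to match the target IP address of the ARP request (i.e. the local interface address) against the ifa_address field of each record.

This works fine on standard interfaces, because they do record their local IP address in ifa_address. It fails on point-to-point interfaces, because they record the peer IP address in that field.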

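Now, the real question is whether this needs fixing at all, since it has been pointed out that using arp_ignore on p2p links makes no sense. I believe it does: the host in question may also have other interfaces that are not p2p, and someone may enable the setting through the conf.all prefix only to find that their p2p link goes down for no apparent reason.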

Answer 2

The answer is in the definition:

arp_ignore
    0 - (default): reply for any local target IP address, configured on any interface
    1 - reply only if the target IP address is local address configured on the incoming interface
    2 - reply only if the target IP address is local address configured on the incoming interface and both with the sender's IP address are part from same subnet on this interface
    3 - do not reply for local addresses configured with scope host, only resolutions for global and link addresses are replied
    4-7 - reserved
    8 - do not reply for all local addresses

Essentially, since the point-to-point interface itself has no IP address assigned, there is no "target IP on the incoming interface". The IP is assigned to the bridge interface... not to the actual link interface. As a result, no ARP is processed.
