Two servers behave differently when the DHCP server fails

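I have no control over the DHCP server on my network. Every two weeks or so there is an outage lasting a few hours, during which my CentOS 7.8 servers get no response to their DHCP renewal requests. As far as I know, these servers are configured identically. During the outage, some servers keep sending DHCP requests until a renewal succeeds and the system is back on the network. Others, however, seem to hit some corner case: after a while they stop sending DHCP requests and never come back onto the network. Can anyone tell me, from the differences in the logs I have posted, what is happening differently?
server003 is a failing case,
server004 is a good case. Thanks!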

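Some odd things I see on the failing server003:
"bound: renewal in 2134686840 seconds."
"Trying recorded lease 192.168.2.72"
192.168.2.72 belongs to a very old network we used to use. Is dhclient actually setting this IP on the interface?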

server003 log:

Nov 18 07:01:41 got DHCP
Nov 18 07:21:02 something killed, MFE?
Nov 18 09:00:09 DHCP started failing
Nov 18 09:00:09 server003 dhclient[45214]: DHCPREQUEST on enp4s0 to 10.20.193.131 port 67 (xid=0x44d64e6c)
-- DHCPREQUEST on enp4s0 repeatedly till 12:01 --
Nov 18 12:01:27 server003 dhclient[45214]: DHCPREQUEST on enp4s0 to 255.255.255.255 port 67 (xid=0x44d64e6c)
Nov 18 12:01:41 server003 avahi-daemon[1973]: Withdrawing address record for 10.20.232.222 on enp4s0.
Nov 18 12:01:41 server003 avahi-daemon[1973]: Leaving mDNS multicast group on interface enp4s0.IPv4 with address 10.20.232.222.
Nov 18 12:01:41 server003 avahi-daemon[1973]: Interface enp4s0.IPv4 no longer relevant for mDNS.
Nov 18 12:01:42 server003 NetworkManager[2553]: <info>  [1605718902.1357] dhcp4 (enp4s0): state changed bound -> expire
Nov 18 12:01:42 server003 NetworkManager[2553]: <info>  [1605718902.1364] device (enp4s0): DHCPv4: 480 seconds grace period started
Nov 18 12:01:42 server003 NetworkManager[2553]: <info>  [1605718902.1469] dhcp4 (enp4s0): state changed expire -> unknown
Nov 18 12:02:41 server003 dhclient[45214]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 2 (xid=0x40f44748)
Nov 18 12:02:43 server003 dhclient[45214]: No DHCPOFFERS received.
Nov 18 12:02:43 server003 dhclient[45214]: Trying recorded lease 192.168.2.72
Nov 18 12:02:43 server003 NetworkManager[2553]: <info>  [1605718963.8053] dhcp4 (enp4s0): state changed unknown -> timeout
Nov 18 12:02:43 server003 dhclient[45214]: bound: renewal in 2134686840 seconds.
Nov 18 12:09:42 server003 NetworkManager[2553]: <info>  [1605719382.2119] device (enp4s0): DHCPv4: grace period expired
Nov 18 13:06:02 server003 NetworkManager[2553]: <info>  [1605722762.3311] policy: set 'enp4s0' (enp4s0) as default for IPv6 routing and DNS
-- nothing else after this --

server004 log:

Nov 18 07:27:10 got DHCP
Nov 18 09:26:55 DHCP started failing
Nov 18 09:26:55 server004 dhclient[5179]: DHCPREQUEST on enp4s0 to 10.20.193.131 port 67 (xid=0x26458456)
-- DHCPREQUEST on enp4s0 repeatedly till 12:27 --
Nov 18 12:27:04 server004 dhclient[5179]: DHCPREQUEST on enp4s0 to 255.255.255.255 port 67 (xid=0x26458456)
Nov 18 12:27:10 server004 avahi-daemon[1869]: Withdrawing address record for 10.20.232.229 on enp4s0.
Nov 18 12:27:10 server004 avahi-daemon[1869]: Leaving mDNS multicast group on interface enp4s0.IPv4 with address 10.20.232.229.
Nov 18 12:27:10 server004 avahi-daemon[1869]: Interface enp4s0.IPv4 no longer relevant for mDNS.
Nov 18 12:27:11 server004 NetworkManager[2609]: <info>  [1605720431.3993] dhcp4 (enp4s0): state changed bound -> expire
Nov 18 12:27:11 server004 NetworkManager[2609]: <info>  [1605720431.4000] device (enp4s0): DHCPv4: 480 seconds grace period started
Nov 18 12:27:11 server004 NetworkManager[2609]: <info>  [1605720431.4106] dhcp4 (enp4s0): state changed expire -> unknown
Nov 18 12:27:11 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 5 (xid=0x1e6890d0)
Nov 18 12:27:16 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 9 (xid=0x1e6890d0)
Nov 18 12:27:25 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 10 (xid=0x1e6890d0)
Nov 18 12:27:35 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 15 (xid=0x1e6890d0)
Nov 18 12:27:50 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 15 (xid=0x1e6890d0)
Nov 18 12:28:05 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 7 (xid=0x1e6890d0)
Nov 18 12:28:12 server004 dhclient[5179]: No DHCPOFFERS received.
Nov 18 12:28:12 server004 dhclient[5179]: No working leases in persistent database - sleeping.
Nov 18 12:28:12 server004 NetworkManager[2609]: <info>  [1605720492.1971] dhcp4 (enp4s0): state changed unknown -> fail
Nov 18 12:32:08 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 8 (xid=0x47045c24)
Nov 18 12:32:16 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 13 (xid=0x47045c24)
Nov 18 12:32:29 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 20 (xid=0x47045c24)
Nov 18 12:32:49 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 14 (xid=0x47045c24)
Nov 18 12:33:03 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 6 (xid=0x47045c24)
Nov 18 12:33:09 server004 dhclient[5179]: No DHCPOFFERS received.
Nov 18 12:33:09 server004 dhclient[5179]: No working leases in persistent database - sleeping.
-- DHCPDISCOVER -> No DHCPOFFERS received -> DHCPDISCOVER happens repeatedly every 5 mins until 13:05 and then got DHCP
Nov 18 13:05:04 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 7 (xid=0x6a52e1ae)
Nov 18 13:05:11 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 14 (xid=0x6a52e1ae)
Nov 18 13:05:25 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 15 (xid=0x6a52e1ae)
Nov 18 13:05:40 server004 dhclient[5179]: DHCPDISCOVER on enp4s0 to 255.255.255.255 port 67 interval 17 (xid=0x6a52e1ae)
Nov 18 13:05:40 server004 dhclient[5179]: DHCPREQUEST on enp4s0 to 255.255.255.255 port 67 (xid=0x6a52e1ae)
Nov 18 13:05:40 server004 dhclient[5179]: DHCPOFFER from 10.20.232.1
Nov 18 13:05:40 server004 dhclient[5179]: DHCPACK from 10.20.232.1 (xid=0x6a52e1ae)
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0476] dhcp4 (enp4s0):   address 10.20.232.229
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0483] dhcp4 (enp4s0):   plen 22 (255.255.252.0)
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0483] dhcp4 (enp4s0):   gateway 10.20.232.1
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0483] dhcp4 (enp4s0):   lease time 18000
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0483] dhcp4 (enp4s0):   nameserver '10.20.10.49'
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0483] dhcp4 (enp4s0):   nameserver '10.20.10.48'
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0484] dhcp4 (enp4s0):   domain name 'company.com'
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0484] dhcp4 (enp4s0): state changed fail -> bound
Nov 18 13:05:40 server004 dhclient[5179]: bound to 10.20.232.229 -- renewal in 7548 seconds.
Nov 18 13:05:40 server004 avahi-daemon[1869]: Joining mDNS multicast group on interface enp4s0.IPv4 with address 10.20.232.229.
Nov 18 13:05:40 server004 NetworkManager[2609]: <info>  [1605722740.0519] policy: set 'enp4s0' (enp4s0) as default for IPv4 routing and DNS
Nov 18 13:05:40 server004 avahi-daemon[1869]: New relevant interface enp4s0.IPv4 for mDNS.
Nov 18 13:05:40 server004 avahi-daemon[1869]: Registering new address record for 10.20.232.229 on enp4s0.IPv4.
Nov 18 13:05:40 server004 dbus[1896]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Nov 18 13:05:40 server004 systemd: Starting Network Manager Script Dispatcher Service...
Nov 18 13:05:40 server004 dbus[1896]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Nov 18 13:05:40 server004 systemd: Started Network Manager Script Dispatcher Service.
Nov 18 13:05:40 server004 nm-dispatcher: req:1 'dhcp4-change' [enp4s0]: new request (4 scripts)
Nov 18 13:05:40 server004 nm-dispatcher: req:1 'dhcp4-change' [enp4s0]: start running ordered scripts...
Nov 18 13:06:02 server004 NetworkManager[2609]: <info>  [1605722762.4182] policy: set 'enp4s0' (enp4s0) as default for IPv6 routing and DNS

Answer 1

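Your server003 has remembered an old DHCP lease for the address 192.168.2.72 with a very long lifetime, and it falls back to that lease when the DHCP server is unavailable. That address is not even on the same network as 10.20.232.222, the last legitimate address it held (10.20.193.131 in the log is the DHCP server the renewal requests were sent to). Once dhclient reports "bound: renewal in 2134686840 seconds" (roughly 67 years), it appears to consider itself configured and stops sending DHCPDISCOVER. server004 has no such lease ("No working leases in persistent database - sleeping"), so it retries about every five minutes until the DHCP server answers again.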
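To see which leases a machine has recorded, something like the following could help. It is only a sketch: `list_leases` is a hypothetical helper name (not part of dhclient or NetworkManager), and it assumes the files use the usual dhclient lease format, with `fixed-address` and `expire` statements inside each `lease { ... }` block.

```shell
#!/bin/sh
# Sketch: summarize every lease recorded in dhclient-format lease files.
# list_leases is a hypothetical helper name, not a dhclient or NetworkManager tool.
list_leases() {
    awk '
        # Remember the leased address, dropping the trailing semicolon.
        $1 == "fixed-address" { addr = $2; sub(/;$/, "", addr) }
        # Remember the expiry, dropping the keyword and the trailing semicolon.
        $1 == "expire" {
            expiry = $0
            sub(/^[ \t]*expire[ \t]+/, "", expiry)
            sub(/;[ \t]*$/, "", expiry)
        }
        # A closing brace ends one lease block: print a summary and reset.
        $1 == "}" {
            print FILENAME ": " addr " expires " expiry
            addr = ""; expiry = ""
        }
    ' "$@"
}

# Example (as root; the path is the one used by the cleanup command in this answer):
# list_leases /var/lib/NetworkManager/*.lease
```

A stale entry such as 192.168.2.72 with an expiry decades in the future would stand out in that output, and comparing server003 with server004 should show whether the lease files really are the difference.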

You should clear that server's DHCP leases and then restart the DHCP client.

Something like this should do it (as root), while keeping the network link up:

rm -f /var/lib/NetworkManager/*.lease; killall dhclient; nmcli device reapply enp4s0
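If you would rather keep the old lease files for later inspection, a variant that moves them aside instead of deleting them might look like this. It is a sketch under assumptions: `backup_leases` is a hypothetical helper, and the backup directory `/root/lease-backup` in the example is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: move lease files aside instead of deleting them.
# backup_leases is a hypothetical helper; both directories are passed as arguments.
backup_leases() {
    src_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for f in "$src_dir"/*.lease; do
        # If the glob matched nothing, the literal pattern is left over; skip it.
        [ -e "$f" ] || continue
        mv -- "$f" "$backup_dir"/ || return 1
    done
}

# Example (as root), followed by the same restart steps as the one-liner above.
# /root/lease-backup is an assumed location, not something from the original setup:
# backup_leases /var/lib/NetworkManager /root/lease-backup
# killall dhclient
# nmcli device reapply enp4s0
```

The effect on dhclient should be the same as with `rm -f`, since the lease files are no longer where the client looks for them.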
