NTP servers reject my VM, but the network looks fine

2024-6-2

Recently several SLES12.5 VMs in my domain have kept running into NTP sync problems, so I did some research. Here are the details:

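I found one VM that has NTP problems frequently, so I started a monitoring job that runs "ntpq -pn" every second. Yesterday I found that it had lost sync with the NTP servers again: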

All NTP servers have been unresponsive since 2022-07-22T05:16:34, and tcpdump confirms this: from that moment on, no packets were sent back from the NTP servers to this VM...

So I checked with the ntpq command:

vsa10027077:/tmp/eisen # ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*127.127.1.0     .LOCL.          10 l   18   64  377    0.000   +0.000   0.000
 147.204.9.202   162.159.200.1    4 u   5h 1024    0    2.168   -0.374   0.000
 147.204.9.203   162.159.200.123  4 u   5h 1024    0    2.411   +1.608   0.000
 147.204.9.204   162.159.200.1    4 u   5h 1024    0    1.917   -0.418   0.000
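
A check of this kind on the `ntpq -pn` output can be sketched as follows. This is only an illustration: the names `parse_ntpq_peers` and `unreachable_servers` are made up for this post, and the sketch assumes the ten-column layout shown above, with the `reach` column printed in octal (377 means the last eight polls were all answered, 0 means none were).

```python
SAMPLE = (
    "     remote           refid      st t when poll reach   delay   offset  jitter\n"
    "==============================================================================\n"
    "*127.127.1.0     .LOCL.          10 l   18   64  377    0.000   +0.000   0.000\n"
    " 147.204.9.202   162.159.200.1    4 u   5h 1024    0    2.168   -0.374   0.000\n"
    " 147.204.9.203   162.159.200.123  4 u   5h 1024    0    2.411   +1.608   0.000\n"
    " 147.204.9.204   162.159.200.1    4 u   5h 1024    0    1.917   -0.418   0.000\n"
)


def parse_ntpq_peers(text):
    """Parse the peer lines of `ntpq -pn` output into a list of dicts."""
    peers = []
    for line in text.splitlines():
        # Skip blank lines, the column header and the "=====" separator.
        if not line.strip() or line.lstrip().startswith(("remote", "=")):
            continue
        tally = line[0]            # tally code: ' ', '*', '+', ...
        fields = line[1:].split()
        if len(fields) != 10:      # not a peer line (for example a shell prompt)
            continue
        remote, refid, stratum, _type, _when, poll, reach = fields[:7]
        peers.append({
            "tally": tally,
            "remote": remote,
            "refid": refid,
            "stratum": int(stratum),
            "poll": int(poll),
            "reach": int(reach, 8),  # reach register is shown in octal
        })
    return peers


def unreachable_servers(peers):
    """Return remote addresses whose reach register is 0, ignoring local clocks."""
    return [p["remote"] for p in peers
            if p["reach"] == 0 and not p["remote"].startswith("127.127.")]
```

Calling `unreachable_servers(parse_ntpq_peers(SAMPLE))` on the output above lists all three 147.204.9.x servers, while the local clock (127.127.1.0) stays reachable.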

vsa10027077:/tmp/eisen # ntpq
ntpq> as
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 26549  961a   yes   yes  none  sys.peer    sys_peer  1
  2 26550  8013   yes    no  none    reject unreachable  1
  3 26551  8013   yes    no  none    reject unreachable  1
  4 26552  8013   yes    no  none    reject unreachable  1
ntpq> rv 26550
associd=26550 status=8013 conf, sel_reject, 1 event, unreachable,
srcadr=147.204.9.202, srcport=123, dstadr=100.78.59.192, dstport=123,
leap=00, stratum=4, precision=-23, rootdelay=22.659, rootdisp=38.574,
refid=162.159.200.1,
reftime=e684ba76.20e3a34f  Fri, Jul 22 2022  5:56:06.128,
rec=e684bf98.7a92b5e4  Fri, Jul 22 2022  6:18:00.478, reach=000,
unreach=28, hmode=3, pmode=4, hpoll=10, ppoll=10, headway=44,
flash=1400 peer_dist, peer_unreach, keyid=0, offset=-0.374, delay=2.168,
dispersion=15937.500, jitter=0.000, xleave=0.071,
filtdelay=     0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00,
filtoffset=   +0.00   +0.00   +0.00   +0.00   +0.00   +0.00   +0.00   +0.00,
filtdisp=   16000.0 16000.0 16000.0 16000.0 16000.0 16000.0 16000.0 16000.0

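The flash code is 1400 for all of the NTP servers, which breaks down as: 1000 = peer unreachable (peer_unreach), 400 = distance threshold exceeded (peer_dist).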
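
The breakdown of the flash word can be reproduced with a small bit-mask decoder. This is a sketch, not ntpd code: the bit names follow the flash status word table in the ntpd documentation as I understand it, and `decode_flash` is a name invented for this example.

```python
# Flash status bits (hex mask -> short name), per the ntpd status word table.
FLASH_BITS = {
    0x0001: "pkt_dup",
    0x0002: "pkt_bogus",
    0x0004: "pkt_unsync",
    0x0008: "pkt_denied",
    0x0010: "pkt_auth",
    0x0020: "pkt_stratum",
    0x0040: "pkt_header",
    0x0080: "pkt_autokey",
    0x0100: "pkt_crypto",
    0x0200: "peer_stratum",
    0x0400: "peer_dist",
    0x0800: "peer_loop",
    0x1000: "peer_unreach",
}


def decode_flash(value):
    """Return the names of all bits set in a flash word (hex string or int)."""
    if isinstance(value, str):
        value = int(value, 16)
    return [name for bit, name in sorted(FLASH_BITS.items()) if value & bit]


print(decode_flash("1400"))
```

For `flash=1400` this yields the same two names that `rv` prints next to the value: `peer_dist` and `peer_unreach`.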

Since ntpq says the distance between the NTP server and my VM is too large, I checked with ping and traceroute:

ping shows a TTL of 252, a latency of only 1.35 ms and no packet loss, and traceroute shows only 4 hops from the client to the NTP server:

vsa10027077:/tmp/eisen # traceroute 147.204.9.202
traceroute to 147.204.9.202 (147.204.9.202), 30 hops max, 60 byte packets
 1  host-100-78-56-1.fra1.od.sap.biz (100.78.56.1)  0.332 ms  0.316 ms  0.309 ms
 2  130.214.162.65 (130.214.162.65)  0.829 ms  1.317 ms  1.047 ms
 3  10.46.210.132 (10.46.210.132)  1.014 ms  1.278 ms 10.46.210.131 (10.46.210.131)  1.166 ms
 4  10.46.210.129 (10.46.210.129)  3.102 ms * *

So I stopped the ntpd service, reset the time manually with "ntpdate" (the offset looks very small), and then started ntpd again. Unfortunately, the NTP servers still reject this VM:

vsa10027077:/tmp/eisen # systemctl stop ntpd

vsa10027077:/tmp/eisen # ntpdate 147.204.9.202
22 Jul 11:33:37 ntpdate[30877]: adjust time server 147.204.9.202 offset +0.000069 sec

vsa10027077:/tmp/eisen # systemctl start ntpd
Then I added "minpoll 3 maxpoll 6" to every NTP server line in /etc/ntp.conf and restarted the ntpd service, but it still does not work.

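I am confused: the NTP server says the distance to my VM is too large and rejects it, but ping and traceroute both show only a few hops between them. What causes this problem? How does the NTP server determine its distance to the client? How can I fix it? Please share your comments. Thanks in advance for your help.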
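
For context, the "distance" behind peer_dist is, as far as RFC 5905 describes it, a synchronization distance built from delays and dispersions in time units, not a hop count. A simplified sketch using the values from the `rv 26550` output above is shown below. The assumptions are mine: the formula omits the aging term of the RFC, the 1.5 s threshold is taken to be ntpd's default for `tos maxdist`, and the helper names are invented for this example.

```python
MAXDIST_MS = 1500.0   # assumed default of "tos maxdist" (1.5 s), in milliseconds
MAXDISP_MS = 16000.0  # dispersion of an empty clock-filter slot, as in filtdisp


def root_distance_ms(rootdelay, rootdisp, delay, dispersion, jitter):
    """Simplified synchronization distance in ms (aging term omitted)."""
    return (rootdelay + delay) / 2 + rootdisp + dispersion + jitter


def empty_filter_dispersion_ms(stages=8):
    """Peer dispersion when every filter stage holds an empty 16000 ms sample."""
    return sum(MAXDISP_MS / 2 ** (i + 1) for i in range(stages))


# Values copied from the rv output for 147.204.9.202 (all in milliseconds).
dist = root_distance_ms(rootdelay=22.659, rootdisp=38.574,
                        delay=2.168, dispersion=15937.500, jitter=0.000)
print(dist, dist > MAXDIST_MS)
print(empty_filter_dispersion_ms())
```

Under these assumptions the distance is dominated by the 15937.5 ms dispersion, which is exactly what eight empty filter stages add up to. That would make peer_dist a consequence of the server being unreachable rather than a sign of a long network path.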

Update:

The ntpd configuration file is:

vsa10027077:~ # cat /etc/ntp.conf

driftfile /var/lib/ntp/drift/ntp.drift
logfile   /var/log/ntp

server 127.127.1.0
fudge  127.127.1.0 stratum 10

server timehost1.global.cloud.sap 
server timehost2.global.cloud.sap 
server timehost3.global.cloud.sap 

# key configuration
keys /etc/ntp.keys
trustedkey 1
requestkey 1
controlkey 1

# by default act only as a basic NTP client
restrict default kod nomodify noquery notrap nopeer
restrict -6 default kod nomodify noquery notrap nopeer
restrict 127.0.0.1
restrict ::1
# allow NTP messages from the loopback address, useful for debugging
restrict localhost

### end of file

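However, the servers have not gone unresponsive in the last two days, so I could not collect the output of "ntpq -c rv 0" while the problem was occurring. Below is the output during normal operation: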

vsa10027077:~ # ntpq -c rv 0
associd=0 status=0615 leap_none, sync_ntp, 1 event, clock_sync,
version="ntpd [email protected] Mon Jun 21 18:17:38 UTC 2021 (1)",
processor="x86_64", system="Linux/4.12.14-122.124-default", leap=00,
stratum=5, precision=-24, rootdelay=26.314, rootdisp=51.471,
refid=147.204.9.204,
reftime=e689d98d.602a4dc4  Tue, Jul 26 2022  3:10:05.375,
clock=e689d9d4.bea84735  Tue, Jul 26 2022  3:11:16.744, peer=2989, tc=5,
mintc=3, offset=+0.212857, frequency=+2.033, sys_jitter=0.876471,
clk_jitter=0.843, clk_wander=0.063

Please take a look. Thanks.

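Update 2022-08-09: I added "minpoll 3 maxpoll 6" to all NTP server lines in /etc/ntp.conf and restarted ntpd. The rejection problem still occurs, but it lasts much less time than before: previously more than 30 hours, now only about 3 hours before the host returns to normal. I am still confused, though. I set maxpoll to 6, which means the maximum poll interval should be 64 seconds, but when I check ntpq it is already 256...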

vsa9973928:/tmp/eisen # cat /etc/ntp.conf

driftfile /var/lib/ntp/drift/ntp.drift
logfile   /var/log/ntp

server 127.127.1.0
fudge  127.127.1.0 stratum 10

server timehost1.global.cloud.sap minpoll 3 maxpoll 6
server timehost2.global.cloud.sap minpoll 3 maxpoll 6
server timehost3.global.cloud.sap minpoll 3 maxpoll 6

# key configuration
keys /etc/ntp.keys
trustedkey 1
requestkey 1
controlkey 1

# by default act only as a basic NTP client
restrict default kod nomodify noquery notrap nopeer
restrict -6 default kod nomodify noquery notrap nopeer
restrict 127.0.0.1
restrict ::1
# allow NTP messages from the loopback address, useful for debugging
restrict localhost

### end of file
vsa9973928:/tmp/eisen # ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     .LOCL.          10 l 220m   64    0    0.000   +0.000   0.000
+147.204.9.202   10.46.141.8      5 u   40  512  377    1.742   +0.128   1.060
+147.204.9.203   162.159.200.123  4 u  274  512  377    1.730   +1.539   2.245
*147.204.9.204   162.159.200.1    4 u  148  512  377    1.803   +0.585   0.900

What could cause the poll interval to exceed the limit set in ntp.conf? Has anyone seen this before?
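
To make the mismatch concrete: minpoll and maxpoll are exponents of two, in seconds, while the poll column of ntpq shows the interval itself. A minimal sketch of the conversion follows. The function names are made up for this example.

```python
def poll_seconds(exponent):
    """Convert a minpoll/maxpoll exponent into an interval in seconds."""
    return 2 ** exponent


def poll_exponent(seconds):
    """Convert a poll interval in seconds back into its power-of-two exponent."""
    if seconds <= 0 or seconds & (seconds - 1):
        raise ValueError("poll interval must be a positive power of two")
    return seconds.bit_length() - 1


print(poll_seconds(3), poll_seconds(6))  # configured minpoll and maxpoll
print(poll_exponent(512))                # poll column in the ntpq output above
```

With maxpoll 6 the interval should never exceed 64 seconds, yet the poll column in the output above reads 512, which corresponds to exponent 9.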
