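We have some embedded devices that use ntpd (4.2.8p10) to synchronize time. One of our customers runs their own NTP server on an internal network. In ntpd -dgq debug mode we can see that the server is reachable and that we get offset, delay, and jitter information. However, ntpd always ends with "ntpd: no servers found" and never selects the server or sets the local time.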
2 Nov 11:57:05 ntpd[20218]: ntpd [email protected] Thu Jul 26 19:52:20 UTC 2018 (2): Starting
2 Nov 11:57:05 ntpd[20218]: Command line: ntpd -dgq
2 Nov 11:57:05 ntpd[20218]: proto: precision = 2.000 usec (-19)
Finished Parsing!!
restrict: op 1 addr 0.0.0.0 mask 0.0.0.0 mflags 00000000 flags 000005f0
restrict: op 1 addr 127.0.0.1 mask 255.255.255.255 mflags 00000000 flags 00000000
restrict source template mflags 4000 flags 1c0
restrict: op 1 addr (null) mask (null) mflags 00004000 flags 000001c0
move_fd: estimated max descriptors: 1024, initial socket boundary: 16
2 Nov 11:57:05 ntpd[20218]: Listen and drop on 0 v4wildcard 0.0.0.0:123
2 Nov 11:57:05 ntpd[20218]: Listen normally on 1 lo 127.0.0.1:123
restrict: op 1 addr 127.0.0.1 mask 255.255.255.255 mflags 00003000 flags 00000001
2 Nov 11:57:05 ntpd[20218]: Listen normally on 2 eth1 192.168.168.109:123
restrict: op 1 addr 192.168.168.109 mask 255.255.255.255 mflags 00003000 flags 00000001
2 Nov 11:57:05 ntpd[20218]: Listen normally on 3 wlan0 192.168.100.1:123
restrict: op 1 addr 192.168.100.1 mask 255.255.255.255 mflags 00003000 flags 00000001
2 Nov 11:57:05 ntpd[20218]: Listening on routing socket on fd #27 for interface updates
key_expire: at 0 associd 60163
peer_clear: at 0 next 1 associd 60163 refid INIT
restrict: op 1 addr 10.160.129.161 mask 255.255.255.255 mflags 00004000 flags 000001c0
restrict_source: 10.160.129.161 host restriction added
event at 0 10.160.129.161 8011 81 mobilize assoc 60163
newpeer: 192.168.168.109->10.160.129.161 mode 3 vers 4 poll 6 10 flags 0x101 0x1 ttl 0 key 00000000
event at 0 0.0.0.0 c016 06 restart
peer_xmit: at 1 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde52.ddf3c87c
auth_agekeys: at 1 keys 0 expired 0
event at 1 10.160.129.161 8014 84 reachable
clock_filter: n 1 off 30.082946 del 0.048598 dsp 7.945314 jit 0.000002
peer_xmit: at 3 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde54.ddf0a416
clock_filter: n 2 off 30.083616 del 0.047583 dsp 3.949228 jit 0.000670
peer_xmit: at 5 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde56.dde968ab
clock_filter: n 3 off 30.078398 del 0.054469 dsp 1.951189 jit 0.004895
peer_xmit: at 7 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde58.dde80026
clock_filter: n 4 off 30.079499 del 0.074539 dsp 0.952172 jit 0.003164
peer_xmit: at 9 192.168.168.109->10.160.129.161 mode 3 len 48 xmt 0xe52bde5a.ddea03c8
clock_filter: n 5 off 30.083616 del 0.044472 dsp 0.452664 jit 0.003340
2 Nov 11:57:16 ntpd[20218]: ntpd: no servers found
END OF FILE
Also, when ntpd runs in the background and we query its status with ntpq -p, we get the following result. The st, delay, offset, and reach values all look fine.
root@S8P20092901:~# ntpq -c as
ind assid status conf reach auth condition last_event cnt
===========================================================
1 59609 9014 yes yes none reject reachable 1
root@S8P20092901:~# ntpq -np
remote refid st t when poll reach delay offset jitter
==============================================================================
10.160.129.161 162.159.200.123 4 u 24 64 377 40.404 -180.122 20.122
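However, ntpd never selects the NTP server as a time source (it never shows "*" or "+" in front of the remote address) and never sets the local time, even after a long wait.
I looked at the source code. In ntpdate (-q) mode, if the clock has not been set, ntpd exits once every server has completed its burst: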
    } else {
        peer->burst--;
        if (peer->burst == 0) {
            /*
             * If ntpdate mode and the clock has not been
             * set and all peers have completed the burst,
             * we declare a successful failure.
             */
            if (mode_ntpdate) {
                peer_ntpdate--;
                if (peer_ntpdate == 0) {
                    msyslog(LOG_NOTICE,
                        "ntpd: no servers found");
                    if (!msyslog_term)
                        printf(
                            "ntpd: no servers found\n");
                    exit (0);
                }
            }
        }
    }
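The exit path in that excerpt can be modelled in a few lines. This is a rough Python sketch, not ntpd code: the Peer class, the function name, and the burst counts are illustrative assumptions. It only shows that the message is emitted once every peer has finished its burst without the clock having been set, whether or not the peers replied.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    burst: int        # packets left in the current burst (hypothetical value)
    reachable: bool   # whether the server answered; the exit check never reads it


def ntpdate_exit_message(peers, clock_set=False):
    """Simplified model of the peer_ntpdate countdown in the excerpt."""
    # Stands in for peer_ntpdate: peers that still have a burst to finish.
    remaining = sum(1 for peer in peers if peer.burst > 0)
    while remaining > 0 and not clock_set:
        for peer in peers:
            if peer.burst == 0:
                continue              # this peer already finished its burst
            peer.burst -= 1           # peer->burst--
            if peer.burst == 0:
                remaining -= 1        # peer_ntpdate--
    # Assumption: when the clock is set, ntpd leaves through another path.
    return None if clock_set else "ntpd: no servers found"


print(ntpdate_exit_message([Peer(burst=5, reachable=True)]))
```

In this model a reachable peer still ends with "ntpd: no servers found", which matches the log above: the message reports that no server was selected, not that no server replied.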
However, I still don't understand why ntpd does not select the server and set the time. Thanks in advance for your help.
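Answer 1
This looks like it may be a root dispersion problem (the accumulated error from the time source to the server).
Your ntpq -nc associations output already shows:
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 59609  9014   yes   yes  none    reject   reachable  1
So what is needed now is the detail for this problematic association: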
ntpq -nc 'readvar 59609'
You should get something like this (taken from my own NTP server):
associd=33428 status=142a reach, sel_candidate, 2 events, sys_peer,
srcadr=90.255.244.219, srcport=123, dstadr=192.168.1.18, dstport=123,
leap=00, stratum=1, precision=-20, rootdelay=0.000, rootdisp=1.511,
refid=PPS, reftime=e53ca0fb.4d946a30 Mon, Nov 15 2021 9:03:55.303,
rec=e53ca11e.bf1413cd Mon, Nov 15 2021 9:04:30.746, reach=377,
unreach=0, hmode=3, pmode=4, hpoll=10, ppoll=10, headway=0, flash=00 ok,
keyid=0, offset=-0.249, delay=22.177, dispersion=55.975, jitter=56.489,
xleave=0.088,
filtdelay= 157.46 161.45 169.05 22.18 21.68 21.76 186.40 22.04,
filtoffset= 70.21 70.72 74.51 -0.25 -0.03 -0.26 81.90 -0.34,
filtdisp= 0.00 15.39 31.02 47.04 63.23 79.22 86.97 94.76
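Look for the rootdisp value. I expect you will find that yours is high, which indicates that there is too much error on the path from the time source to here. Other than using a different upstream server, there is not much you can do. (You can adjust maxdist, but if you have to do that you should ask just how reliable your upstream server really is.)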
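Picking rootdisp out of that output by eye gets tedious when several devices need checking. Below is a small Python sketch that extracts it from text captured from ntpq -nc 'readvar <assid>'. The helper names readvar_number and readvar_flash are illustrative, not part of any NTP tool, and the sample text is copied from the output above. It assumes the fields are printed as name=value, as in that listing.

```python
import re

# Three lines copied from the readvar output shown above.
SAMPLE = """\
leap=00, stratum=1, precision=-20, rootdelay=0.000, rootdisp=1.511,
unreach=0, hmode=3, pmode=4, hpoll=10, ppoll=10, headway=0, flash=00 ok,
keyid=0, offset=-0.249, delay=22.177, dispersion=55.975, jitter=56.489,
"""


def readvar_number(text, name):
    """Return the numeric value of one name=value field as a float."""
    # \b keeps "delay" from matching inside "rootdelay" or "filtdelay".
    match = re.search(rf"\b{re.escape(name)}=\s*(-?\d+(?:\.\d+)?)", text)
    if match is None:
        raise KeyError(name)
    return float(match.group(1))


def readvar_flash(text):
    """Return (hex code, first status name) from the flash field."""
    match = re.search(r"\bflash=(\w+)\s+(\w+)", text)
    if match is None:
        raise KeyError("flash")
    return match.group(1), match.group(2)


print("rootdisp:", readvar_number(SAMPLE, "rootdisp"), "ms")
print("flash:", readvar_flash(SAMPLE))
```

ntpq prints rootdisp in milliseconds, so a healthy upstream shows a small number here, while a value in the thousands is a strong hint that the peer will be rejected.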
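References:
- Cisco - Troubleshoot ISE and NTP Server Synchronization Failures on Microsoft Windows (PDF)
- NTP - reference documentation for ntpq
- Server Fault - Why does NTP consider my server insufficient?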
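Answer 2
I ran into the same problem as in this post.
The solution is indeed to add tos maxdist 30 to /etc/ntp.conf.
All the steps to check and fix it are listed below. Note that you should only do this if you have no other time server available: as others have said, it also means the upstream NTP server is not really reliable.
The steps are as follows:
If you use ntpd -dgq, you may get an "unable to bind to wildcard address" error. So before running it, you need to stop the NTP service with service ntp stop, or kill the process that is holding the NTP port: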
lsof -i | grep ntp
kill <pid>
After that, run the ntpd -dgq command. If the log ends like this, ntpd received replies from the NTP server but could not select it:
...
...
...
receive: MATCH_ASSOC dispatch: mode 4/server:AM_PROCPKT
filegen 2 3854076120
clock_filter: n 5 off 3.839496 del 0.000455 dsp 0.437525 jit 0.000248
17 Feb 09:42:02 ntpd[1040]: ntpd: no servers found
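Also, after starting the NTP service again (service ntp start), the following commands show the same situation: the server is reachable, but time synchronization does not happen: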
root@akulab1:~# ntpq -c as
ind assid status conf reach auth condition last_event cnt
===========================================================
1 34463 9014 yes yes none reject reachable 1
root@akulab1:~# ntpq -np
remote refid st t when poll reach delay offset jitter
==============================================================================
172.16.0.25 .LOCL. 1 u 36 64 7 0.579 3917.57 5.842
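The reach column in that output is an 8-bit shift register printed in octal, with one bit per recent poll. The value 7 here means the last three polls were answered, and 377 would mean all eight were. A minimal Python sketch of the decoding (the function name decode_reach is illustrative):

```python
def decode_reach(reach_octal):
    """Return the last eight poll results, most recent first (True = reply received)."""
    register = int(reach_octal, 8) & 0xFF
    # Bit 0 is the most recent poll; each new poll shifts the register left.
    return [bool((register >> bit) & 1) for bit in range(8)]


for value in ("7", "377"):
    answered = sum(decode_reach(value))
    print(f"reach={value}: {answered} of the last 8 polls answered")
```

So a nonzero reach only confirms that replies are arriving. It says nothing about whether the peer passes the selection checks.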
As mentioned above, the cause is the large rootdisp value in the output below (use the assid from ntpq -c as as the argument to readvar):
root@akulab1:~# ntpq -nc 'readvar 34463'
associd=34463 status=9014 conf, reach, sel_reject, 1 event, reachable,
srcadr=172.16.0.25, srcport=123, dstadr=172.16.0.133, dstport=123,
leap=00, stratum=1, precision=-23, rootdelay=0.000, rootdisp=10684.280,
refid=LOCL, reftime=e5b7a483.b4d87c2d Wed, Feb 16 2022 17:27:47.706,
rec=e5b88b72.a5c64a66 Thu, Feb 17 2022 9:53:06.647, reach=377,
unreach=0, hmode=3, pmode=4, hpoll=6, ppoll=6, headway=290,
flash=400 peer_dist, keyid=0, offset=3934.131, delay=0.516,
dispersion=0.987, jitter=14.653, xleave=0.037,
filtdelay= 0.52 0.52 0.56 0.54 0.54 0.58 0.45 0.49,
filtoffset= 3934.13 3930.84 3927.44 3924.11 3920.90 3917.58 3914.35 3911.63,
filtdisp= 0.00 1.02 2.06 3.08 4.10 5.12 6.11 6.95
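The flash=400 peer_dist field in this output indicates that the peer failed the distance check. As a rough illustration, the root distance can be approximated from the fields above in the spirit of RFC 5905: half the round trip to the primary source, plus the accumulated dispersion and jitter. The Python sketch below is a simplification. It omits the precision and age terms that ntpd also adds, the function name is illustrative, and the 1.5 s default for maxdist is taken from the ntpd documentation and should be checked against your version.

```python
def approx_root_distance_s(rootdelay_ms, delay_ms, rootdisp_ms, dispersion_ms, jitter_ms):
    """Approximate root synchronization distance in seconds (simplified)."""
    total_ms = (rootdelay_ms + delay_ms) / 2 + rootdisp_ms + dispersion_ms + jitter_ms
    return total_ms / 1000.0


# Values copied from the readvar output above (ntpq prints them in milliseconds).
distance = approx_root_distance_s(
    rootdelay_ms=0.000,
    delay_ms=0.516,
    rootdisp_ms=10684.280,
    dispersion_ms=0.987,
    jitter_ms=14.653,
)

DEFAULT_MAXDIST_S = 1.5   # assumed ntpd default; verify for your version
RAISED_MAXDIST_S = 30.0   # the value used with "tos maxdist 30"

print(f"approximate root distance: {distance:.3f} s")
print("selectable with default maxdist:", distance < DEFAULT_MAXDIST_S)
print("selectable with tos maxdist 30:", distance < RAISED_MAXDIST_S)
```

With these numbers the distance is about 10.7 s, almost all of it from rootdisp. That is well above the default threshold but below 30 s, which is consistent with the peer being rejected until maxdist is raised.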
Finally, these are the commands to add tos maxdist 30 to /etc/ntp.conf and restart the NTP service:
echo 'tos maxdist 30' >> /etc/ntp.conf
service ntp restart
And, voilà, the time is now successfully synchronized with your NTP server:
root@akulab1:~# ntpq -c as
ind assid status conf reach auth condition last_event cnt
===========================================================
1 60446 961a yes yes none sys.peer sys_peer 1
root@akulab1:~# ntpq -np
remote refid st t when poll reach delay offset jitter
==============================================================================
*172.16.0.25 .LOCL. 1 u 15 64 1 0.432 0.314 0.171
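The status column changed from 9014 before the fix to 961a after it. The Python sketch below decodes such a peer status word, assuming the bit layout described in the ntpq documentation (five flag bits, a three-bit selection code, a four-bit event count, and a four-bit event code). The function name is illustrative and the event table is deliberately partial.

```python
STATUS_FLAGS = [(0x80, "conf"), (0x40, "authenb"), (0x20, "auth"),
                (0x10, "reach"), (0x08, "bcst")]
SELECT = ["sel_reject", "sel_falsetick", "sel_excess", "sel_outlier",
          "sel_candidate", "sel_backup", "sel_sys.peer", "sel_pps.peer"]
# Partial table: only a few peer event codes are listed here.
EVENTS = {0x1: "mobilize", 0x3: "unreachable", 0x4: "reachable", 0xA: "sys_peer"}


def decode_peer_status(word_hex):
    """Split a peer status word into (flags, selection, event count, last event)."""
    word = int(word_hex, 16)
    high = word >> 8                      # flag bits and selection code
    flags = [name for bit, name in STATUS_FLAGS if high & bit]
    select = SELECT[high & 0x07]
    count = (word >> 4) & 0x0F
    code = word & 0x0F
    return flags, select, count, EVENTS.get(code, f"event_{code:x}")


for status in ("9014", "961a"):
    print(status, decode_peer_status(status))
```

The first word decodes to conf, reach, sel_reject with last event reachable. The second decodes to conf, reach, sel_sys.peer with last event sys_peer, which corresponds to the "*" now shown in front of the remote address.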