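I have a problem similar to "Chrony 3.1 refuses to sync with ntp server".
Scenario:
A freshly installed server running SLES15 SP2 is running chrony 3.2. I configured two pools of NTP servers running the official ntpd 4.2.8p15 (all on the intranet).
Problem:
Chrony "pulls" the servers from the pool, but it never gets a response from them, and I wonder why. Is it a chrony problem, an ntpd problem, or a problem with my setup?
Debugging:
(I am using a hacked version of tcpdump that improves NTP packet decoding.) A request from ntpd looks like this (it is actually a manycast request, captured remotely):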
10:22:29.373395 IP (tos 0xb8, ttl 4, id 21390, offset 0, flags [DF], proto UDP (17), length 100)
172.20.16.13.123 > 239.192.123.21.123: [udp sum ok] NTP leap indicator=0 (Nominal), Version=4, Mode=3 (Client), length=72
Stratum 2 (secondary reference), poll 6 (64s), precision -24
Root Delay: 0.000106, Root dispersion: 0.004196, Reference-ID: 0xac140219
Reference Timestamp: 3808714798.372973455 (2020-09-10T08:19:58.372973)
Originator Timestamp: 0.000000000
Receive Timestamp: 0.000000000
Transmit Timestamp: 3808714949.372178320 (2020-09-10T08:22:29.372178)
MAC: Key ID: 421, SHA1-Digest=48d73ad9 5b1d2401 9a8d3c02 91b849cb 28400475
In contrast, queries from chrony (captured locally) look like this:
08:52:33.338684 IP (tos 0x0, ttl 64, id 4141, offset 0, flags [DF], proto UDP (17), length 76)
h31.51625 > h03.ntp: [bad udp cksum 0x7894 -> 0xea6e!] NTPv4, length 48
Client, Leap indicator: (0), Stratum 0 (unspecified), poll 10 (1024s), precision 32
Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
Reference Timestamp: 0.000000000
Originator Timestamp: 0.000000000
Receive Timestamp: 0.000000000
Transmit Timestamp: 502153526.517788040 (2052/01/06 06:33:42)
Originator - Receive Timestamp: 0.000000000
Originator - Transmit Timestamp: 502153526.517788040 (2052/01/06 06:33:42)
10:12:22.173989 IP (tos 0x0, ttl 64, id 58250, offset 0, flags [DF], proto UDP (17), length 76)
h31.39573 > nm1.ntp: [bad udp cksum 0x6a92 -> 0x02d5!] NTP leap indicator=0 (Nominal), Version=4, Mode=3 (Client), length=48
Stratum 0 (unspecified), poll 9 (512s), precision 32
Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: 00000000
Reference Timestamp: 0.000000000
Originator Timestamp: 0.000000000
Receive Timestamp: 0.000000000
Transmit Timestamp: 1885145870.079837521 (2095-11-03T02:06:06.079838)
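At the very least the transmit timestamps look odd, and I don't know whether the other fields are valid.
The problem could be chrony's request packet, but it could also be some filtering on the servers that causes the request to be ignored. I have verified that the packets arrive at at least one pool server, but I don't see any response.
Actually, one server outside the pool (the one in the last packet shown) responds like this, preserving the odd originator timestamp: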
10:12:22.174191 IP (tos 0xb8, ttl 63, id 30184, offset 0, flags [DF], proto UDP (17), length 76)
nm1.ntp > h31.39573: [udp sum ok] NTP leap indicator=0 (Nominal), Version=4, Mode=4 (Server), length=48
Stratum 3 (secondary reference), poll 9 (512s), precision -23
Root Delay: 0.000518, Root dispersion: 0.025527, Reference-ID: 0xac141002
Reference Timestamp: 3808714309.712800696 (2020-09-10T08:11:49.712801)
Originator Timestamp: 1885145870.079837521 (2095-11-03T02:06:06.079838)
Receive Timestamp: 3808714342.174128206 (2020-09-10T08:12:22.174128)
Transmit Timestamp: 3808714342.174187417 (2020-09-10T08:12:22.174187)
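The transmit timestamps in the chrony requests decode to dates in 2052 and 2095 because the top bit of their seconds field is clear. The following is a minimal sketch, assuming the era pivot described in RFC 4330 (top bit clear means the era that starts in 2036). The function name is illustrative and not part of tcpdump, chrony, or ntpd. As the reply from nm1 shows, the server simply copies the client's transmit timestamp into the originator field, so an unusual value there does not by itself explain the missing replies.

```python
from datetime import datetime, timedelta, timezone

NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)
ERA_SECONDS = 2 ** 32  # one NTP era spans 2^32 seconds


def ntp_seconds_to_utc(seconds: int) -> datetime:
    """Convert the 32-bit seconds field of an NTP timestamp to UTC.

    Assumption: the RFC 4330 pivot. A value with the top bit set is
    taken to lie in era 0 (1968..2036); a value with the top bit clear
    is taken to lie in era 1 (2036..2104).
    """
    if not 0 <= seconds < ERA_SECONDS:
        raise ValueError("expected an unsigned 32-bit value")
    if seconds & 0x80000000 == 0:
        seconds += ERA_SECONDS
    return NTP_EPOCH + timedelta(seconds=seconds)


if __name__ == "__main__":
    # Seconds fields copied from the packet dumps above.
    for field in (3808714949, 1885145870, 502153526):
        print(field, ntp_seconds_to_utc(field).isoformat())
```

For 3808714949 this gives 2020-09-10T08:22:29 UTC, matching the ntpd capture, and for 1885145870 it gives 2095-11-03T02:06:06 UTC, matching the chrony capture.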
More debugging information
# chronyc -n
chrony version 3.2
Copyright (C) 1997-2003, 2007, 2009-2017 Richard P. Curnow and others
chrony comes with ABSOLUTELY NO WARRANTY. This is free software, and
you are welcome to redistribute it under certain conditions. See the
GNU General Public License version 2 for details.
chronyc> tracking
Reference ID : 00000000 ()
Stratum : 0
Ref time (UTC) : Thu Jan 01 00:00:00 1970
System time : 0.000000009 seconds slow of NTP time
Last offset : +0.000000000 seconds
RMS offset : 0.000000000 seconds
Frequency : 86.905 ppm slow
Residual freq : +0.000 ppm
Skew : 0.000 ppm
Root delay : 1.000000000 seconds
Root dispersion : 1.000000000 seconds
Update interval : 0.0 seconds
Leap status : Not synchronised
chronyc> sources
210 Number of sources = 8
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^? 172.20.16.3 0 10 0 - +0ns[ +0ns] +/- 0ns
^? 172.20.16.1 0 10 0 - +0ns[ +0ns] +/- 0ns
^? 172.20.16.13 0 10 0 - +0ns[ +0ns] +/- 0ns
^? 172.20.16.14 0 10 0 - +0ns[ +0ns] +/- 0ns
^? 172.20.16.5 0 10 0 - +0ns[ +0ns] +/- 0ns
^? 172.20.16.12 0 10 0 - +0ns[ +0ns] +/- 0ns
^? 172.20.16.11 0 10 0 - +0ns[ +0ns] +/- 0ns
^- 172.20.2.1 3 10 377 667 +16.2s[ +16.2s] +/- 36ms
chronyc> sourcestats
210 Number of sources = 8
Name/IP Address NP NR Span Frequency Freq Skew Offset Std Dev
==============================================================================
172.20.16.3 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.16.1 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.16.13 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.16.14 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.16.5 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.16.12 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.16.11 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.2.1 22 10 232m -0.650 0.003 +16.2s 17us
chronyc> activity
200 OK
8 sources online
0 sources offline
0 sources doing burst (return to online)
0 sources doing burst (return to offline)
0 sources with unknown address
chronyc> ntpdata
Remote address : [UNSPEC] (00000000)
Remote port : 0
Local address : [UNSPEC] (00000000)
Leap status : Normal
Version : 0
Mode : Invalid
Stratum : 0
Poll interval : 0 (1 seconds)
Precision : 0 (1.000000000 seconds)
Root delay : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID : 00000000 ()
Reference time : Thu Jan 01 00:00:00 1970
Offset : +0.000000000 seconds
Peer delay : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests : 000 000 0000
Interleaved : No
Authenticated : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX : 672
Total RX : 0
Total valid RX : 0
Remote address : [UNSPEC] (00000000)
Remote port : 0
Local address : [UNSPEC] (00000000)
Leap status : Normal
Version : 0
Mode : Invalid
Stratum : 0
Poll interval : 0 (1 seconds)
Precision : 0 (1.000000000 seconds)
Root delay : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID : 00000000 ()
Reference time : Thu Jan 01 00:00:00 1970
Offset : +0.000000000 seconds
Peer delay : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests : 000 000 0000
Interleaved : No
Authenticated : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX : 672
Total RX : 0
Total valid RX : 0
Remote address : [UNSPEC] (00000000)
Remote port : 0
Local address : [UNSPEC] (00000000)
Leap status : Normal
Version : 0
Mode : Invalid
Stratum : 0
Poll interval : 0 (1 seconds)
Precision : 0 (1.000000000 seconds)
Root delay : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID : 00000000 ()
Reference time : Thu Jan 01 00:00:00 1970
Offset : +0.000000000 seconds
Peer delay : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests : 000 000 0000
Interleaved : No
Authenticated : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX : 672
Total RX : 0
Total valid RX : 0
Remote address : [UNSPEC] (00000000)
Remote port : 0
Local address : [UNSPEC] (00000000)
Leap status : Normal
Version : 0
Mode : Invalid
Stratum : 0
Poll interval : 0 (1 seconds)
Precision : 0 (1.000000000 seconds)
Root delay : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID : 00000000 ()
Reference time : Thu Jan 01 00:00:00 1970
Offset : +0.000000000 seconds
Peer delay : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests : 000 000 0000
Interleaved : No
Authenticated : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX : 672
Total RX : 0
Total valid RX : 0
Remote address : [UNSPEC] (00000000)
Remote port : 0
Local address : [UNSPEC] (00000000)
Leap status : Normal
Version : 0
Mode : Invalid
Stratum : 0
Poll interval : 0 (1 seconds)
Precision : 0 (1.000000000 seconds)
Root delay : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID : 00000000 ()
Reference time : Thu Jan 01 00:00:00 1970
Offset : +0.000000000 seconds
Peer delay : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests : 000 000 0000
Interleaved : No
Authenticated : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX : 672
Total RX : 0
Total valid RX : 0
Remote address : [UNSPEC] (00000000)
Remote port : 0
Local address : [UNSPEC] (00000000)
Leap status : Normal
Version : 0
Mode : Invalid
Stratum : 0
Poll interval : 0 (1 seconds)
Precision : 0 (1.000000000 seconds)
Root delay : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID : 00000000 ()
Reference time : Thu Jan 01 00:00:00 1970
Offset : +0.000000000 seconds
Peer delay : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests : 000 000 0000
Interleaved : No
Authenticated : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX : 672
Total RX : 0
Total valid RX : 0
Remote address : [UNSPEC] (00000000)
Remote port : 0
Local address : [UNSPEC] (00000000)
Leap status : Normal
Version : 0
Mode : Invalid
Stratum : 0
Poll interval : 0 (1 seconds)
Precision : 0 (1.000000000 seconds)
Root delay : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID : 00000000 ()
Reference time : Thu Jan 01 00:00:00 1970
Offset : +0.000000000 seconds
Peer delay : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests : 000 000 0000
Interleaved : No
Authenticated : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX : 672
Total RX : 0
Total valid RX : 0
Remote address : 172.20.2.1 (AC140201)
Remote port : 123
Local address : 172.20.16.31 (AC14101F)
Leap status : Normal
Version : 4
Mode : Server
Stratum : 3
Poll interval : 10 (1024 seconds)
Precision : -23 (0.000000119 seconds)
Root delay : 0.000534 seconds
Root dispersion : 0.036041 seconds
Reference ID : AC141002 ()
Reference time : Thu Oct 08 08:20:28 2020
Offset : -16.152969360 seconds
Peer delay : 0.000214426 seconds
Peer dispersion : 0.000000195 seconds
Response time : 0.000017658 seconds
Jitter asymmetry: +0.23
NTP tests : 111 111 1111
Interleaved : No
Authenticated : No
TX timestamping : Daemon
RX timestamping : Daemon
Total TX : 1969
Total RX : 1969
Total valid RX : 1969
chronyc> clients
Hostname NTP Drop Int IntL Last Cmd Drop Int Last
===============================================================================
chronyc> serverstats
NTP packets received : 0
NTP packets dropped : 0
Command packets received : 81
Command packets dropped : 0
Client log records dropped : 0
chronyc> rtcdata
513 RTC driver not running
chronyc> quit
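The ntpdata output above is long, and the useful signal is buried in the Total TX and Total RX counters. As a rough sketch, the following parses text in that layout and lists the records that sent requests but never received anything. The sample is abbreviated from the output above; the function names are hypothetical helpers, not part of chronyc.

```python
# Abbreviated sample in the layout printed by "chronyc ntpdata" above.
SAMPLE = """\
Remote address : [UNSPEC] (00000000)
Total TX : 672
Total RX : 0
Total valid RX : 0
Remote address : 172.20.2.1 (AC140201)
Total TX : 1969
Total RX : 1969
Total valid RX : 1969
"""


def parse_ntpdata(text):
    """Split ntpdata-style text into one dict per 'Remote address' record."""
    records = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key : value" line (for example a prompt)
        key, value = key.strip(), value.strip()
        if key == "Remote address":
            records.append({})  # a new record starts here
        if records:
            records[-1][key] = value
    return records


def silent_sources(records):
    """Return records that transmitted requests but received no packets."""
    return [
        r for r in records
        if int(r.get("Total TX", 0)) > 0 and int(r.get("Total RX", 0)) == 0
    ]


if __name__ == "__main__":
    for record in silent_sources(parse_ntpdata(SAMPLE)):
        print(record["Remote address"], "TX", record["Total TX"], "RX", record["Total RX"])
```

A record with many transmissions and zero receptions suggests that requests are leaving the host but replies are not coming back, which points at the network path or at the server side rather than at the local clock.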
# journalctl -b SYSLOG_IDENTIFIER=chronyd
-- Logs begin at Wed 2020-09-30 13:32:17 CEST, end at Thu 2020-10-08 11:27:08 CEST. --
Sep 30 13:33:04 h31 chronyd[3522]: chronyd version 3.2 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP -SCFILTER +>
Sep 30 13:33:04 h31 chronyd[3522]: Enabled HW timestamping (TX only) on em3
Sep 30 13:33:04 h31 chronyd[3522]: Enabled HW timestamping (TX only) on em4
Sep 30 13:33:04 h31 chronyd[3522]: Frequency -86.905 +/- 0.107 ppm read from /var/lib/chrony/drift
Answer 1
I solved the problem: it really was a bad mask in a restrict directive for ntpd, which caused NTP time queries to go unanswered by all servers but one. In addition, I had set minsources 3 in /etc/chrony.conf.
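To illustrate how a wrong mask can silently exclude a client, here is a small sketch of the usual address-and-mask comparison: an entry applies when the client address ANDed with the mask equals the entry address ANDed with the mask. The two entries below are hypothetical, since the actual directive is not shown in this post; only the client address 172.20.16.31 is taken from the output in the question.

```python
import ipaddress


def restrict_matches(client: str, address: str, mask: str) -> bool:
    """Return True if client falls inside the range given by address and mask."""
    c = int(ipaddress.IPv4Address(client))
    a = int(ipaddress.IPv4Address(address))
    m = int(ipaddress.IPv4Address(mask))
    return (c & m) == (a & m)


CLIENT = "172.20.16.31"  # local address of h31 in the ntpdata output

if __name__ == "__main__":
    # Hypothetical entries, for illustration only.
    print(restrict_matches(CLIENT, "172.20.16.0", "255.255.255.0"))    # whole /24
    print(restrict_matches(CLIENT, "172.20.16.0", "255.255.255.240"))  # only .0 to .15
```

With the narrower mask the client no longer matches the entry meant for it, so whatever default restriction is configured would apply instead.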
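The interesting thing about this problem is how chronyd handles it (see "More debugging information" in the question):
OK, reach being 0 in the output of sources could indicate a bunch of different problems.
ntpdata outputs a lot of data when there actually is no data. One important clue I missed was Total RX being zero, as well as Total valid RX. But that could still have multiple causes.
serverstats indicating that NTP packets received is zero seems odd, as 172.20.2.1 obviously did send responses.
activity saying 8 sources online and 0 sources offline seems confusing: shouldn't a source that does not respond be considered "offline" rather than "online"?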
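For reading the Reach column: it is an 8-bit register printed in octal, with one bit per recent poll, so 377 means the last eight polls were all answered and 0 means none were. A minimal sketch of the decoding follows; the helper name is made up for this example.

```python
from typing import List


def decode_reach(reach: str) -> List[bool]:
    """Expand the octal Reach value into eight booleans, newest poll first."""
    register = int(reach, 8)
    if not 0 <= register <= 0xFF:
        raise ValueError("Reach is an 8-bit register")
    # Bit 0 holds the most recent poll; higher bits hold older polls.
    return [bool((register >> bit) & 1) for bit in range(8)]


if __name__ == "__main__":
    for value in ("0", "377"):
        answered = decode_reach(value)
        print(value, sum(answered), "of 8 recent polls answered")
```

Values between the two extremes show intermittent loss, which is a different situation from the all-zero register seen for the pool servers here.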
For comparison, here is the output after the problem was fixed (with three responding sources):
Oct 08 11:29:32 h31 systemd[1]: Starting NTP client/server...
Oct 08 11:29:32 h31 chronyd[18823]: chronyd version 3.2 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP -SCFILTER >
Oct 08 11:29:32 h31 chronyd[18823]: Enabled HW timestamping (TX only) on em3
Oct 08 11:29:32 h31 chronyd[18823]: Enabled HW timestamping (TX only) on em4
Oct 08 11:29:32 h31 chronyd[18823]: Frequency -86.905 +/- 0.107 ppm read from /var/lib/chrony/drift
Oct 08 11:29:32 h31 systemd[1]: Started NTP client/server.
Oct 09 08:09:43 h31 chronyd[18823]: Selected source 172.20.2.1
Oct 09 08:09:43 h31 chronyd[18823]: System clock wrong by -16.101294 seconds, adjustment started
Oct 09 08:09:27 h31 chronyd[18823]: System clock was stepped by -16.101294 seconds
Oct 09 08:11:36 h31 chronyd[18823]: Selected source 172.20.16.3
chronyc> tracking
Reference ID : AC141003 (172.20.16.3)
Stratum : 3
Ref time (UTC) : Fri Oct 09 06:21:18 2020
System time : 0.000007615 seconds fast of NTP time
Last offset : +0.000007168 seconds
RMS offset : 0.000022300 seconds
Frequency : 87.841 ppm slow
Residual freq : +0.002 ppm
Skew : 0.090 ppm
Root delay : 0.000269273 seconds
Root dispersion : 0.002195312 seconds
Update interval : 64.6 seconds
Leap status : Normal
chronyc> sources
210 Number of sources = 9
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^? 172.20.16.13 0 10 0 - +0ns[ +0ns] +/- 0ns
^? 172.20.16.1 0 10 0 - +0ns[ +0ns] +/- 0ns
^? 172.20.16.5 0 10 0 - +0ns[ +0ns] +/- 0ns
^? 172.20.16.12 0 10 0 - +0ns[ +0ns] +/- 0ns
^? 172.20.16.14 0 10 0 - +0ns[ +0ns] +/- 0ns
^? 172.20.16.11 0 10 0 - +0ns[ +0ns] +/- 0ns
^- 172.20.2.1 3 9 377 239 +15us[ +27us] +/- 27ms
^- 172.20.16.2 2 8 377 65 +208us[ +215us] +/- 8147us
^* 172.20.16.3 2 6 377 64 +27us[ +34us] +/- 4417us
chronyc> sourcestats
210 Number of sources = 9
Name/IP Address NP NR Span Frequency Freq Skew Offset Std Dev
==============================================================================
172.20.16.13 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.16.1 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.16.5 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.16.12 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.16.14 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.16.11 0 0 0 +0.000 2000.000 +0ns 4000ms
172.20.2.1 7 5 51m +0.254 0.070 +105us 23us
172.20.16.2 6 3 21m +0.219 0.218 +227us 27us
172.20.16.3 15 7 907 +0.002 0.074 +52ns 19us
chronyc> activity
200 OK
9 sources online
0 sources offline
0 sources doing burst (return to online)
0 sources doing burst (return to offline)
0 sources with unknown address
chronyc> ntpdata
...
Remote address : 172.20.2.1 (AC140201)
Remote port : 123
Local address : 172.20.16.31 (AC14101F)
Leap status : Normal
Version : 4
Mode : Server
Stratum : 3
Poll interval : 9 (512 seconds)
Precision : -23 (0.000000119 seconds)
Root delay : 0.000366 seconds
Root dispersion : 0.026947 seconds
Reference ID : AC14100E ()
Reference time : Fri Oct 09 06:11:14 2020
Offset : -0.000026963 seconds
Peer delay : 0.000219559 seconds
Peer dispersion : 0.000000190 seconds
Response time : 0.000020624 seconds
Jitter asymmetry: +0.20
NTP tests : 111 111 1111
Interleaved : No
Authenticated : No
TX timestamping : Daemon
RX timestamping : Daemon
Total TX : 297
Total RX : 296
Total valid RX : 296
Remote address : 172.20.16.2 (AC141002)
Remote port : 123
Local address : 172.20.16.31 (AC14101F)
Leap status : Normal
Version : 4
Mode : Server
Stratum : 2
Poll interval : 8 (256 seconds)
Precision : -23 (0.000000119 seconds)
Root delay : 0.000305 seconds
Root dispersion : 0.007904 seconds
Reference ID : AC140219 ()
Reference time : Fri Oct 09 06:14:48 2020
Offset : -0.000215189 seconds
Peer delay : 0.000180311 seconds
Peer dispersion : 0.000000190 seconds
Response time : 0.000057180 seconds
Jitter asymmetry: +0.50
NTP tests : 111 111 1111
Interleaved : No
Authenticated : Yes
TX timestamping : Daemon
RX timestamping : Daemon
Total TX : 466
Total RX : 453
Total valid RX : 453
Remote address : 172.20.16.3 (AC141003)
Remote port : 123
Local address : 172.20.16.31 (AC14101F)
Leap status : Normal
Version : 4
Mode : Server
Stratum : 2
Poll interval : 6 (64 seconds)
Precision : -24 (0.000000060 seconds)
Root delay : 0.000168 seconds
Root dispersion : 0.006165 seconds
Reference ID : AC140219 ()
Reference time : Fri Oct 09 06:18:14 2020
Offset : -0.000028130 seconds
Peer delay : 0.000198109 seconds
Peer dispersion : 0.000000131 seconds
Response time : 0.000038736 seconds
Jitter asymmetry: +0.00
NTP tests : 111 111 1111
Interleaved : No
Authenticated : No
TX timestamping : Daemon
RX timestamping : Daemon
Total TX : 16
Total RX : 16
Total valid RX : 16
chronyc> serverstats
NTP packets received : 0
NTP packets dropped : 0
Command packets received : 353
Command packets dropped : 0
Client log records dropped : 0
chronyc> rtcdata
513 RTC driver not running
There seem to be some bugs in chronyd or chronyc.
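One possible reading of the before and after outputs involves minsources 3: that directive makes chronyd wait until at least three sources are selectable before it updates the clock. The sketch below counts rows with a non-zero Reach in abbreviated copies of the two sources tables and compares the count with minsources. Treating every responding source as selectable is a simplification, and the helper name is hypothetical.

```python
# Rows abbreviated from the two "sources" outputs in this post.
BEFORE = """\
^? 172.20.16.3 0 10 0 - +0ns[ +0ns] +/- 0ns
^- 172.20.2.1 3 10 377 667 +16.2s[ +16.2s] +/- 36ms
"""
AFTER = """\
^? 172.20.16.13 0 10 0 - +0ns[ +0ns] +/- 0ns
^- 172.20.2.1 3 9 377 239 +15us[ +27us] +/- 27ms
^- 172.20.16.2 2 8 377 65 +208us[ +215us] +/- 8147us
^* 172.20.16.3 2 6 377 64 +27us[ +34us] +/- 4417us
"""

MINSOURCES = 3  # value set in /etc/chrony.conf according to the answer


def responding_sources(table):
    """Return the addresses of rows whose octal Reach value is non-zero."""
    names = []
    for line in table.splitlines():
        fields = line.split()
        if len(fields) < 5 or not fields[0].startswith("^"):
            continue  # skip headers, separators, and prompts
        if int(fields[4], 8) != 0:
            names.append(fields[1])
    return names


if __name__ == "__main__":
    for label, table in (("before", BEFORE), ("after", AFTER)):
        count = len(responding_sources(table))
        print(label, count, "responding, enough:", count >= MINSOURCES)
```

Under that assumption, a single responding source would not have been enough to update the clock before the fix, which would fit the uncorrected 16 second offset and the "Not synchronised" leap status in the first tracking output.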