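I installed a new NTP server and configured all Hadoop clients to use it. The Hadoop engineers tell me that the cluster needs a maximum offset of 3 seconds to work properly.
The NTP configuration is identical on all servers because it is installed and configured through Puppet. All servers use the same NTP server, and they all sit on the same network segment.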
When I run ntpq -p on all the servers, I see large differences between some of them, and the offsets look too high.
Example:
hadoop-dn01.company.com
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u   90  128  377    0.372   -9.163  24.699

hadoop-dn02.company.com
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u   19   64  377    0.367    6.632   6.050

hadoop-dn03.company.com
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u    2   64  377    0.330    1.191   8.421

hadoop-dn04.company.com
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u   40   64  377    0.367   11.323   8.563

hadoop-dn05.company.com
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u    9   64  377    0.329    7.353   7.845

hadoop-dn06.company.com
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u   56   64  377    0.317   -0.919   6.757

hadoop-dn07.company.com
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u   24   64  377    0.405  -12.100   9.447

hadoop-dn08.company.com
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u   62   64  377    1.539    3.186   8.965

hadoop-jn01.company.com
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u   12   64   37    0.446    5.457   3.623

hadoop-jn02.company.com
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u   50   64   17    0.679   -3.492   3.632

hadoop-nn01.company.com
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u   23   64   17    0.642    5.943   3.939

hadoop-nn02.company.com
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u    4   64   17    0.664    8.031   5.690
What could be causing the differences in offset, and how can I reduce them?
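
One thing worth checking when comparing these numbers with the 3-second limit is the unit: ntpq -p prints the delay, offset, and jitter columns in milliseconds. The following is only a minimal sketch of how the offset of the selected peer (the line marked with `*`) could be pulled out of the output and converted to seconds. The function names `selected_peer_offset_ms` and `within_limit` are hypothetical helpers made up for illustration, and the sample text is the hadoop-dn07 listing from above.

```python
MAX_OFFSET_SECONDS = 3.0  # limit quoted by the Hadoop engineers


def selected_peer_offset_ms(ntpq_output):
    """Return the offset in milliseconds of the peer marked '*' in
    `ntpq -p` output, or None if no peer is currently selected."""
    for line in ntpq_output.splitlines():
        if line.startswith("*"):
            fields = line.split()
            # Columns: remote refid st t when poll reach delay offset jitter
            return float(fields[8])
    return None


def within_limit(offset_ms, limit_s=MAX_OFFSET_SECONDS):
    """Compare an offset given in milliseconds against a limit in seconds."""
    return abs(offset_ms) / 1000.0 <= limit_s


sample = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*adnj12.domain.  10.31.0.12       3 u   24   64  377    0.405  -12.100   9.447
"""

offset_ms = selected_peer_offset_ms(sample)
print(f"offset: {offset_ms} ms = {abs(offset_ms) / 1000.0:.6f} s, "
      f"within limit: {within_limit(offset_ms)}")
```

On a live host the same function could be fed the text captured from ntpq -p instead of the pasted sample. The parsing assumes the default peers layout shown above, with exactly ten whitespace-separated columns.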