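I inherited an old BIND setup at work, and the zones on the master server are not syncing to the slave. I'm a newbie when it comes to BIND and really need help. I want every change made on the master to be replicated to the slave.
The servers can reach each other (ping and ssh work, and everything is fully open between them). The servers are rather old, and I'm not allowed to update them for fear of breaking something.
Ubuntu 12.04.5 LTS BIND 9.8.1-P1
Master = ns1..com. Slave = ns2..com.
The BIND servers are usable and run fine; they just don't replicate changes.
Apparently most changes were made through a GUI, but I don't have access to it.
The problem may have started when the master server's IP was changed; at least that is when it was noticed, but nobody knows for sure.
I have restarted the services, flushed the cache, and rebooted the servers. I checked the configuration, and from what I can see it should be correct. I tried rndc --retransfer, but it produced no output and did not work.
rndc status
gives the following output: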
version: 9.8.1-P1
CPUs found: 1
worker threads: 1
number of zones: 296
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 0/0/1000
tcp clients: 0/100
server is up and running
MASTER and SLAVE (identical configuration, only the key differs)
/etc/bind/named.conf
// This is the primary configuration file for the BIND DNS server named.
//
// Please read /usr/share/doc/bind9/README.Debian.gz for information on the
// structure of BIND configuration files in Debian, *BEFORE* you customize
// this configuration file.
//
// If you are just adding zones, please do that in /etc/bind/named.conf.local
include "/etc/bind/named.conf.options";
include "/etc/bind/named.conf.local";
include "/etc/bind/named.conf.default-zones";
key rndc-key {
algorithm hmac-md5;
secret "UHSoHPGEh+p5kIdoGzoX0A==";
};
controls {
inet 127.0.0.1 port 953 allow { 127.0.0.1; } keys { rndc-key; };
};
MASTER
/etc/bind/named.conf.options
options {
directory "/var/cache/bind";
// If there is a firewall between you and nameservers you want
// to talk to, you may need to fix the firewall to allow multiple
// ports to talk. See http://www.kb.cert.org/vuls/id/800113
// If your ISP provided one or more IP addresses for stable
// nameservers, you probably want to use them as forwarders.
// Uncomment the following block, and insert the addresses replacing
// the all-0's placeholder.
// forwarders {
// 0.0.0.0;
// };
//========================================================================
// If BIND logs error messages about the root key being expired,
// you will need to update your keys. See https://www.isc.org/bind-keys
//========================================================================
dnssec-validation auto;
auth-nxdomain yes;
listen-on-v6 { any; };
recursion no;
multiple-cnames yes;
fetch-glue yes;
check-names master fail;
check-names slave fail;
allow-transfer { localhost; <IP-OF-SLAVE>; };
notify yes;
dump-file "/";
also-notify {
};
};
SLAVE
/etc/bind/named.conf.options
options {
directory "/var/cache/bind";
// If there is a firewall between you and nameservers you want
// to talk to, you may need to fix the firewall to allow multiple
// ports to talk. See http://www.kb.cert.org/vuls/id/800113
// If your ISP provided one or more IP addresses for stable
// nameservers, you probably want to use them as forwarders.
// Uncomment the following block, and insert the addresses replacing
// the all-0's placeholder.
// forwarders {
// 0.0.0.0;
// };
//========================================================================
// If BIND logs error messages about the root key being expired,
// you will need to update your keys. See https://www.isc.org/bind-keys
//========================================================================
dnssec-validation auto;
auth-nxdomain yes;
listen-on-v6 { any; };
recursion no;
multiple-cnames yes;
fetch-glue yes;
allow-transfer { <MASTER IP>; };
//allow-transfer { ns1.<our-domain>.com; };
//also-notify {};
};
MASTER
/etc/bind/named.conf.local
//
// Do any local configuration here
//
// Consider adding the 1918 zones here, if they are not used in your
// organization
//include "/etc/bind/zones.rfc1918";
zone "domain.nu" {
type master;
file "/var/lib/bind/<DOMAIN>.nu.hosts";
allow-transfer {
<IP-OF-SLAVE>;
};
};
There are hundreds of zones here, all configured the same way.
SLAVE
/etc/bind/named.conf.local
zone "domain.nu" {
type slave;
masters {
<IP-MASTER>;
};
file "/var/lib/bind/domain.nu.hosts";
allow-transfer {
<IP-MASTER>;
};
};
MASTER
/etc/bind/named.conf.default-zones
// prime the server with knowledge of the root servers
zone "." {
type hint;
file "/etc/bind/db.root";
};
// be authoritative for the localhost forward and reverse zones, and for
// broadcast zones as per RFC 1912
zone "localhost" {
type master;
file "/etc/bind/db.local";
};
zone "127.in-addr.arpa" {
type master;
file "/etc/bind/db.127";
};
zone "0.in-addr.arpa" {
type master;
file "/etc/bind/db.0";
};
zone "255.in-addr.arpa" {
type master;
file "/etc/bind/db.255";
};
SLAVE
/etc/bind/named.conf.default-zones
// prime the server with knowledge of the root servers
zone "." {
type hint;
file "/etc/bind/db.root";
};
// be authoritative for the localhost forward and reverse zones, and for
// broadcast zones as per RFC 1912
zone "localhost" {
type master;
file "/etc/bind/db.local";
};
zone "127.in-addr.arpa" {
type master;
file "/etc/bind/db.127";
};
zone "0.in-addr.arpa" {
type master;
file "/etc/bind/db.0";
};
zone "255.in-addr.arpa" {
type master;
file "/etc/bind/db.255";
};
In addition to this configuration, the per-zone files are in /var/lib/bind/.hosts. They look a bit different depending on whether they are on the MASTER or on the SLAVE.
MASTER
/var/lib/bind/.hosts
$ttl 38400
domain.com. IN SOA ns1.<our domain>.com. admin.<our domain>.com.. (
1373899259
7200
3600
604800
38400 )
<domain.com>. IN NS ns1.<our domain>.com.
<domain.com>. IN NS ns2.<our domain>.com.
<domain.com>. IN A <customer ip>
www.<domain.com>. IN A <customer ip>
_autodiscover._tcp.domain.com. IN SRV 0 0 443 autodiscover.<our-domain>.com.
<domain.com>. IN MX 10 <mx-record>.com.
<domain.com>. IN MX 20 <mx-record>.net.
SLAVE
/var/lib/bind/some-domain.com.hosts
$ORIGIN .
$TTL 38400 ; 10 hours 40 minutes
domain.com IN SOA ns1.<our domain>.se. admin.<our domain>.com. (
1373899259 ; serial
7200 ; refresh (2 hours)
3600 ; retry (1 hour)
604800 ; expire (1 week)
38400 ; minimum (10 hours 40 minutes)
)
NS ns1.<our domain>.com.
NS ns2.<our domain>.com.
A 212.247.229.60
MX 10 <mx>.com.
MX 20 <mx>.net.
$ORIGIN <DOMAIN.COM>.
_autodiscover._tcp SRV 0 0 443 autodiscover.<our-domain>.com.
www A <customer ip>
EDIT:
I checked the logs. When I run
rndc reload
on the SLAVE, syslog fills up with the following for the different zones:
Jun 19 13:54:22 ns2 named[3558]: zone <domain.com>/IN: Transfer started.
Jun 19 13:54:22 ns2 named[3558]: transfer of '<domain.com>/IN' from <MASTER IP>#53: connected using <OTHER IP, maybe FW?>#41569
Jun 19 13:54:22 ns2 named[3558]: transfer of '<domain.com>/IN' from <MASTER IP>#53: failed while receiving responses: NOTAUTH
Jun 19 13:54:22 ns2 named[3558]: transfer of '<domain.com>/IN' from <MASTER IP>#53: Transfer completed: 0 messages, 0 records, 0 bytes, 0.001 secs (0 bytes/sec)
Jun 19 13:53:49 ns2 named[3558]: zone <DOMAIN.COM>/IN: refresh: unexpected rcode (REFUSED) from master <MASTER IP>#53 (source 0.0.0.0#0)
Jun 19 13:53:49 ns2 named[3558]: zone <DOMAIN.COM>/IN: Transfer started.
On the MASTER, syslog looks like this:
Jun 19 16:42:36 ns1 named[12833]: client <SLAVE IP>#15012: query (cache) '<domain.com>/SOA/IN' denied
Jun 19 16:42:36 ns1 named[12833]: client <SLAVE IP>#58925: zone transfer '<DOMAIN.COM>/AXFR/IN' denied
Jun 19 16:42:36 ns1 named[12833]: client <SLAVE IP>#56767: bad zone transfer request: '<DOMAIN.COM>/IN': non-authoritative zone (NOTAUTH)
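All of these log entries repeat for different domains.
Answer 1
I think the problem lies mainly outside BIND itself. To me, this is the most important line: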
Jun 19 13:54:22 ns2 named[3558]: transfer of '<domain.com>/IN' from <MASTER IP>#53: connected using <OTHER IP, maybe FW?>#41569
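Overall, communication seems to work (the slave can reach the master), but somehow not directly (for example, some NAT sits in between). As a result, the master sees the request coming from an IP other than the allowed one and correctly refuses the transfer. As a working solution for plain zone transfers (notifies may be a separate topic), I suggest using TSIG for the transfer. That way the request is handled correctly even when it arrives from an IP other than the slave's, because it is authorized by the Transaction SIGnature.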
To generate a TSIG key you can use, for example:
a=$(dnssec-keygen -a HMAC-MD5 -b 512 -n HOST transfer); sed "s/\([^ ]*\)\. IN KEY [0-9]* [0-9]* [0-9]* \([^ ]*\) \([^ ]*\)/key \1 {\n algorithm HMAC-MD5;\n secret \"\2\3\";\n};/" ${a}.key; rm ${a}*
Or, if you prefer a more readable form:
a=$(dnssec-keygen -a HMAC-MD5 -b 512 -n HOST transfer)
sed "s/\([^ ]*\)\. IN KEY [0-9]* [0-9]* [0-9]* \([^ ]*\) \([^ ]*\)/key \1 {\n algorithm HMAC-MD5;\n secret \"\2\3\";\n};/" ${a}.key
rm ${a}*
The result is text that can be copied into the BIND configuration:
key transfer {
algorithm HMAC-MD5;
secret "bv2uLjmxx2RA9DGTP697E17//s6xxt9DgjFxYpVv53qvsHdqG3Fy8IXva/OaEaHHHVuquh23mCIIQ2Gf3ojqzw==";
};
This block must be copied into both the master and the slave configuration, so that the key is known to both and identical ;-).
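If dnssec-keygen is inconvenient, a block of the same shape can be produced without temporary key files. The sketch below assumes that named only needs the secret to be a valid base64 string, and that head, base64 and /dev/urandom are available on the host. The function name make_tsig_key is made up for this example and is not part of BIND.

```shell
#!/bin/sh
# Print a named.conf "key" block with a fresh random secret.
# make_tsig_key is a hypothetical helper, not a BIND tool.
make_tsig_key() {
    name="${1:-transfer}"   # key name; must be identical on master and slave
    # 64 random bytes = 512 bits, base64-encoded onto a single line
    secret=$(head -c 64 /dev/urandom | base64 | tr -d '\n')
    printf 'key %s {\n    algorithm hmac-md5;\n    secret "%s";\n};\n' "$name" "$secret"
}

make_tsig_key transfer
```

Generate the block once and paste the same output on both servers; running it separately on each side would give two different secrets.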
Then, on the MASTER side, change the configuration from
allow-transfer {
<IP-OF-SLAVE>;
};
to
allow-transfer {
key transfer;
};
and on the SLAVE side from
masters {
<IP-MASTER>;
};
to
masters {
<IP-MASTER> key transfer;
};
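This way the slave contacts the master using the key, and the transfer is allowed on the basis of a valid TSIG even if the source IP changes. Permission to transfer is then granted not by the source IP of the request but by the TSIG key named "transfer".
The next step could be to investigate why the source IP changes, but by then the transfer will already be working ;-). Good luck!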
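To start that investigation, it may help to list which source addresses the slave actually connects from. This is a minimal sketch, assuming the slave's log lines have the same "connected using" format as the line quoted above. The sample addresses (documentation ranges 192.0.2.x and 198.51.100.x) and the function name xfer_source_ips are placeholders for illustration only.

```shell
#!/bin/sh
# List the distinct source addresses named reports for outgoing zone transfers.
# Reads syslog lines on stdin. xfer_source_ips is a hypothetical helper name.
xfer_source_ips() {
    sed -n "s/.*transfer of '[^']*' from [^ ]*: connected using \([^#]*\)#[0-9]*.*/\1/p" | sort -u
}

# Example with a made-up log line (documentation-range addresses):
printf '%s\n' \
  "Jun 19 13:54:22 ns2 named[3558]: transfer of 'example.com/IN' from 192.0.2.1#53: connected using 198.51.100.7#41569" \
  | xfer_source_ips
```

On the slave this could be fed from the real log, for example xfer_source_ips < /var/log/syslog (assuming named logs there). Any address it prints that is not the slave's own would be consistent with NAT or a firewall rewriting the source.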
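-- EDIT -- I added the semicolon I had forgotten in the key section. The error message on load would probably have made it obvious, but for the sake of completeness... :-)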
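Answer 2
The SOA records you posted from the slave and the master show the same serial number.
When master-slave replication "suddenly stops working", the typical newbie mistake is that the SOA serial number was not updated when the zone file on the master was modified.
If you do not increment the SOA serial on the master, the slave cannot detect that it is out of sync and will not request a zone transfer from the master name server.
After modifying a zone file, BIND on the master needs to reload it for any change to take effect, for example: rndc reload example.com
If it still does not work: @wurtel's comment gives some good pointers for further debugging.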
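One quick check is to compare the serial each server reports. The sketch below only parses the one-line format that dig +short SOA prints. The sample record and the helper names soa_serial and slave_is_behind are assumptions for illustration, and the comparison is a plain integer one that ignores serial-number wraparound.

```shell
#!/bin/sh
# Pull the serial (third field) out of one line of `dig +short SOA` output.
soa_serial() {
    awk 'NR == 1 { print $3 }'
}

# Succeed (exit 0) when the master's serial is newer than the slave's.
# Plain integer comparison; RFC 1982 wraparound is deliberately ignored.
slave_is_behind() {
    [ "$1" -gt "$2" ]
}

# Made-up sample in the format dig +short SOA prints:
sample="ns1.example.com. admin.example.com. 1373899259 7200 3600 604800 38400"
serial=$(printf '%s\n' "$sample" | soa_serial)
echo "serial: $serial"
```

Against live servers, the input would come from something like dig +short SOA domain.nu @<IP-MASTER> and the same query sent to the slave, run from a host that both servers answer.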
The SOA serial 1373899259 looks like a timestamp:
date --date=" 1970-01-01 00:00:00 UTC +1373899259 seconds"
Mon Jul 15 16:40:59 CEST 2013
and it should be increased by at least 1, to
1373899260, although arguably using the current timestamp would be "better":
date +%s
1560943969
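Both options can be folded into one rule: use the current Unix time, but never go lower than the old serial plus one. A small sketch of that rule follows; next_serial is a made-up helper, and its optional second argument exists only so the example is reproducible.

```shell
#!/bin/sh
# Print the next SOA serial: the current Unix time if that is newer,
# otherwise old serial + 1, so the value always increases.
# next_serial is a hypothetical helper; $2 overrides "now" for testing.
next_serial() {
    old="$1"
    now="${2:-$(date +%s)}"
    if [ "$now" -gt "$old" ]; then
        echo "$now"
    else
        echo $((old + 1))
    fi
}

next_serial 1373899259 1560943969   # prints 1560943969
next_serial 1560943969 1560943969   # prints 1560943970
```

The printed value goes into the SOA record in the zone file on the master, followed by a reload of that zone.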