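I have set up a mail server for testing. My goal is an HA mail server with IMAPS: clients connect to a virtual IP, which distributes them across two real servers, and if one real server crashes the other "takes over" the connection. I have built a cluster with two keepalived/haproxy load balancers (LBs) and two real servers running Postfix and Dovecot. The two LBs run Debian; the mail servers run Fedora 31. This is my configuration on the two LBs.
keepalived configuration file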
global_defs {
}
vrrp_instance VI_1 {
    interface nm-team
    state MASTER
    virtual_router_id 51
    priority 101 # 101 on master, 100 on backup
    advert_int 1
    smtp_alert
    authentication {
        auth_type PASS
        auth_pass mypass
    }
}
virtual_ipaddress {
    10.2.0.4/24 brd 10.2.0.255 dev nm-team
}
virtual_server 10.2.0.4 25 {
    delay_loop 30
    lb_algo rr
    lb_kind DR
    protocol TCP
    persistence_timeout 360
    real_server 10.2.0.5 25 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            connect_port 25
            delay_before_retry 3
        }
    }
    real_server 10.2.0.6 25 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            connect_port 25
            delay_before_retry 3
        }
    }
}
virtual_server 10.2.0.4 993 {
    delay_loop 30
    lb_algo rr
    lb_kind DR
    protocol TCP
    persistence_timeout 360
    real_server 10.2.0.5 993 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            connect_port 993
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 10.2.0.6 993 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            connect_port 993
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
haproxy configuration file
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private
    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    # An alternative list with additional directives can be obtained from
    # https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
    ssl-default-bind-options no-sslv3
defaults
    log global
    mode tcp
#postfix
listen smtp
    bind mail.mydomain.priv:25
    balance roundrobin
    timeout client 30s
    timeout connect 10s
    timeout server 1m
    no option http-server-close
    mode tcp
    option smtpchk
    option tcplog
    server mail1 mail1.mydomain.priv:25 send-proxy
    server mail2 mail2.mydomain.priv:25 send-proxy
#dovecot
listen imap
    bind mail.mydomain.priv:993
    timeout client 30s
    timeout connect 10s
    timeout server 1m
    no option http-server-close
    balance leastconn
    stick store-request src
    stick-table type ip size 200k expire 30m
    mode tcp
    option tcplog
    server mail1 mail1.mydomain.priv:993 send-proxy
    server mail2 mail2.mydomain.priv:993 send-proxy
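As you can see, mail.mydomain.priv is the "virtual" server bound to the virtual IP 10.2.0.4 (created by keepalived), and the real servers are 10.2.0.5 and 10.2.0.6. The virtual IP 10.2.0.4 is an alias on the lo interface, which I created on the LBs with this line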
ip addr add 10.2.0.4/32 dev lo label lo:0
and on the real servers with these
echo 1 >/proc/sys/net/ipv4/conf/all/arp_ignore
echo 2 >/proc/sys/net/ipv4/conf/all/arp_announce
ip addr add 10.2.0.4/32 dev lo label lo:0
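I have skipped posting the Dovecot/Postfix configuration because it is too long, but I have tested it: as a single server, using the 10.2.0.4 virtual IP, it works fine. The real servers share /var/vmail/mydomain over GlusterFS (slow, I know, but this is only for testing). With a client connected I can receive mail from Dovecot over IMAPS and send mail through Postfix over SMTP with STARTTLS, without any problem.

So what is the problem? I tested the cluster by shutting down one of the real servers while the client (Thunderbird) was open. The client "freezes" as if the cluster did not exist, and it cannot read mail. If I close the client or restart it, it reconnects to the 10.2.0.4 virtual IP (mail.mydomain.priv) without any problem. What is wrong? Is it possible to build an active/active HA cluster with keepalived and haproxy?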
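To make the freeze measurable without depending on Thunderbird, a small probe can be run from a client machine while one real server is shut down. The following is only a minimal sketch in Python; the function names `port_is_open`, `watch` and `longest_outage` are made up for illustration and are not part of any of the tools above. It only measures how long *new* TCP connections to the virtual IP fail. It does not show whether an already established IMAP session survives, since a TCP session to a server that has died cannot move to the other server and the client has to reconnect.

```python
import socket
import time


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def watch(host, port, rounds, interval=1.0):
    """Probe host:port `rounds` times and return (timestamp, is_open) pairs."""
    samples = []
    for _ in range(rounds):
        samples.append((time.time(), port_is_open(host, port)))
        time.sleep(interval)
    return samples


def longest_outage(samples):
    """Return the longest stretch, in seconds, covered by failed probes."""
    longest = 0.0
    down_since = None
    for stamp, is_open in samples:
        if not is_open and down_since is None:
            down_since = stamp          # first failed probe of a new outage
        elif is_open and down_since is not None:
            longest = max(longest, stamp - down_since)
            down_since = None           # service is back
    if down_since is not None and samples:
        # The outage was still ongoing when the probing stopped.
        longest = max(longest, samples[-1][0] - down_since)
    return longest
```

For the setup described here, calling `longest_outage(watch("10.2.0.4", 993, rounds=60))` while powering off one mail server would give a rough figure for how long the IMAPS port on the virtual IP was unreachable.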
Answer 1
Found the solution, thanks to help from a Unix forum: remove the virtual IP from lo:0 and create an nm-team:0 alias only on the haproxy/keepalived servers.
Then I edited haproxy.cfg
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private
    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    # An alternative list with additional directives can be obtained from
    # https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
    ssl-default-bind-options no-sslv3
defaults
    log global
    mode tcp
    option dontlognull
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 3000
frontend mail-in
    bind mail.mydomain.priv:25
    mode tcp
    option tcplog
    default_backend mail-in-back
backend mail-in-back
    balance roundrobin
    server mail1.mydomain.priv mail1.mydomain.priv:25 check
    server mail2.mydomain.priv mail2.mydomain.priv:25 check
frontend imaps-in
    bind mail.mydomain.priv:993
    mode tcp
    option tcplog
    default_backend imaps-in-back
backend imaps-in-back
    balance roundrobin
    server mail1.mydomain.priv mail1.mydomain.priv:993 check
    server mail2.mydomain.priv mail2.mydomain.priv:993 check
Then I edited keepalived.conf
vrrp_script chk_haproxy {
    script "killall -0 haproxy" # check the haproxy process
    interval 2 # every 2 seconds
    weight 2 # add 2 points if OK
}
vrrp_instance VI_1 {
    interface nm-team # interface to monitor
    state MASTER # MASTER on haproxy1, BACKUP on haproxy2
    virtual_router_id 51
    priority 100 # 100 on haproxy1, 101 on haproxy2
    advert_int 1
    smtp_alert
    authentication {
        auth_type PASS
        auth_pass yourpass
    }
    virtual_ipaddress {
        10.2.0.4 # virtual ip address
    }
    track_script {
        chk_haproxy
    }
}
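How `weight 2` interacts with the two priority values can be worked through with a little arithmetic. The sketch below is a simplified model in Python, not keepalived code: it assumes keepalived's documented behaviour that a tracked script with a positive weight adds that weight to the node's priority while the script succeeds, and that the node advertising the highest priority becomes master. The helper names `effective_priority` and `elected` are hypothetical, and the tie-break that real VRRP applies is not modelled.

```python
def effective_priority(base, weight, script_ok):
    """Priority a node advertises when a tracked script has a positive weight.

    Simplified model: the weight is added only while the script succeeds.
    """
    if weight <= 0:
        raise ValueError("this sketch only models positive weights")
    return base + weight if script_ok else base


def elected(priorities):
    """Return the node name with the highest effective priority.

    Ties are not modelled here; real VRRP breaks them differently.
    """
    return max(priorities, key=priorities.get)


# Values taken from the comments in the config: 100 on haproxy1, 101 on haproxy2.
both_up = {
    "haproxy1": effective_priority(100, 2, True),   # 102
    "haproxy2": effective_priority(101, 2, True),   # 103
}
haproxy2_process_dead = {
    "haproxy1": effective_priority(100, 2, True),   # 102
    "haproxy2": effective_priority(101, 2, False),  # 101
}
print(elected(both_up))                # haproxy2
print(elected(haproxy2_process_dead))  # haproxy1
```

Under this model the failover works because the weight (2) is larger than the gap between the two base priorities (1), so losing the haproxy process is enough to drop a node below its peer. It also suggests that, with these particular numbers, haproxy2 ends up holding the virtual IP while both nodes are healthy, since `state MASTER` only sets the initial state before an election takes place.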
Then I copied keepalived.conf to haproxy2 and adjusted a few entries (MASTER becomes BACKUP, priority 100 becomes 101). On the haproxy servers I keep this sysctl configuration
net.ipv4.tcp_syncookies=1
net.ipv4.ip_forward=1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.team0.send_redirects = 0
net.ipv4.conf.nm-team.send_redirects = 0
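After restarting keepalived and haproxy everything works. I tested with a client connected and shut down one mail server: after 5-10 seconds of inactivity the connection recovered, without restarting the MUA.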
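The 5-10 second pause is in the range one would expect from the `check` keyword on the backend servers. As a rough sketch only: the haproxy.cfg above does not set `inter` or `fall`, so the Python snippet below assumes HAProxy's documented defaults of a 2 second check interval and 3 consecutive failures before a server is marked down. The function name `detection_window` is made up for illustration, and the time each individual check needs to fail is ignored.

```python
def detection_window(inter_s=2.0, fall=3):
    """Return (best, worst) seconds until `fall` consecutive checks have failed.

    Best case: the server dies just before a check is due, so only
    (fall - 1) further intervals are needed.
    Worst case: it dies just after a successful check, so `fall` full
    intervals pass before the last failing check runs.
    """
    if inter_s <= 0 or fall < 1:
        raise ValueError("inter_s must be positive and fall at least 1")
    return ((fall - 1) * inter_s, fall * inter_s)


best, worst = detection_window()
print(best, worst)  # 4.0 6.0
```

With those assumed defaults the dead mail server would be taken out of rotation after roughly 4 to 6 seconds, plus however long the failing checks themselves take, which is consistent with the delay observed before the client reconnected.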