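I'm having trouble getting Apache to work in a corosync cluster.
I've gone through well over a hundred web pages and dozens of Google searches, but I still can't find an answer that matches my problem.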
root@hh1web03t ~# uname -a
Linux hh1web03t 3.10.0-693.17.1.el7.x86_64 #1 SMP Thu Jan 25 20:13:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
root@hh1web03t ~# more /etc/centos-release
CentOS Linux release 7.4.1708 (Core)
root@hh1web03t ~# yum list installed corosync httpd crmsh ldirectord
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirror.ratiokontakt.de
* epel: mirror.speedpartner.de
* extras: mirror.checkdomain.de
* updates: mirror.ratiokontakt.de
Installed Packages
corosync.x86_64 2.4.0-9.el7_4.2 @updates
crmsh.noarch 3.0.0-6.2 @network_ha-clustering_Stable
httpd.x86_64 2.4.6-67.el7.centos.6 @updates
ldirectord.x86_64 3.9.6-0rc1.1.2 @network_ha-clustering_Stable
We have 4 physical IPs and 10 VIPs. The crm status looks like this:
root@hh1web03t ~# crm status
Stack: corosync
Current DC: hh1web03t (version 1.1.16-12.el7_4.7-94ff4df) - partition with quorum
Last updated: Thu Mar 29 15:53:27 2018
Last change: Thu Mar 29 15:28:47 2018 by hacluster via crmd on hh1web01t
4 nodes configured
16 resources configured
Online: [ hh1web01t hh1web02t hh1web03t hh1web04t ]
Full list of resources:
pingd (ocf::pacemaker:ping): Started hh1web03t
Resource Group: gp_LVS
ldirectord (ocf::heartbeat:ldirectord): Started hh1web03t
vip_151 (ocf::heartbeat:IPaddr2): Started hh1web03t
vip_152 (ocf::heartbeat:IPaddr2): Started hh1web03t
vip_153 (ocf::heartbeat:IPaddr2): Started hh1web03t
vip_154 (ocf::heartbeat:IPaddr2): Started hh1web03t
vip_155 (ocf::heartbeat:IPaddr2): Started hh1web03t
vip_156 (ocf::heartbeat:IPaddr2): Started hh1web03t
vip_157 (ocf::heartbeat:IPaddr2): Started hh1web03t
vip_158 (ocf::heartbeat:IPaddr2): Started hh1web03t
vip_159 (ocf::heartbeat:IPaddr2): Started hh1web03t
vip_160 (ocf::heartbeat:IPaddr2): Started hh1web03t
Clone Set: cl_vip151 [vip_151_apache]
Started: [ hh1web01t hh1web02t hh1web03t hh1web04t ]
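But that only works with a plain "Listen 80" statement in httpd.conf. As soon as I set Listen to the VIP (Listen 10.49.4.151:80), httpd fails to start.
I know from another cluster that the VIPs should sit on the "lo" loopback interface of the standby nodes, but that is not the case here. So I assume the problem lies in my cluster configuration rather than in the Apache configuration.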
Active node:
root@hh1web03t ~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether 00:50:56:aa:1d:ca brd ff:ff:ff:ff:ff:ff
inet 10.49.4.103/24 brd 10.49.4.255 scope global ens192
valid_lft forever preferred_lft forever
inet 10.49.4.151/24 brd 10.49.4.255 scope global secondary ens192:151
valid_lft forever preferred_lft forever
inet 10.49.4.152/24 brd 10.49.4.255 scope global secondary ens192:152
valid_lft forever preferred_lft forever
inet 10.49.4.153/24 brd 10.49.4.255 scope global secondary ens192:153
valid_lft forever preferred_lft forever
inet 10.49.4.154/24 brd 10.49.4.255 scope global secondary ens192:154
valid_lft forever preferred_lft forever
inet 10.49.4.155/24 brd 10.49.4.255 scope global secondary ens192:155
valid_lft forever preferred_lft forever
inet 10.49.4.156/24 brd 10.49.4.255 scope global secondary ens192:156
valid_lft forever preferred_lft forever
inet 10.49.4.157/24 brd 10.49.4.255 scope global secondary ens192:157
valid_lft forever preferred_lft forever
inet 10.49.4.158/24 brd 10.49.4.255 scope global secondary ens192:158
valid_lft forever preferred_lft forever
inet 10.49.4.159/24 brd 10.49.4.255 scope global secondary ens192:159
valid_lft forever preferred_lft forever
inet 10.49.4.160/24 brd 10.49.4.255 scope global secondary ens192:160
valid_lft forever preferred_lft forever
Standby node:
root@hh1web04t ~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 100
link/ether 00:50:56:aa:0c:a9 brd ff:ff:ff:ff:ff:ff
inet 10.49.4.104/24 brd 10.49.4.255 scope global ens192
valid_lft forever preferred_lft forever
Here is what telnet and netstat show:
root@hh1web03t ~# telnet 10.49.4.151 80
Trying 10.49.4.151...
root@hh1web03t ~# telnet 10.49.4.101 80
Trying 10.49.4.101...
Connected to 10.49.4.101.
Escape character is '^]'.
^C
Connection closed by foreign host.
root@hh1web03t ~#
root@hh1web03t ~# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 3470/httpd
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1093/sshd
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 3470/httpd
tcp6 0 0 :::2224 :::* LISTEN 818/ruby
tcp6 0 0 :::22 :::* LISTEN 1093/sshd
tcp6 0 0 :::6556 :::* LISTEN 1098/xinetd
udp 0 0 10.49.4.160:123 0.0.0.0:* 847/ntpd
udp 0 0 10.49.4.159:123 0.0.0.0:* 847/ntpd
udp 0 0 10.49.4.158:123 0.0.0.0:* 847/ntpd
udp 0 0 10.49.4.157:123 0.0.0.0:* 847/ntpd
udp 0 0 10.49.4.156:123 0.0.0.0:* 847/ntpd
udp 0 0 10.49.4.155:123 0.0.0.0:* 847/ntpd
udp 0 0 10.49.4.154:123 0.0.0.0:* 847/ntpd
udp 0 0 10.49.4.153:123 0.0.0.0:* 847/ntpd
udp 0 0 10.49.4.152:123 0.0.0.0:* 847/ntpd
udp 0 0 10.49.4.151:123 0.0.0.0:* 847/ntpd
udp 0 0 10.49.4.103:123 0.0.0.0:* 847/ntpd
udp 0 0 127.0.0.1:123 0.0.0.0:* 847/ntpd
udp 0 0 0.0.0.0:123 0.0.0.0:* 847/ntpd
udp 0 0 10.49.4.103:39328 0.0.0.0:* 1165/corosync
udp 0 0 10.49.4.103:47882 0.0.0.0:* 1165/corosync
udp 0 0 10.49.4.103:58219 0.0.0.0:* 1165/corosync
udp 0 0 10.49.4.103:52173 0.0.0.0:* 1165/corosync
udp 0 0 10.49.4.103:5409 0.0.0.0:* 1165/corosync
udp6 0 0 :::123 :::* 847/ntpd
root@hh1web03t ~#
I can ssh to the VIP, which in this case takes me to host hh1web03t:
root@hh1web04t ~# ssh 10.49.4.151
Last login: Thu Mar 29 16:19:39 2018 from hh1web03t
root@hh1web03t ~#
Here is the relevant part of crm configure show; note that lvs_support is set to true:
primitive vip_151 IPaddr2 \
params ip=10.49.4.151 cidr_netmask=24 iflabel=151 nic=ens192 lvs_support=true \
meta target-role=Started \
op monitor interval=30s
primitive vip_151_apache apache \
params httpd="/usr/sbin/httpd" options="-d /etc/httpd" configfile="/etc/httpd/vip-151/httpd.conf" \
op monitor interval=30s
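I wanted to attach corosync.log, but that would exceed the 30000-character limit for questions. So here is the one line that records the failure (I can't say for sure that this is the problem):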
Mar 29 13:15:53 apache(vip_151_apache)[1897]: ERROR: AH00180: WARNING: MaxRequestWorkers of 256 exceeds ServerLimit value of 100 servers, decreasing MaxRequestWorkers to 100. To increase, please see the ServerLimit directive. (99)Cannot assign requested address: AH00072: make_sock: could not bind to address 10.49.4.151:80 no listening sockets available, shutting down AH00015: Unable to open logs
I think I've made a small mistake somewhere in the setup, but I just can't find it.
Any help would be greatly appreciated!
Thanks, everyone. Rudy
Answer 1
In a corosync cluster, Apache does not run on every host. You need to tie it to the VIP resource so that it runs only on the active host, for example:
pcs constraint colocation add vip_151_apache with vip_151 INFINITY
Or, if you only use crmsh:
crm configure colocation apache_on_151 INFINITY: vip_151_apache vip_151
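
The "(99)Cannot assign requested address" in the question's log line is the error bind() returns when the requested address is not configured on the node, which matches the standby node's "ip a" output above (no 10.49.4.151 on hh1web04t). As a rough sketch of how one might confirm this on each node, the helper below checks whether a VIP appears in the output of "ip -o -4 addr show". The function name vip_is_local and the constraint name apache_after_151 are made up for illustration and are not part of the original setup; the order constraint in the comments is an untested suggestion, and it assumes vip_151_apache is constrained as a plain primitive rather than through the clone cl_vip151.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post).
# Reads `ip -o -4 addr show` style output on stdin and succeeds (exit 0)
# only if the given IPv4 address is configured on some interface.
vip_is_local() {
    vip="$1"
    # In `ip -o -4 addr show` output, field 4 is ADDRESS/PREFIX,
    # e.g. "10.49.4.151/24"; compare only the address part.
    awk -v vip="$vip" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }
    '
}

# Usage on a cluster node:
#   ip -o -4 addr show | vip_is_local 10.49.4.151 \
#       && echo "VIP present, httpd can bind to it" \
#       || echo "VIP absent, Listen 10.49.4.151:80 will fail here"
#
# Assumption: besides the colocation above, a start order may also be wanted
# so that the VIP is up before Apache starts, e.g. with crmsh:
#   crm configure order apache_after_151 Mandatory: vip_151 vip_151_apache
```

Run on hh1web03t while it holds the VIPs, the check should succeed; on a standby node such as hh1web04t it should fail, which is where an Apache instance listening on the VIP cannot start.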