Corosync unicast problem across different networks

2024-5-28

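I'm working with a setup (on our staging system) that consists of 2 root servers and 1 failover IP. As software we use corosync and pacemaker. Corosync is configured to communicate via multicast on port 5405. --> Everything works fine.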

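Now I want to deploy this system on 2 root servers with a failover IP. However, multicast communication does not work there, because the root servers are not directly connected; they are connected through routers and sit in different data centers.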

So I changed corosync.conf to use the udpu (UDP unicast) transport, following the shipped example. I'm using corosync v1.4.1.

corosync.conf:

compatibility: whitetank

totem {
        version: 2
        secauth: off
        interface {
                member {
                        memberaddr: A.A.A.A
                }
                member {
                        memberaddr: B.B.B.B
                }
                ringnumber: 0
                bindnetaddr: A.A.A.A
                mcastport: 5405
        }
        transport: udpu
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

service {
    name: pacemaker
    ver: 1
}
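With udpu, every node has to list the same members, so it can help to compare what each node's file actually declares. The following is only a minimal sketch, assuming the flat `key: value` layout shown above; the helper name `totem_addresses` is made up for illustration and this is not a full corosync.conf parser.

```python
import re

def totem_addresses(conf_text):
    """Return (bindnetaddr, [memberaddr, ...]) found in corosync.conf-style text.

    Hypothetical helper: it only scans for the two keys and ignores nesting.
    """
    bind = re.search(r"^\s*bindnetaddr\s*:\s*(\S+)", conf_text, re.MULTILINE)
    members = re.findall(r"^\s*memberaddr\s*:\s*(\S+)", conf_text, re.MULTILINE)
    return (bind.group(1) if bind else None), members

if __name__ == "__main__":
    # Path taken from the log output in this post
    with open("/etc/corosync/corosync.conf") as fh:
        print(totem_addresses(fh.read()))
```

Running it on both servers and comparing the output shows at a glance whether the member lists match and which address each node tries to bind to.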

If I look at "netstat -nlp":

Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name   
tcp        0      0 0.0.0.0:47150               0.0.0.0:*                   LISTEN      1224/rpc.statd      
tcp        0      0 0.0.0.0:111                 0.0.0.0:*                   LISTEN      1206/rpcbind        
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      1314/sshd           
tcp        0      0 127.0.0.1:25                0.0.0.0:*                   LISTEN      1391/master         
tcp        0      0 :::111                      :::*                        LISTEN      1206/rpcbind        
tcp        0      0 :::22                       :::*                        LISTEN      1314/sshd           
tcp        0      0 :::45791                    :::*                        LISTEN      1224/rpc.statd      
udp        0      0 0.0.0.0:59806               0.0.0.0:*                               1769/corosync       
udp        0      0 0.0.0.0:39859               0.0.0.0:*                               1769/corosync       
udp        0      0 0.0.0.0:957                 0.0.0.0:*                               1206/rpcbind        
udp        0      0 0.0.0.0:56137               0.0.0.0:*                               1224/rpc.statd      
udp        0      0 0.0.0.0:976                 0.0.0.0:*                               1224/rpc.statd      
udp        0      0 0.0.0.0:111                 0.0.0.0:*                               1206/rpcbind        
udp        0      0 :::957                      :::*                                    1206/rpcbind        
udp        0      0 :::111                      :::*                                    1206/rpcbind        
udp        0      0 :::36209                    :::*                                    1224/rpc.statd

If I try to telnet to this port --> connection refused. I have also disabled SELinux, the firewall, etc. --> it still does not work.
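One caveat about the telnet test: telnet opens a TCP connection, while corosync talks over UDP, so "connection refused" from telnet does not say anything about the UDP port. A rough UDP check could look like the sketch below. The function name `udp_probe` and its return strings are invented for this example, and "no-reply" is inconclusive, since corosync is not expected to answer an arbitrary payload.

```python
import socket

def udp_probe(host, port, payload=b"ping", timeout=1.0):
    """Send one UDP datagram and classify what comes back.

    Returns "reply", "refused" (an ICMP port-unreachable was reported back,
    so nothing is listening there) or "no-reply" (silence, which is inconclusive).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # connect() on a UDP socket only fixes the peer address, so that
        # ICMP errors for this peer are reported on this socket.
        sock.connect((host, port))
        try:
            sock.send(payload)
            sock.recv(4096)
            return "reply"
        except socket.timeout:
            return "no-reply"
        except ConnectionRefusedError:
            return "refused"

if __name__ == "__main__":
    # B.B.B.B is the placeholder peer address from the config in this post
    print(udp_probe("B.B.B.B", 5405))
```

A "refused" result from the other server would suggest that nothing is bound to that UDP port on the peer; "no-reply" can mean either that packets are dropped on the way or that the daemon simply ignored the payload.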

/var/log/cluster/corosync.log:

May 08 13:54:42 corosync [MAIN  ] Corosync Cluster Engine exiting with status 0 at main.c:1858.
May 09 12:22:18 corosync [MAIN  ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
May 09 12:22:18 corosync [MAIN  ] Corosync built-in features: nss dbus rdma snmp
May 09 12:22:18 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
May 09 12:22:18 corosync [TOTEM ] Initializing transport (UDP/IP Unicast).
May 09 12:22:18 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 09 12:22:18 corosync [TOTEM ] bind token socket failed: Cannot assign requested address (99)
May 09 12:22:18 corosync [TOTEM ] The network interface [A.A.A.A] is now up.
Set r/w permissions for uid=0, gid=0 on /var/log/cluster/corosync.log
May 09 12:22:18 corosync [pcmk  ] info: process_ais_conf: Reading configure
May 09 12:22:18 corosync [pcmk  ] info: config_find_init: Local handle: 7739444317642555395 for logging
May 09 12:22:18 corosync [pcmk  ] info: config_find_next: Processing additional logging options...
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Found 'off' for option: debug
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Found 'yes' for option: to_logfile
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Found '/var/log/cluster/corosync.log' for option: logfile
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Found 'yes' for option: to_syslog
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility
May 09 12:22:18 corosync [pcmk  ] info: config_find_init: Local handle: 5650605097994944516 for quorum
May 09 12:22:18 corosync [pcmk  ] info: config_find_next: No additional configuration supplied for: quorum
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: No default for option: provider
May 09 12:22:18 corosync [pcmk  ] info: config_find_init: Local handle: 2730409743423111173 for service
May 09 12:22:18 corosync [pcmk  ] info: config_find_next: Processing additional service options...
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Found '1' for option: ver
May 09 12:22:18 corosync [pcmk  ] info: process_ais_conf: Enabling MCP mode: Use the Pacemaker init script to complete Pacemaker startup
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Defaulting to 'pcmk' for option: clustername
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Defaulting to 'no' for option: use_logd
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Defaulting to 'no' for option: use_mgmtd
May 09 12:22:18 corosync [pcmk  ] info: pcmk_startup: CRM: Initialized
May 09 12:22:18 corosync [pcmk  ] Logging: Initialized pcmk_startup
May 09 12:22:18 corosync [pcmk  ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
May 09 12:22:18 corosync [pcmk  ] info: pcmk_startup: Service: 10
May 09 12:22:18 corosync [pcmk  ] info: pcmk_startup: Local hostname: CentOS-60-64-minimal
May 09 12:22:18 corosync [pcmk  ] info: pcmk_update_nodeid: Local node id: 1632251470
May 09 12:22:18 corosync [pcmk  ] info: update_member: Creating entry for node 1632251470 born on 0
May 09 12:22:18 corosync [pcmk  ] info: update_member: 0x152edf0 Node 1632251470 now known as CentOS-60-64-minimal (was: (null))
May 09 12:22:18 corosync [pcmk  ] info: update_member: Node CentOS-60-64-minimal now has 1 quorum votes (was 0)
May 09 12:22:18 corosync [pcmk  ] info: update_member: Node 1632251470/CentOS-60-64-minimal is now: member
May 09 12:22:18 corosync [SERV  ] Service engine loaded: Pacemaker Cluster Manager 1.1.6
May 09 12:22:18 corosync [SERV  ] Service engine loaded: corosync extended virtual synchrony service
May 09 12:22:18 corosync [SERV  ] Service engine loaded: corosync configuration service
May 09 12:22:18 corosync [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
May 09 12:22:18 corosync [SERV  ] Service engine loaded: corosync cluster config database access v1.01
May 09 12:22:18 corosync [SERV  ] Service engine loaded: corosync profile loading service
May 09 12:22:18 corosync [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
May 09 12:22:18 corosync [MAIN  ] Compatibility mode set to whitetank.  Using V1 and V2 of the synchronization engine.
May 09 12:22:27 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
May 09 12:22:30 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
May 09 12:22:32 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
May 09 12:22:34 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
May 09 12:22:36 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
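The line "bind token socket failed: Cannot assign requested address (99)" stands out in this log. On Linux, errno 99 is EADDRNOTAVAIL, which bind() returns when the requested address is not configured on any local interface. Whether that applies here is an assumption, but it can be checked independently of corosync with a small sketch like this one (`can_bind_udp` is a made-up helper name):

```python
import errno
import socket

def can_bind_udp(addr, port=0):
    """Try to bind a UDP socket to addr; return (ok, errno_name_or_None).

    port=0 lets the kernel pick a free port, so the check does not
    collide with a running corosync instance.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((addr, port))
            return True, None
        except OSError as exc:
            return False, errno.errorcode.get(exc.errno, str(exc.errno))

if __name__ == "__main__":
    # Replace with the real value used for bindnetaddr on this node
    print(can_bind_udp("127.0.0.1"))
```

If the address from bindnetaddr yields `(False, 'EADDRNOTAVAIL')` on the node where corosync is started, the daemon is being asked to bind to an address that this machine does not currently own.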

If I try "crm status", I get "could not connect to the cluster".

What am I doing wrong? Can anyone help? Any hints?
