MariaDB Galera cluster fails to initialize on Debian Wheezy

2024-5-31

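I'm having trouble initializing a Galera cluster: the second node always fails to start, and the log contains no error messages. I currently have two nodes and will add a third later. Here is my setup:

Node 1: 192.168.0.21 db01
Node 2: 192.168.0.22 db02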

The /etc/hosts file on both nodes contains entries for these hostnames. My galera.cnf looks like this on both nodes:

[mysqld]
#mysql settings
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0
#galera settings
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="my_wsrep_cluster"
wsrep_cluster_address="gcomm://192.168.0.21,192.168.0.22"
wsrep_node_address="192.168.0.21" # 192.168.0.22 on db02
wsrep_node_name="db01" # db02 on db02 server
wsrep_sst_method=rsync
log-error=/var/log/mysql/error.log

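I can start the service on db01 without problems using this command: service mysql start --wsrep-new-cluster

But when I start db02 with service mysql start, I get a failure message, and the service is not listening on port 3306 on db02. Here is my log. It shows that db02 detects the cluster, detects db01, and determines that it needs to sync, but the sync never seems to start: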

161009 13:33:02 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
161009 13:33:02 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.7VTkrM' --pid-file='/var/lib/mysql/db02-recover.pid'
161009 13:33:02 [Note] /usr/sbin/mysqld (mysqld 5.5.52-MariaDB-1~wheezy-wsrep) starting as process 14839 ...
161009 13:33:04 mysqld_safe WSREP: Recovered position f89d319e-8e08-11e6-8b25-ebe3bbf9c45b:2
161009 13:33:04 [Note] WSREP: wsrep_start_position var submitted: 'f89d319e-8e08-11e6-8b25-ebe3bbf9c45b:2'
161009 13:33:04 [Note] /usr/sbin/mysqld (mysqld 5.5.52-MariaDB-1~wheezy-wsrep) starting as process 14890 ...
161009 13:33:04 [Note] WSREP: Read nil XID from storage engines, skipping position init
161009 13:33:04 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
161009 13:33:04 [Note] WSREP: wsrep_load(): Galera 25.3.17(r3619) by Codership Oy <info@codership.com> loaded successfully.
161009 13:33:04 [Note] WSREP: CRC-32C: using hardware acceleration.
161009 13:33:04 [Note] WSREP: Found saved state: f89d319e-8e08-11e6-8b25-ebe3bbf9c45b:-1
161009 13:33:04 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.0.22; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false;
161009 13:33:05 [Note] WSREP: Service thread queue flushed.
161009 13:33:05 [Note] WSREP: Assign initial position for certification: 2, protocol version: -1
161009 13:33:05 [Note] WSREP: wsrep_sst_grab()
161009 13:33:05 [Note] WSREP: Start replication
161009 13:33:05 [Note] WSREP: Setting initial position to f89d319e-8e08-11e6-8b25-ebe3bbf9c45b:2
161009 13:33:05 [Note] WSREP: protonet asio version 0
161009 13:33:05 [Note] WSREP: Using CRC-32C for message checksums.
161009 13:33:05 [Note] WSREP: backend: asio
161009 13:33:05 [Note] WSREP: gcomm thread scheduling priority set to other:0
161009 13:33:05 [Note] WSREP: restore pc from disk successfully
161009 13:33:05 [Note] WSREP: GMCast version 0
161009 13:33:05 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
161009 13:33:05 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
161009 13:33:05 [Note] WSREP: EVS version 0
161009 13:33:05 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '192.168.0.21:,192.168.0.22:'
161009 13:33:05 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') connection established to def7d829 tcp://192.168.0.22:4567
161009 13:33:05 [Warning] WSREP: (def7d829, 'tcp://0.0.0.0:4567') address 'tcp://192.168.0.22:4567' points to own listening address, blacklisting
161009 13:33:05 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') connection established to def7d829 tcp://192.168.0.22:4567
161009 13:33:05 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') connection established to 51e5ab0b tcp://192.168.0.21:4567
161009 13:33:05 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
161009 13:33:05 [Note] WSREP: declaring 51e5ab0b at tcp://192.168.0.21:4567 stable
161009 13:33:05 [Note] WSREP: re-bootstrapping prim from partitioned components
161009 13:33:05 [Note] WSREP: view(view_id(PRIM,51e5ab0b,19) memb {
        51e5ab0b,0
        def7d829,0
} joined {
} left {
} partitioned {
})
161009 13:33:05 [Note] WSREP: save pc into disk
161009 13:33:05 [Note] WSREP: clear restored view
161009 13:33:06 [Note] WSREP: gcomm: connected
161009 13:33:06 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
161009 13:33:06 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
161009 13:33:06 [Note] WSREP: Opened channel 'my_wsrep_cluster'
161009 13:33:06 [Note] WSREP: Waiting for SST to complete.
161009 13:33:06 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
161009 13:33:06 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
161009 13:33:06 [Note] WSREP: STATE EXCHANGE: sent state msg: 2528a615-8e14-11e6-9e93-ab2afde28393
161009 13:33:06 [Note] WSREP: STATE EXCHANGE: got state msg: 2528a615-8e14-11e6-9e93-ab2afde28393 from 0 (db01)
161009 13:33:06 [Note] WSREP: STATE EXCHANGE: got state msg: 2528a615-8e14-11e6-9e93-ab2afde28393 from 1 (db02)
161009 13:33:06 [Warning] WSREP: Quorum: No node with complete state:


        Version      : 4
        Flags        : 0x3
        Protocols    : 0 / 7 / 3
        State        : NON-PRIMARY
        Desync count : 0
        Prim state   : SYNCED
        Prim UUID    : e33a5f6b-8e12-11e6-9981-7b9a76958a99
        Prim  seqno  : 2
        First seqno  : -1
        Last  seqno  : 3
        Prim JOINED  : 1
        State UUID   : 2528a615-8e14-11e6-9e93-ab2afde28393
        Group UUID   : 20836ccf-8e06-11e6-adf3-5330826fa72d
        Name         : 'db01'
        Incoming addr: '192.168.0.21:3306'

        Version      : 4
        Flags        : 00
        Protocols    : 0 / 7 / 3
        State        : NON-PRIMARY
        Desync count : 0
        Prim state   : NON-PRIMARY
        Prim UUID    : 00000000-0000-0000-0000-000000000000
        Prim  seqno  : -1
        First seqno  : -1
        Last  seqno  : 2
        Prim JOINED  : 0
        State UUID   : 2528a615-8e14-11e6-9e93-ab2afde28393
        Group UUID   : f89d319e-8e08-11e6-8b25-ebe3bbf9c45b
        Name         : 'db02'
        Incoming addr: '192.168.0.22:3306'

161009 13:33:06 [Note] WSREP: Full re-merge of primary e33a5f6b-8e12-11e6-9981-7b9a76958a99 found: 1 of 1.
161009 13:33:06 [Note] WSREP: Quorum results:
        version    = 4,
        component  = PRIMARY,
        conf_id    = 2,
        members    = 1/2 (joined/total),
        act_id     = 3,
        last_appl. = -1,
        protocols  = 0/7/3 (gcs/repl/appl),
        group UUID = 20836ccf-8e06-11e6-adf3-5330826fa72d
161009 13:33:06 [Note] WSREP: Flow-control interval: [23, 23]
161009 13:33:06 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 3)
161009 13:33:06 [Note] WSREP: State transfer required:
        Group state: 20836ccf-8e06-11e6-adf3-5330826fa72d:3
        Local state: f89d319e-8e08-11e6-8b25-ebe3bbf9c45b:2
161009 13:33:06 [Note] WSREP: New cluster view: global state: 20836ccf-8e06-11e6-adf3-5330826fa72d:3, view# 3: Primary, number of nodes: 2, my index: 1, protocol version 3
161009 13:33:06 [Warning] WSREP: Gap in state sequence. Need state transfer.
161009 13:33:06 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '192.168.0.22' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '14890''
161009 13:33:08 [Note] WSREP: (def7d829, 'tcp://0.0.0.0:4567') turning message relay requesting off
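The "State transfer required" lines in this log are worth a closer look. Below is a minimal Python sketch that pulls the group and local state out of such an excerpt and compares them. The helper names (`parse_states`, `describe`) are made up for illustration, and the interpretation rests on the assumption that a differing UUID means the node's data comes from a different cluster history, so only a full state transfer (SST) can bring it in line.

```python
import re

# Matches lines such as "Group state: <uuid>:<seqno>" in a WSREP log.
STATE_RE = re.compile(r"(Group|Local) state:\s*([0-9a-f-]{36}):(-?\d+)")


def parse_states(log_text):
    """Return {'Group': (uuid, seqno), 'Local': (uuid, seqno)} from a log excerpt."""
    return {kind: (uuid, int(seqno)) for kind, uuid, seqno in STATE_RE.findall(log_text)}


def describe(states):
    """Give a rough, assumption-based reading of the two states."""
    group_uuid, group_seqno = states["Group"]
    local_uuid, local_seqno = states["Local"]
    if group_uuid != local_uuid:
        return "different cluster UUIDs: the node's history is unrelated to the group's"
    if local_seqno < group_seqno:
        return f"same cluster, node is {group_seqno - local_seqno} write-set(s) behind"
    return "node is up to date"


# Excerpt copied from the db02 log above.
excerpt = """
161009 13:33:06 [Note] WSREP: State transfer required:
        Group state: 20836ccf-8e06-11e6-adf3-5330826fa72d:3
        Local state: f89d319e-8e08-11e6-8b25-ebe3bbf9c45b:2
"""

states = parse_states(excerpt)
print(describe(states))
```

For this log the two UUIDs differ, which is consistent with db02 having been initialized separately from the cluster that db01 is now running.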

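All network communication seems fine. I have disabled the firewall for testing, and SELinux is not in use.

After trying to start node 2, I see the following connections on db01: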

tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN      7584/mysqld
tcp        0      0 0.0.0.0:4567            0.0.0.0:*               LISTEN      7584/mysqld
tcp        0      0 192.168.0.21:4567       192.168.0.22:53335      ESTABLISHED 7584/mysqld

I don't know whether an rsync service should already be listening on node 1; maybe that is why my second node can't sync with the cluster.
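One way to narrow this down is to test the relevant TCP ports directly. The sketch below is a generic reachability check in Python, not anything Galera-specific. Port 4567 is the group communication port seen in the log. Port 4444 is an assumption on my part: as far as I know it is the default port for the rsync SST method, and the rsync daemon is started on the joining node (db02 here) rather than on the donor, so nothing listening for rsync on node 1 would be expected.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example calls (run from db01 while db02 is attempting to start):
#   port_is_open("192.168.0.22", 4567)   # Galera group communication
#   port_is_open("192.168.0.22", 4444)   # assumed default rsync SST port on the joiner
```

If the second check fails while db02 is waiting for the state transfer, that would point at the joiner side (rsync not installed or not starting) rather than at db01.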

So what am I missing? Is there an error in my configuration?

PS: This is my first attempt at installing it. I followed the official guide: https://mariadb.org/installing-mariadb-galera-cluster-on-debian-ubuntu/

Answer 1

You first need to bootstrap the cluster on one of the nodes, using the following cluster option:

wsrep_cluster_address="gcomm://"

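So start one node with that configuration, then start the other node with the settings from the question (the full gcomm:// address list). Then shut down the first node and put the original configuration back into it.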
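To make the order of operations explicit, here is a small Python sketch that prints which wsrep_cluster_address each node would use at each step. The function name `cluster_address` and the `steps` list are illustrative only. The final "start again" on db01 is my assumption, since the answer stops at restoring the configuration.

```python
def cluster_address(peers, bootstrap=False):
    """Build the wsrep_cluster_address value.

    An empty gcomm:// makes the node form a new cluster;
    a peer list makes it join an existing one.
    """
    return "gcomm://" if bootstrap else "gcomm://" + ",".join(peers)


# Node addresses taken from the question.
PEERS = ["192.168.0.21", "192.168.0.22"]

steps = [
    ("db01", "start (bootstrap)", cluster_address(PEERS, bootstrap=True)),
    ("db02", "start (join)", cluster_address(PEERS)),
    # Assumed final step: stop db01, restore its config, start it again.
    ("db01", "stop, restore config, start", cluster_address(PEERS)),
]

for node, action, address in steps:
    print(f'{node}: {action} with wsrep_cluster_address="{address}"')
```

The point of the last step is that no node keeps an empty gcomm:// in its config file, since a node restarted with that value would form a new cluster instead of rejoining the existing one.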
