MariaDB 10.3.3 Galera cluster: adding a new node to an existing cluster # WSREP: Waiting for SST to complete

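I am trying to add a new Galera cluster node to an existing MariaDB Master-Master setup, but the node keeps waiting for SST to complete and the mariadb service fails to start. The service status output is as follows: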

2019-03-06  4:48:24 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 5522445)
2019-03-06  4:48:24 2 [Note] WSREP: State transfer required:
        Group state: 72542dfe-1f6a-11e9-8491-637f579e0d43:5522445
        Local state: 00000000-0000-0000-0000-000000000000:-1
2019-03-06  4:48:24 2 [Note] WSREP: New cluster view: global state: 72542dfe-1f6a-11e9-8491-637f579e0d43:5522445, view# 66: Primary, number of nodes: 3, my index: 0, protocol version 3
2019-03-06  4:48:24 2 [Warning] WSREP: Gap in state sequence. Need state transfer.
2019-03-06  4:48:24 2 [Note] WSREP: Setting wsrep_ready to 0
2019-03-06  4:48:24 0 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '<JOINER_ID>' --datadir '/app/mariadb10310/mysql/'   --parent '17682' --binlog '/app/mariadb10310/mysql/Galera_3_bin' --binlog-index '/app/mariadb10310/mysql/Galera_3_bin_log.index''
2019-03-06  4:48:24 2 [Note] WSREP: Prepared SST request: rsync|<JOINER_ID>:4444/rsync_sst
2019-03-06  4:48:24 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2019-03-06  4:48:24 2 [Note] WSREP: REPL Protocols: 9 (4, 2)
2019-03-06  4:48:24 2 [Note] WSREP: Assign initial position for certification: 5522445, protocol version: 4
2019-03-06  4:48:24 0 [Note] WSREP: Service thread queue flushed.
2019-03-06  4:48:24 2 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (72542dfe-1f6a-11e9-8491-637f579e0d43): 1 (Operation not permitted)
         at galera/src/replicator_str.cpp:prepare_for_IST():482. IST will be unavailable.
2019-03-06  4:48:24 0 [Note] WSREP: Member 0.0 (seadb01a-u1-inf) requested state transfer from '*any*'. Selected 1.0 (maridb1bo-w1-ap)(SYNCED) as donor.
2019-03-06  4:48:24 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 5522445)
2019-03-06  4:48:24 2 [Note] WSREP: Requesting state transfer: success, donor: 1
2019-03-06  4:48:24 2 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 72542dfe-1f6a-11e9-8491-637f579e0d43:5522445
2019-03-06  4:48:26 0 [Note] WSREP: (12ae32d9, 'tcp://0.0.0.0:4567') connection to peer 12ae32d9 with addr tcp://<JOINER_ID>:4567 timed out, no messages seen in PT3S
2019-03-06  4:48:27 0 [Note] WSREP: (12ae32d9, 'tcp://0.0.0.0:4567') turning message relay requesting off
2019-03-06  4:48:34 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 10.000000 secs.
2019-03-06  4:48:44 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 20.000000 secs.
2019-03-06  4:48:54 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 30.000000 secs.
2019-03-06  4:49:04 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 40.000000 secs.
2019-03-06  4:49:14 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 50.000000 secs.
2019-03-06  4:49:24 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 60.000000 secs.
2019-03-06  4:49:34 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 70.000000 secs.
2019-03-06  4:49:44 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 80.000000 secs.
Terminated
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 17745 (20190306 04:49:53.642)
2019-03-06  4:49:54 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 90.000000 secs.
WSREP_SST: [INFO] Joiner cleanup done. (20190306 04:49:54.149)
2019-03-06  4:50:04 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 100.000000 secs.
2019-03-06  4:50:14 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 110.000000 secs.
2019-03-06  4:50:24 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 120.000000 secs.
2019-03-06  4:50:34 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 130.000000 secs.
2019-03-06  4:50:44 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 140.000000 secs.
2019-03-06  4:50:54 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 150.000000 secs.
2019-03-06  4:51:04 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 160.000000 secs.
2019-03-06  4:51:14 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 170.000000 secs.
2019-03-06  4:51:24 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 180.000000 secs.
2019-03-06  4:51:34 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 190.000000 secs.
2019-03-06  4:51:44 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 200.000000 secs.
2019-03-06  4:51:54 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 210.000000 secs.
2019-03-06  4:52:04 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 220.000000 secs.
2019-03-06  4:52:14 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 230.000000 secs.
2019-03-06  4:52:24 0 [Note] WSREP: Waiting for SST to complete. current seqno: 5522141 waited 240.000000 secs.
2019-03-06  4:52:24 0 [ERROR] WSREP: Process completed with error: wsrep_sst_rsync --role 'joiner' --address '<JOINER_ID>' --datadir '/app/mariadb10310/mysql/'   --parent '17682' --binlog '/app/mariadb10310/mysql/Galera_3_bin' --binlog-index '/app/mariadb10310/mysql/Galera_3_bin_log.index': 3 (No such process)
2019-03-06  4:52:24 0 [ERROR] WSREP: Failed to read uuid:seqno and wsrep_gtid_domain_id from joiner script.
2019-03-06  4:52:24 0 [ERROR] WSREP: SST failed: 3 (No such process)
2019-03-06  4:52:24 0 [ERROR] Aborting

2019-03-06  4:52:24 0 [Warning] WSREP: 1.0 (maridb1bo-w1-ap): State transfer to 0.0 (seadb01a-u1-inf) failed: -255 (Unknown error 255)
2019-03-06  4:52:24 0 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():737: Will never receive state. Need to abort.

Answer 1

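Happy to share the solution after some brainstorming. The new node is in a remote location and was running into the SST problem, so I seeded it from a backup instead:

1. Take a backup from an existing node and restore it on the new Galera cluster node. Make sure grastate.dat is copied into the backup dump location, matching the copy on the existing node.
2. Once the backup is complete, make the necessary changes in the server.cnf file.
3. Start the mariadb service on the new node and monitor the log file.

Thanks.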
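Before starting the service in step 3, it may help to confirm that the restored grastate.dat carries the cluster's UUID rather than the all-zero UUID seen in the failing log ("Local state UUID ... does not match group state UUID ... IST will be unavailable"). The following is a minimal Python sketch of such a check, not part of the original answer. The helper names (parse_grastate, local_uuid_matches_group) and the sample file contents are assumptions made for illustration; the group UUID is the one printed in the log above. A matching UUID is only a precondition: the donor must still hold the missing write-sets for an incremental transfer to succeed.

```python
from typing import Dict

# All-zero UUID that Galera reports for a node with no saved state.
NIL_UUID = "00000000-0000-0000-0000-000000000000"

# Group state UUID taken from the log in the question.
GROUP_UUID = "72542dfe-1f6a-11e9-8491-637f579e0d43"


def parse_grastate(text: str) -> Dict[str, str]:
    """Parse the 'key: value' lines of a grastate.dat file into a dict.

    Hypothetical helper: comment lines (starting with '#') and blank
    lines are skipped.
    """
    state: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def local_uuid_matches_group(state: Dict[str, str], group_uuid: str) -> bool:
    """Return True when the saved local UUID equals the group UUID.

    Hypothetical helper: a missing uuid entry is treated like the
    all-zero UUID, which never matches a real cluster.
    """
    return state.get("uuid", NIL_UUID) == group_uuid


# Illustrative file contents only; real values come from the donor node.
sample = """# GALERA saved state
version: 2.1
uuid:    72542dfe-1f6a-11e9-8491-637f579e0d43
seqno:   -1
safe_to_bootstrap: 0
"""

state = parse_grastate(sample)
print("uuid matches group:", local_uuid_matches_group(state, GROUP_UUID))
print("saved seqno:", state.get("seqno"))
```

To run it against a real node, read the file from the data directory instead of the sample string, for example open("/app/mariadb10310/mysql/grastate.dat").read(), assuming the datadir shown in the log.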
