我根据文档设置了主从复制。我为此使用了 ansible(见下文)。
初始运行(锁定、备份、解锁、复制到从属主机、导入和启动从属)后,从属状态为
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
这告诉我:复制有效。
一段时间后(可能一分钟左右 - 服务器现在使用频率不高),这种情况会发生变化,SHOW SLAVE STATUS 显示
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Error: Could not execute Delete_rows_v1 event on table nextcloud.oc_calendarobjects; Can't find record in 'oc_calendarobjects', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master2-bin.000019, end_l
og_pos 19897439
Last_IO_Error:
Last_SQL_Error: Could not execute Delete_rows_v1 event on table nextcloud.oc_calendarobjects; Can't find record in 'oc_calendarobjects', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master2-bin.000019, end_l
og_pos 19897439
Slave_SQL_Running_State:
这告诉我:一旦主服务器发生变化,复制就会由于某些异常而停止。
我不知道这是否是 nextcloud 数据库所特有的,因为我看到这个问题总是发生在那里,但它也是唯一一个频繁变化的数据库。
我尝试过
CHANGE MASTER TO ... MASTER_USE_GTID=no;
也
CHANGE MASTER TO ... MASTER_USE_GTID=slave_pos;
并设置--gtid
。mysqldump
行为相同。
我在这里遗漏了什么?如何才能使复制真正起作用(无需在主服务器上发生更改后重新运行所有内容,这与手动触发相同)?
Ansible 设置/执行的命令
我正在使用 ansible 设置主从复制,在所有主机上运行,但分别仅在主服务器或从服务器上执行任务。执行的 Shell 命令在最后。
我从文档中了解到,使用转储后--master-data
会mysqldump
自动刷新、锁定和解锁。
- name: Get current timestamp
run_once: true
delegate_to: localhost
ansible.builtin.command: date +%F_%T
register: datetime_run
changed_when: true
- name: Set current timestamp
ansible.builtin.set_fact:
datetime: "{{ datetime_run.stdout }}"
- name: Backup all databases for replication
when: inventory_hostname == mariadb_master_host
community.mysql.mysql_db:
login_user: admin
login_password: "{{ passwd }}"
config_file: ""
state: dump
master_data: 1
dump_extra_args: "--apply-slave-statements"
# --gtid
name:
- db1
- db2
- nextcloud
target: "/tmp/mariadb.dump.{{ datetime }}.sql.gz"
pipefail: true
- name: Copy DB dump from master
when: inventory_hostname != mariadb_master_host
ansible.builtin.synchronize:
src: "/tmp/mariadb.dump.{{ datetime }}.sql.gz"
dest: /tmp
delegate_to: "{{ mariadb_master_host }}"
- name: Stop slave
when: inventory_hostname != mariadb_master_host
community.mysql.mysql_query:
login_user: admin
login_password: "{{ passwd.stdout }}"
query: STOP SLAVE;
- name: Configure Slave
when: inventory_hostname != mariadb_master_host
# no_log: true
community.mysql.mysql_query:
login_user: admin
login_password: "{{ passwd.stdout }}"
query: |
CHANGE MASTER TO
MASTER_HOST='{{ mariadb_master_host }}',
MASTER_USER='replication',
MASTER_PASSWORD='{{ db_replication_password }}',
MASTER_PORT=3306,
MASTER_CONNECT_RETRY=10,
MASTER_USE_GTID=no;
# =slave_pos
- name: Import MariaDB Dump to start replication
when: inventory_hostname != mariadb_master_host
community.mysql.mysql_db:
login_user: admin
login_password: "{{ passwd.stdout }}"
state: import
target: "/tmp/mariadb.dump.{{ datetime }}.sql.gz"
name: all
config_file: ""
pipefail: true
use_shell: true
- name: Show SLAVE Status
when: inventory_hostname != mariadb_master_host
community.mysql.mysql_info:
login_user: admin
login_password: "{{ passwd.stdout }}"
config_file: ""
filter: "slave_status"
register: out
failed_when: out.slave_status[mariadb_master_host]['3306'].replication['Slave_IO_Running'] != 'Yes' or out.slave_status[mariadb_master_host]['3306'].replication['Slave_SQL_Running'] != 'Yes'
实际执行的mysql命令是
set -o pipefail && /usr/bin/mysqldump --user=admin --password='********' --host=localhost --port=3306 --databases db1 db2 nextcloud --quick --master-data=1 --apply-slave-statements | /usr/bin/gzip > /tmp/mariadb.dump.2023-10-17_18:10:11.sql.gz
和
/usr/bin/gzip -dc /tmp/mariadb.dump.2023-10-17_18:10:11.sql.gz | /usr/bin/mysql --user=admin --password='********' --host=localhost --port=3306
完整状态输出
掌握
File: master2-bin.000019
Position: 66117642
Binlog_Do_DB: db1,db2,nextcloud
Binlog_Ignore_DB:
奴隶
Slave_IO_State: Waiting for master to send event
Master_Host: my-master-host
Master_User: replication
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: master2-bin.000019
Read_Master_Log_Pos: 66117642
Relay_Log_File: mysqld-relay-bin.000002
Relay_Log_Pos: 2068159
Relay_Master_Log_File: master2-bin.000019
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB: db1,db2,nextcloud
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Delete_rows_v1 event on table nextcloud.oc_calendarobjects; Can't find record in 'oc_calendarobjects', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master2-bin.000019, end_log_pos 62885990
Skip_Counter: 0
Exec_Master_Log_Pos: 62877650
Relay_Log_Space: 4117044
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows_v1 event on table nextcloud.oc_calendarobjects; Can't find record in 'oc_calendarobjects', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master2-bin.000019, end_log_pos 62885990
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: No
Gtid_IO_Pos:
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: optimistic
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Slave_DDL_Groups: 0
Slave_Non_Transactional_Groups: 0
Slave_Transactional_Groups: 670
答案1
在从 binlog 执行该命令之前,您的数据库不同步。重复初始设置。
请确保这是作为主/从模式运行,而不是主/主模式 - 并且从属模式应设置为只读模式。