Ansible 设置/执行的命令

2024-6-2 • tag-icon

我根据文档设置了主从复制。我为此使用了 ansible（见下文）。

初始运行（锁定、备份、解锁、复制到从属主机、导入和启动从属）后，从属状态为

Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates

这告诉我：复制有效。

一段时间后（可能一分钟左右 - 服务器现在使用频率不高），这种情况会发生变化，SHOW SLAVE STATUS 显示

Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Error: Could not execute Delete_rows_v1 event on table nextcloud.oc_calendarobjects; Can't find record in 'oc_calendarobjects', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master2-bin.000019, end_l
og_pos 19897439
Last_IO_Error:
Last_SQL_Error: Could not execute Delete_rows_v1 event on table nextcloud.oc_calendarobjects; Can't find record in 'oc_calendarobjects', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master2-bin.000019, end_l
og_pos 19897439
Slave_SQL_Running_State:

这告诉我：一旦主服务器发生变化，复制就会由于某些异常而停止。

我不知道这是否是 nextcloud 数据库所特有的，因为我看到这个问题总是发生在那里，但它也是唯一一个频繁变化的数据库。

我尝试过

CHANGE MASTER TO ... MASTER_USE_GTID=no;

也

CHANGE MASTER TO ... MASTER_USE_GTID=slave_pos;

并设置--gtid。mysqldump行为相同。

我在这里遗漏了什么？如何才能使复制真正起作用（无需在主服务器上发生更改后重新运行所有内容，这与手动触发相同）？

Ansible 设置/执行的命令

我正在使用 ansible 设置主从复制，在所有主机上运行，但分别仅在主服务器或从服务器上执行任务。执行的 Shell 命令在最后。

我从文档中了解到，使用转储后--master-data会mysqldump自动刷新、锁定和解锁。

- name: Get current timestamp
  run_once: true
  delegate_to: localhost
  ansible.builtin.command: date +%F_%T
  register: datetime_run
  changed_when: true
- name: Set current timestamp
  ansible.builtin.set_fact:
    datetime: "{{ datetime_run.stdout }}"

- name: Backup all databases for replication
  when: inventory_hostname == mariadb_master_host
  community.mysql.mysql_db:
    login_user: admin
    login_password: "{{ passwd }}"
    config_file: ""
    state: dump
    master_data: 1
    dump_extra_args: "--apply-slave-statements"
    # --gtid
    name:
      - db1
      - db2
      - nextcloud
    target: "/tmp/mariadb.dump.{{ datetime }}.sql.gz"
    pipefail: true

- name: Copy DB dump from master
  when: inventory_hostname != mariadb_master_host
  ansible.builtin.synchronize:
    src: "/tmp/mariadb.dump.{{ datetime }}.sql.gz"
    dest: /tmp
  delegate_to: "{{ mariadb_master_host }}"

- name: Stop slave
  when: inventory_hostname != mariadb_master_host
  community.mysql.mysql_query:
    login_user: admin
    login_password: "{{ passwd.stdout }}"
    query: STOP SLAVE;

- name: Configure Slave
  when: inventory_hostname != mariadb_master_host
  # no_log: true
  community.mysql.mysql_query:
    login_user: admin
    login_password: "{{ passwd.stdout }}"
    query: |
      CHANGE MASTER TO
        MASTER_HOST='{{ mariadb_master_host }}',
        MASTER_USER='replication',
        MASTER_PASSWORD='{{ db_replication_password }}',
        MASTER_PORT=3306,
        MASTER_CONNECT_RETRY=10,
        MASTER_USE_GTID=no;
    # =slave_pos

- name: Import MariaDB Dump to start replication
  when: inventory_hostname != mariadb_master_host
  community.mysql.mysql_db:
    login_user: admin
    login_password: "{{ passwd.stdout }}"
    state: import
    target: "/tmp/mariadb.dump.{{ datetime }}.sql.gz"
    name: all
    config_file: ""
    pipefail: true
    use_shell: true

- name: Show SLAVE Status
  when: inventory_hostname != mariadb_master_host
  community.mysql.mysql_info:
    login_user: admin
    login_password: "{{ passwd.stdout }}"
    config_file: ""
    filter: "slave_status"
  register: out
  failed_when: out.slave_status[mariadb_master_host]['3306'].replication['Slave_IO_Running'] != 'Yes' or out.slave_status[mariadb_master_host]['3306'].replication['Slave_SQL_Running'] != 'Yes'

实际执行的mysql命令是

set -o pipefail && /usr/bin/mysqldump --user=admin --password='********' --host=localhost --port=3306 --databases db1 db2 nextcloud --quick --master-data=1 --apply-slave-statements | /usr/bin/gzip > /tmp/mariadb.dump.2023-10-17_18:10:11.sql.gz

和

/usr/bin/gzip -dc /tmp/mariadb.dump.2023-10-17_18:10:11.sql.gz | /usr/bin/mysql --user=admin --password='********' --host=localhost --port=3306

完整状态输出

掌握

            File: master2-bin.000019
        Position: 66117642
    Binlog_Do_DB: db1,db2,nextcloud
Binlog_Ignore_DB:

奴隶

                Slave_IO_State: Waiting for master to send event
                   Master_Host: my-master-host
                   Master_User: replication
                   Master_Port: 3306
                 Connect_Retry: 10
               Master_Log_File: master2-bin.000019
           Read_Master_Log_Pos: 66117642
                Relay_Log_File: mysqld-relay-bin.000002
                 Relay_Log_Pos: 2068159
         Relay_Master_Log_File: master2-bin.000019
              Slave_IO_Running: Yes
             Slave_SQL_Running: No
               Replicate_Do_DB: db1,db2,nextcloud
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 1032
                    Last_Error: Could not execute Delete_rows_v1 event on table nextcloud.oc_calendarobjects; Can't find record in 'oc_calendarobjects', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master2-bin.000019, end_log_pos 62885990
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 62877650
               Relay_Log_Space: 4117044
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: NULL
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 1032
                Last_SQL_Error: Could not execute Delete_rows_v1 event on table nextcloud.oc_calendarobjects; Can't find record in 'oc_calendarobjects', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log master2-bin.000019, end_log_pos 62885990
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 2
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: No
                   Gtid_IO_Pos:
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State:
              Slave_DDL_Groups: 0
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 670

答案1

在从 binlog 执行该命令之前，您的数据库不同步。重复初始设置。

请确保这是作为主/从模式运行，而不是主/主模式 - 并且从属模式应设置为只读模式。

Ansible 设置/执行的命令

完整状态输出

掌握

奴隶

答案1

相关内容