MySQL主从复制过程偶发崩溃

MySQL主从复制过程偶发崩溃

我想知道是否有人经历过这种情况并且可以对这个问题提供一些见解。

我们有一个计划 - 普通的 MySQL 主从复制设置。表是 MyISAM,主服务器可以进行相当多的读/写活动。我们使用从属实例执行完整的每日备份,以避免主服务器宕机。备份过程执行以下操作:

STOP SLAVE SQL_THREAD 
mysqlhotcopy all tables 
START SLAVE SQL_THREAD

每隔一段时间(大约每月一次),复制就会中断,并出现各种错误消息,表明查询或日志文件损坏。这是昨晚发生的一次:

mysql> show slave status \G
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: server8
                Master_User: nexus8
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: bin.000045
        Read_Master_Log_Pos: 581644327
             Relay_Log_File: relay.000086
              Relay_Log_Pos: 94131
      Relay_Master_Log_File: bin.000045
           Slave_IO_Running: Yes
          Slave_SQL_Running: No
            Replicate_Do_DB:
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 1064
                 Last_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '138070603'£' at line 1' on query. Default database: 'wtsdb'. Query: 'UPDATE fill SET clearing_fee='0.0E id='138070603'£'
               Skip_Counter: 0
        Exec_Master_Log_Pos: 4164743
            Relay_Log_Space: 577574251
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: NULL

我按照以下步骤从上述错误中恢复并恢复复制:

stop slave;
change master to MASTER_LOG_POS = 4164743, MASTER_LOG_FILE = 'bin.000045';
start slave;

我们以这种方式设置了多个服务器,但它们都会偶尔因类似错误而停止复制。如果您能提供任何有关如何解决此问题的建议,我们将不胜感激。

更新:

阅读pQd 推荐的优秀文章(特别是第 3.5 节),我很可能遇到了中继日志损坏,这可能会“使语句无法解析”。显然,这种情况可能会发生,但随后会表明 MySQL 复制代码中存在错误。我无法真正重现此问题,但在我们的生产环境中,每个从属服务器每月至少会发生一次。

这里还有两个错误供参考:

*************************** 1. row *************************** 
             Slave_IO_State: Waiting for master to send event 
                Master_Host: server10
                Master_User: nexus10 
                Master_Port: 3306 
              Connect_Retry: 60 
            Master_Log_File: bin.000008 
        Read_Master_Log_Pos: 1034654844 
             Relay_Log_File: relay.001392 
              Relay_Log_Pos: 109929 
      Relay_Master_Log_File: bin.000008 
           Slave_IO_Running: Yes 
          Slave_SQL_Running: No 
            Replicate_Do_DB: 
        Replicate_Ignore_DB: 
         Replicate_Do_Table: 
     Replicate_Ignore_Table: 
    Replicate_Wild_Do_Table: 
Replicate_Wild_Ignore_Table: 
                 Last_Errno: 1064 
                 Last_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server versio n for the right syntax to use near '' at line 1' on query. Default database: 'avidb'. Query: 'DELETE FROM reconciliat' 
               Skip_Counter: 0 
        Exec_Master_Log_Pos: 1026067551 
            Relay_Log_Space: 8697758 
            Until_Condition: None 
             Until_Log_File: 
              Until_Log_Pos: 0 
         Master_SSL_Allowed: No 
         Master_SSL_CA_File: 
         Master_SSL_CA_Path: 
            Master_SSL_Cert: 
          Master_SSL_Cipher: 
             Master_SSL_Key: 
      Seconds_Behind_Master: NULL 



*************************** 1. row *************************** 
             Slave_IO_State: Waiting for master to send event 
                Master_Host: server8
                Master_User: nexus8 
                Master_Port: 3306 
              Connect_Retry: 60 
            Master_Log_File: bin.000019 
        Read_Master_Log_Pos: 476189428 
             Relay_Log_File: relay.000163 
              Relay_Log_Pos: 98576 
      Relay_Master_Log_File: bin.000019 
           Slave_IO_Running: Yes 
          Slave_SQL_Running: No 
            Replicate_Do_DB: 
        Replicate_Ignore_DB: 
         Replicate_Do_Table: 
     Replicate_Ignore_Table: 
    Replicate_Wild_Do_Table: 
Replicate_Wild_Ignore_Table: 
                 Last_Errno: 1064 
                 Last_Error: Error 'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1' on query. Default database: 'wtsdb'. Query: 'UPDATE requestlog SET id='2173589',start_time='2009-08-18 04:59:10',end_time='2009-08-18 04:59:10',ip='X.X.X.X',session='hkacn7d54',lock_wait_time='0',unmarshal_time='0.00057435035705566',unmarshaled_objects='30',data_sync_time='0',page_process_time='0',memory_used='1572864',memory_used_peak='1572864',method='GET',request='/report.php',request_parameters='
groupId = -4 
startDate = 2009-08-17 
reportType = trades 
accountId = 2409 
endDate = 2009-08-17',progress_percent='0',progress_text='' WHERE id='2173588'' 
               Skip_Counter: 0 
        Exec_Master_Log_Pos: 186172755 
            Relay_Log_Space: 290116589 
            Until_Condition: None 
             Until_Log_File: 
              Until_Log_Pos: 0 
         Master_SSL_Allowed: No 
         Master_SSL_CA_File: 
         Master_SSL_CA_Path: 
            Master_SSL_Cert: 
          Master_SSL_Cipher: 
             Master_SSL_Key: 
      Seconds_Behind_Master: NULL 

答案1

仔细检查数据库和连接上的字符集是否符合您的预期 - 如果复制因包含非 ASCII 字符的语句而受阻,则会立即引起怀疑。

查看主服务器的二进制日志,了解其记录的语句与副本服务器尝试执行的语句有何不同 - 这将为问题提供线索。

遗憾的是,您还应确保主服务器和从服务器的服务器版本相同。我最近遇到了一个奇怪的情况,其中我们尝试将非 NULL 数字列更新为 NULL,但 SQL 很糟糕。在主服务器运行的版本中,MySQL 默默地将其转换为 0,但在从服务器运行的版本中,这是一个错误。

另外,CHANGE MASTER你不应该修复副本,而应该

STOP SLAVE;
[correct and execute whatever statement is the problem]
SET GLOBAL sql_slave_skip_counter = 1;
START SLAVE;

http://dev.mysql.com/doc/refman/5.1/en/set-global-sql-slave-skip-counter.html

答案2

您的某些更新查询是否可能不是复制安全的?例如这里- 要点 3.1。

经常在主服务器和从服务器之间比较数据是个好习惯(如果你用的是 myisam 而不是 innodb 就不好了)--例如mk-表校验和马特吉特

附言:另外两个链接可以看看:12

相关内容