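My MySQL server crashed because of a (page file write error). I ran disk diagnostics but found no errors. I restarted the MySQL server. It scanned the bin log for about 1-2 hours and logged the following: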
InnoDB: Doing recovery: scanned up to log sequence number 51 2341175808
InnoDB: Doing recovery: scanned up to log sequence number 51 2346418688
InnoDB: Doing recovery: scanned up to log sequence number 51 2351661568
InnoDB: Doing recovery: scanned up to log sequence number 51 2356904448
InnoDB: Doing recovery: scanned up to log sequence number 51 2362147328
After that, I saw many errors like the following:
InnoDB: Number of pending reads 128, pending pread calls 0
InnoDB: Error: InnoDB has waited for 50 seconds for pending
InnoDB: reads to the buffer pool to be finished.
InnoDB: Number of pending reads 128, pending pread calls 0
InnoDB: Error: InnoDB has waited for 50 seconds for pending
InnoDB: reads to the buffer pool to be finished.
InnoDB: Number of pending reads 128, pending pread calls 0
InnoDB: Error: InnoDB has waited for 50 seconds for pending
InnoDB: reads to the buffer pool to be finished.
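I have waited for several hours, but it just keeps appending these lines to the error log, with no sign of stopping.
What do the log messages above mean? How can I get my MySQL server running again? I am running an 80GB InnoDB database on this server. Is there a way to force it to recover without undoing any uncommitted transactions or trying to recover data from the page file that caused the crash? I can drop that data without any problem.
I tried "sudo -u mysql /usr/sbin/mysqld --innodb_force_recovery=6" to restart the MySQL server, but got new repeating errors like the following: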
InnoDB: stored checksum 2440779633, prior-to-4.0.14-form stored checksum 3425185587
InnoDB: Page lsn 51 2450779673, low 4 bytes of lsn at page end 2450779673
InnoDB: Page number (if stored to page already) 10824,
InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 0
InnoDB: Page may be an index page where index id is 4294967295 0
InnoDB: (index "CLUST_IND" of table "SYS_IBUF_TABLE_0")
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 10824.
InnoDB: You may have to recover from a backup.
InnoDB: It is also possible that your operating
InnoDB: system has corrupted its own file cache
InnoDB: and rebooting your computer removes the
InnoDB: error.
InnoDB: If the corrupt page is an index page
InnoDB: you can also try to fix the corruption
InnoDB: by dumping, dropping, and reimporting
InnoDB: the corrupt table. You can use CHECK
InnoDB: TABLE to scan your table for corruption.
InnoDB: See also http://dev.mysql.com/doc/refman/5.1/en/forcing-recovery.html
InnoDB: about forcing recovery.
InnoDB: Ending processing because of a corrupt database page.
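What should I do now?
Answer 1
Without the error that caused the crash, the MySQL version, and your my.cnf, it is hard to say exactly what the problem is... That said, here is some general advice...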
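There are several ways to find the table(s) containing the corrupt pages. The simplest is to shut down the server and run innochecksum against all the tables in the database. If it finds any problems, you can start the server in recovery mode by setting innodb_force_recovery, then try a SELECT INTO OUTFILE on the table to dump its contents, followed by LOAD DATA INFILE to load them into a new table. Start with innodb_force_recovery set to 1; if the server crashes while dumping the table, keep increasing the value until you can dump the data without a crash. Make sure no clients are connecting while you do this.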
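The escalation described above can be sketched roughly as follows. This is only an illustration: `try_dump` is a hypothetical callback (it would restart mysqld with the given level and attempt the dump), `dump_and_reload_sql` and the `_recovered` table suffix are made-up names, and the SQL quoting is naive. Forced recovery restricts writes, so the reload statements are normally run after restarting the server without innodb_force_recovery.

```python
from typing import Callable, List, Optional

# innodb_force_recovery accepts 0 (normal startup) through 6 (most invasive).
RECOVERY_LEVELS = range(1, 7)


def lowest_working_level(try_dump: Callable[[int], bool]) -> Optional[int]:
    """Return the first recovery level at which try_dump succeeds, else None.

    try_dump(level) is a hypothetical callback supplied by the caller: it
    should start mysqld with innodb_force_recovery=level, attempt the dump,
    and return True only if the dump finished without a crash.
    """
    for level in RECOVERY_LEVELS:
        if try_dump(level):
            return level
    return None


def dump_and_reload_sql(table: str, outfile: str) -> List[str]:
    """Build the dump/reload statements for one table (illustrative only)."""
    return [
        # Run while the server is in forced-recovery mode.
        f"SELECT * INTO OUTFILE '{outfile}' FROM `{table}`",
        # Run after a normal restart, since forced recovery restricts writes.
        f"CREATE TABLE `{table}_recovered` LIKE `{table}`",
        f"LOAD DATA INFILE '{outfile}' INTO TABLE `{table}_recovered`",
    ]
```

Stopping at the lowest level that works matters because higher values skip more of InnoDB's normal recovery work, so they carry more risk to the data.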
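Percona also has the Percona InnoDB Data Recovery Tools, which might minimize downtime, but they take some expertise to use and may confuse things further. Also, you actually can run innochecksum on a live database, but you may get false positives for corrupt pages. In that case, you can shut down the server and run innochecksum only on the tables that reported page errors while live. On a live server, a successful innochecksum means the table is fine, whereas a failed innochecksum may be inaccurate.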
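A rough sketch of that two-pass check, under some assumptions: innochecksum is on the PATH and exits non-zero when it finds a bad page, per-table `.ibd` files exist only if innodb_file_per_table is enabled, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List


def run_innochecksum(path: Path) -> bool:
    """Return True if innochecksum exits 0 for this tablespace file."""
    result = subprocess.run(
        ["innochecksum", str(path)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def suspect_files(datadir: Path,
                  check: Callable[[Path], bool] = run_innochecksum) -> List[Path]:
    """List tablespace files under datadir for which the check fails."""
    # Per-table files (if any) plus the shared system tablespace files.
    candidates = sorted(datadir.rglob("*.ibd")) + sorted(datadir.glob("ibdata*"))
    return [p for p in candidates if not check(p)]


def recheck(paths: Iterable[Path],
            check: Callable[[Path], bool] = run_innochecksum) -> List[Path]:
    """Re-run the check on earlier suspects; return those that still fail."""
    return [p for p in paths if not check(p)]
```

Run `suspect_files` against the live data directory first and treat the result only as a list of suspects. After shutting the server down, pass that list to `recheck`; whatever still fails is the set worth dumping and reloading.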
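While you are at it, I would recommend creating a replica (slave) of this database, so that if this happens again you can simply switch over to the replica instead of worrying about a complicated recovery process.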
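Answer 2
I hit a similar error, and after reading https://dba.stackexchange.com/questions/24477/innodb-corruption I decided to test my RAM. It turned out to be defective...
I found that memtester can be run without rebooting the machine, and it reported a lot of: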
FAILURE: 0x657423ee6593084d != 0x84e423ee6593084d at offset 0x16f17b1a0.
FAILURE: 0x58c620a5a8e4984f != 0x783620a5a8e4984f at offset 0x16f3571a0.
FAILURE: 0x73eba2598aaaa228 != 0x935ba2598aaaa228 at offset 0x16f3db1a0.
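So if you get this error on a stable version of MySQL, and a reboot makes it go away, it is most likely a RAM problem. If you pay attention to the error message, it says: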
InnoDB: It is also possible that your operating
InnoDB: system has corrupted its own file cache
InnoDB: and rebooting your computer removes the
InnoDB: error.
But maybe, like me, you were reluctant to believe it could be a hardware problem.
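To read the memtester FAILURE lines quoted above more closely, you can XOR the two values on each line and list the bits that differ. The sketch below does that. The regular expression is an assumption based only on the three lines shown here, and `flipped_bits` is a made-up helper name.

```python
import re
from typing import List, Tuple

# Pattern inferred from the FAILURE lines quoted in this answer.
FAILURE_RE = re.compile(
    r"FAILURE: 0x([0-9a-f]+) != 0x([0-9a-f]+) at offset 0x([0-9a-f]+)"
)


def flipped_bits(line: str) -> Tuple[int, List[int]]:
    """Return (offset, differing bit positions) for one FAILURE line."""
    m = FAILURE_RE.search(line)
    if m is None:
        raise ValueError(f"not a memtester FAILURE line: {line!r}")
    a, b, offset = (int(g, 16) for g in m.groups())
    diff = a ^ b  # a set bit marks a position where the two values disagree
    return offset, [bit for bit in range(64) if (diff >> bit) & 1]


lines = [
    "FAILURE: 0x657423ee6593084d != 0x84e423ee6593084d at offset 0x16f17b1a0.",
    "FAILURE: 0x58c620a5a8e4984f != 0x783620a5a8e4984f at offset 0x16f3571a0.",
    "FAILURE: 0x73eba2598aaaa228 != 0x935ba2598aaaa228 at offset 0x16f3db1a0.",
]
for line in lines:
    offset, bits = flipped_bits(line)
    print(hex(offset), bits)
```

For these three lines, every differing bit falls in the top 12 bits of the 64-bit word (bits 52 to 63), and the lower 52 bits match exactly.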