Galera node fails to start after recovery

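I have a Galera "cluster" with a single working node. I added a second node and started the sync via rsync. The state transfer succeeded, but when the database starts up it reports that InnoDB is supposedly corrupt. As far as I understand Galera, though, the data should be exactly the same as the data on node 1.

Here is the log output: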

    Feb 19 17:05:53 node2.example.com rsyncd[27285]: connect from node1.example.com (1.2.3.4)
Feb 19 17:05:53 node2.example.com rsyncd[27285]: rsync to rsync_sst/ from node1.example.com (1.2.3.4)
Feb 19 17:05:53 node2.example.com rsyncd[27285]: receiving file list
Feb 19 17:05:53 node2.example.com mysqld[38851]: 2017-02-19 17:05:53 140273959036672 [Note] WSREP: 1.0 (node1): State transfer to 0.0 (node2) complete.
Feb 19 17:05:53 node2.example.com mysqld[38851]: 2017-02-19 17:05:53 140273959036672 [Note] WSREP: Member 1.0 (node1) synced with group.
Feb 19 17:05:53 node2.example.com mysqld[38851]: WSREP_SST: [INFO] Joiner cleanup. rsync PID: 38894 (20170219 17:05:53.972)
Feb 19 17:05:54 node2.example.com mysqld[38851]: WSREP_SST: [INFO] Joiner cleanup done. (20170219 17:05:54.481)
Feb 19 17:05:54 node2.example.com mysqld[38851]: 2017-02-19 17:05:54 140274267547840 [Note] WSREP: SST complete, seqno: 2744639
Feb 19 17:05:54 node2.example.com mysqld[38851]: 2017-02-19 17:05:54 7f9425e368c0 InnoDB: Warning: Using innodb_locks_unsafe_for_binlog is DEPRECATED. This option may be removed in future releases. Please use READ COMMITTED transaction isolation level instead, see http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html.
Feb 19 17:05:54 node2.example.com mysqld[38851]: 2017-02-19 17:05:54 140274267547840 [Note] InnoDB: Using mutexes to ref count buffer pool pages
Feb 19 17:05:54 node2.example.com mysqld[38851]: 2017-02-19 17:05:54 140274267547840 [Note] InnoDB: The InnoDB memory heap is disabled
Feb 19 17:05:54 node2.example.com mysqld[38851]: 2017-02-19 17:05:54 140274267547840 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
Feb 19 17:05:54 node2.example.com mysqld[38851]: 2017-02-19 17:05:54 140274267547840 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
Feb 19 17:05:54 node2.example.com mysqld[38851]: 2017-02-19 17:05:54 140274267547840 [Note] InnoDB: Compressed tables use zlib 1.2.7
Feb 19 17:05:54 node2.example.com mysqld[38851]: 2017-02-19 17:05:54 140274267547840 [Note] InnoDB: Using Linux native AIO
Feb 19 17:05:54 node2.example.com mysqld[38851]: 2017-02-19 17:05:54 140274267547840 [Note] InnoDB: Using SSE crc32 instructions
Feb 19 17:05:54 node2.example.com mysqld[38851]: 2017-02-19 17:05:54 140274267547840 [Note] InnoDB: Initializing buffer pool, size = 16.0G
Feb 19 17:05:55 node2.example.com mysqld[38851]: 2017-02-19 17:05:55 140274267547840 [Note] InnoDB: Completed initialization of buffer pool
Feb 19 17:05:55 node2.example.com mysqld[38851]: 2017-02-19 17:05:55 140274267547840 [Note] InnoDB: Highest supported file format is Barracuda.
Feb 19 17:05:56 node2.example.com mysqld[38851]: 2017-02-19 17:05:56 140274267547840 [Note] InnoDB: The log sequence numbers 1017291030265 and 1017291030265 in ibdata files do not match the log sequence number 1017291030429 in the ib_logfiles!
Feb 19 17:05:56 node2.example.com mysqld[38851]: 2017-02-19 17:05:56 140274267547840 [Note] InnoDB: Database was not shutdown normally!
Feb 19 17:05:56 node2.example.com mysqld[38851]: 2017-02-19 17:05:56 140274267547840 [Note] InnoDB: Starting crash recovery.
Feb 19 17:05:56 node2.example.com mysqld[38851]: 2017-02-19 17:05:56 140274267547840 [Note] InnoDB: Reading tablespace information from the .ibd files...
Feb 19 17:05:56 node2.example.com mysqld[38851]: 2017-02-19 17:05:56 140274267547840 [ERROR] InnoDB: Attempted to open a previously opened tablespace. Previous tablespace mysql/innodb_index_stats uses space ID: 2 at filepath: ./mysql/innodb_index_stats.ibd. Cannot open tablespace tde/SOURCES which uses space ID: 2 at filepath: ./tde/SOURCES.ibd
Feb 19 17:05:56 node2.example.com mysqld[38851]: 2017-02-19 17:05:56 7f9425e368c0  InnoDB: Operating system error number 2 in a file operation.
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: The error means the system cannot find the path specified.
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: If you are installing InnoDB, remember that you must create
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: directories yourself, InnoDB does not create them.
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: Error: could not open single-table tablespace file ./tde/SOURCES.ibd
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: We do not continue the crash recovery, because the table may become
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: corrupt if we cannot apply the log records in the InnoDB log to it.
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: To fix the problem and start mysqld:
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: 1) If there is a permission problem in the file and mysqld cannot
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: open the file, you should modify the permissions.
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: 2) If the table is not needed, or you can restore it from a backup,
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: then you can remove the .ibd file, and InnoDB will do a normal
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: crash recovery and ignore that table.
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: 3) If the file system or the disk is broken, and you cannot remove
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: the .ibd file, you can set innodb_force_recovery > 0 in my.cnf
Feb 19 17:05:56 node2.example.com mysqld[38851]: InnoDB: and force InnoDB to continue crash recovery here.
Feb 19 17:05:56 node2.example.com mysqld[38851]: 170219 17:05:56 [ERROR] mysqld got signal 6 ;
Feb 19 17:05:56 node2.example.com mysqld[38851]: This could be because you hit a bug. It is also possible that this binary
Feb 19 17:05:56 node2.example.com mysqld[38851]: or one of the libraries it was linked against is corrupt, improperly built,
Feb 19 17:05:56 node2.example.com mysqld[38851]: or misconfigured. This error can also be caused by malfunctioning hardware.
Feb 19 17:05:56 node2.example.com mysqld[38851]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Feb 19 17:05:56 node2.example.com mysqld[38851]: We will try our best to scrape up some info that will hopefully help
Feb 19 17:05:56 node2.example.com mysqld[38851]: diagnose the problem, but since we have already crashed,
Feb 19 17:05:56 node2.example.com mysqld[38851]: something is definitely wrong and this may fail.
Feb 19 17:05:56 node2.example.com mysqld[38851]: Server version: 10.1.21-MariaDB
Feb 19 17:05:56 node2.example.com mysqld[38851]: key_buffer_size=134217728
Feb 19 17:05:56 node2.example.com mysqld[38851]: read_buffer_size=131072
Feb 19 17:05:56 node2.example.com mysqld[38851]: max_used_connections=0
Feb 19 17:05:56 node2.example.com mysqld[38851]: max_threads=153
Feb 19 17:05:56 node2.example.com mysqld[38851]: thread_count=2
Feb 19 17:05:56 node2.example.com mysqld[38851]: It is possible that mysqld could use up to
Feb 19 17:05:56 node2.example.com mysqld[38851]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467121 K  bytes of memory
Feb 19 17:05:56 node2.example.com mysqld[38851]: Hope that's ok; if not, decrease some variables in the equation.
Feb 19 17:05:56 node2.example.com mysqld[38851]: Thread pointer: 0x0
Feb 19 17:05:56 node2.example.com mysqld[38851]: Attempting backtrace. You can use the following information to find out
Feb 19 17:05:56 node2.example.com mysqld[38851]: where mysqld died. If you see no messages after this, something went
Feb 19 17:05:56 node2.example.com mysqld[38851]: terribly wrong...
Feb 19 17:05:56 node2.example.com mysqld[38851]: stack_bottom = 0x0 thread_stack 0x48400
Feb 19 17:05:56 node2.example.com mysqld[38851]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f94269089ce]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /usr/sbin/mysqld(handle_fatal_signal+0x305)[0x7f942642e355]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /lib64/libpthread.so.0(+0xf100)[0x7f9425a48100]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /lib64/libc.so.6(gsignal+0x37)[0x7f9423da35f7]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /lib64/libc.so.6(abort+0x148)[0x7f9423da4ce8]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /usr/sbin/mysqld(+0x8ef240)[0x7f9426768240]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /usr/sbin/mysqld(+0x789948)[0x7f9426602948]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /usr/sbin/mysqld(+0x819b59)[0x7f9426692b59]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /usr/sbin/mysqld(+0x7393f4)[0x7f94265b23f4]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /usr/sbin/mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0x64)[0x7f9426430714]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /usr/sbin/mysqld(+0x445625)[0x7f94262be625]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /usr/sbin/mysqld(_Z11plugin_initPiPPci+0x9ea)[0x7f94262bff3a]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /usr/sbin/mysqld(+0x39ed18)[0x7f9426217d18]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /usr/sbin/mysqld(_Z11mysqld_mainiPPc+0x9f5)[0x7f942621be95]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f9423d8fb15]
Feb 19 17:05:56 node2.example.com mysqld[38851]: /usr/sbin/mysqld(+0x3967ed)[0x7f942620f7ed]
Feb 19 17:05:56 node2.example.com mysqld[38851]: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Feb 19 17:05:56 node2.example.com mysqld[38851]: information that should help you find out what is causing the crash.

I also noticed that there is no ./tde/SOURCES.ibd file on either server.

Answer 1

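Sorry, but I am a bit confused. You have 3 nodes, say node-1, node-2 and node-3.

From what you describe, node 1 is up and running and you are trying to add node 2. (Is the log you shared from node 1? Is node 1 facing the problem as well?)

Also, which SST method did you use for the transfer?

If possible, please share the complete log file too, because this is confusing and is not expected to happen.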

Feb 19 17:05:56 node1.example.com mysqld[38851]: 2017-02-19 17:05:56 140274267547840 [ERROR] InnoDB: Attempted to open a previously opened tablespace. Previous tablespace mysql/innodb_index_stats uses space ID: 2 at filepath: ./mysql/innodb_index_stats.ibd. Cannot open tablespace tde/SOURCES which uses space ID: 2 at filepath: ./tde/SOURCES.ibd

The given space ID is already assigned to another table that MySQL has opened, and after the SST the same space ID is assigned to a different table as well.
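To spot this kind of collision in a longer log, something like the following could help. It is only a rough sketch: the regular expression is written against the exact message wording quoted above (MariaDB 10.1.21), and the function name `find_space_id_conflicts` is made up for illustration. Other server versions may phrase the error differently.

```python
import re

# Pattern for the InnoDB error quoted above; the wording is an assumption
# that holds for this log and may differ in other server versions.
CONFLICT_RE = re.compile(
    r"Previous tablespace (?P<first>\S+) uses space ID: (?P<space_id>\d+) "
    r"at filepath: (?P<first_path>\S+)\. "
    r"Cannot open tablespace (?P<second>\S+) which uses space ID: (?P=space_id) "
    r"at filepath: (?P<second_path>\S+)"
)


def find_space_id_conflicts(log_text):
    """Return one dict per space-ID collision reported in the log text."""
    return [m.groupdict() for m in CONFLICT_RE.finditer(log_text)]


# The error message from the question, joined into a single line.
sample = (
    "[ERROR] InnoDB: Attempted to open a previously opened tablespace. "
    "Previous tablespace mysql/innodb_index_stats uses space ID: 2 at filepath: "
    "./mysql/innodb_index_stats.ibd. Cannot open tablespace tde/SOURCES which uses "
    "space ID: 2 at filepath: ./tde/SOURCES.ibd"
)

conflicts = find_space_id_conflicts(sample)
for c in conflicts:
    print(c["space_id"], c["first"], c["second"])
```

Each result names the two tablespaces that claim the same space ID, which makes it easier to check on disk which of the two .ibd files actually exists.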

Answer 2

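You should state your question more clearly. Based on your question here (though it is not mentioned above), you have 1 node that appears to be working and you are adding a second node. You are getting this error on the second node.

This jumped out at me when I first looked through your error log: the error shows that you are starting with a corrupt database on node 2.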

Feb 19 17:05:56 node2.example.com mysqld[38851]: 2017-02-19 17:05:56 140274267547840 [Note] InnoDB: Database was not shutdown normally!

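Are all of your tables InnoDB with a primary key?

I hope this is only a test setup for you and not production. With that in mind, I suggest wiping the data directory on node 2 and trying to start the service again. That ensures MySQL starts from a clean state, which is what Xtrabackup does for an SST. I don't recall rsync wiping the whole data directory, so this should rule out any problem caused by rsync leaving something behind that it did not delete.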
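As a rough sketch of that clean-up step, the helper below empties a data directory by moving its contents elsewhere. Moving instead of deleting is my own choice so that nothing is lost. The function name `set_aside_datadir` and the paths in the usage comment are assumptions, not taken from the question, and mysqld must be stopped on node 2 before anything like this is run.

```python
import shutil
from pathlib import Path


def set_aside_datadir(datadir, backup_dir):
    """Move every entry out of datadir into backup_dir; return the moved names."""
    src = Path(datadir).resolve()
    dst = Path(backup_dir).resolve()
    # The backup location must not live inside the directory being emptied.
    if dst == src or src in dst.parents:
        raise ValueError("backup_dir must be outside datadir")
    dst.mkdir(parents=True, exist_ok=False)  # refuse to reuse an old backup
    moved = []
    for entry in sorted(src.iterdir()):
        shutil.move(str(entry), str(dst / entry.name))
        moved.append(entry.name)
    return moved


# Hypothetical usage; both paths are assumptions, check your own datadir setting:
# set_aside_datadir("/var/lib/mysql", "/var/lib/mysql.before-sst")
```

After the directory is empty, starting the service on node 2 should trigger a fresh SST from node 1.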

Can you share the configuration for nodes 1 and 2? At least the wsrep_* related settings.
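If it helps, the wsrep_* lines can be pulled out of a config file with a few lines of Python. This is a minimal sketch: `wsrep_settings` is a made-up helper, the sample config is hypothetical apart from the rsync SST method and the 16G buffer pool seen in the question, and `!include` directives are not followed.

```python
def wsrep_settings(cnf_text):
    """Collect wsrep_* (or wsrep-*) options from my.cnf-style text."""
    settings = {}
    for raw in cnf_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, section headers and !include directives.
        if not line or line[0] in "#;[!":
            continue
        key, _, value = line.partition("=")
        key = key.strip().replace("-", "_")  # the server accepts both spellings
        if key.startswith("wsrep_"):
            settings[key] = value.strip()
    return settings


# Hypothetical config; the cluster name is invented for this example.
sample_cnf = """
[galera]
wsrep_on=ON
wsrep_sst_method = rsync
wsrep-cluster-name = example_cluster
innodb_buffer_pool_size = 16G
# wsrep_node_name = commented_out
"""

for name, value in wsrep_settings(sample_cnf).items():
    print(f"{name} = {value}")
```

Running it against the config files of both nodes and comparing the two results shows quickly whether the SST method or cluster settings differ.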
