MySQL 变得非常慢,直到重新启动

MySQL 变得非常慢,直到重新启动

我有一个在 CentOS 上运行的 MySQL 服务器。

最近我遇到了一个问题,大约每两天就会发生一次。服务器运行速度很快,运行正常,但突然变得非常慢,直到我重新启动 MySQL,它才恢复正常状态。

这种情况发生过几次,所以这次我在运行之前截了两张屏幕截图service mysqld restart

重启之前: 重启之前

重启后: 重启后

我的大多数表都是 InnoDB,少数是 MyISAM。(4 个 MyISAM 表,38 个 InnoDB 表)

我的.cnf:

[mysqld]
bulk_insert_buffer_size = 8M
concurrent_insert = 2
connect_timeout = 30
default-storage-engine = MyISAM
innodb_buffer_pool_size=1300M
innodb_file_per_table=1
interactive_timeout = 1000
join_buffer_size=128M
key_buffer_size = 1200M
local-infile=0
slow_query_log=1
long_query_time=0.5
#skip-grant-tables
max_allowed_packet = 900M
max_connections = 40000
max_heap_table_size = 256M
max_user_connections = 10000
max_write_lock_count = 8
myisam_max_sort_file_size = 256M
myisam_sort_buffer_size = 64M
open_files_limit = 10192
query_alloc_block_size = 65536
query_cache_limit = 256M
query_cache_size = 384M
query_cache_type = 1
query_prealloc_size = 262144
range_alloc_block_size = 4096
read_buffer_size = 4M
read_rnd_buffer_size = 16M
sort_buffer_size = 4M
table_cache = 8048
table_open_cache = 8000
thread_cache_size = 50
tmp_table_size = 256M
transaction_alloc_block_size = 4096
transaction_prealloc_size = 4096
#innodb_force_recovery=5
wait_timeout = 1000
max_connect_errors = 5000
open-files = 50000
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

显示全局状态如'%connect%';

+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| Aborted_connects         | 0      |
| Connections              | 859148 |
| Max_used_connections     | 103    |
| Ssl_client_connects      | 0      |
| Ssl_connect_renegotiates | 0      |
| Ssl_finished_connects    | 0      |
| Threads_connected        | 1      |
+--------------------------+--------+

显示类似'thread_%'的全局变量;

+---------------------------+---------------------------+
| Variable_name             | Value                     |
+---------------------------+---------------------------+
| thread_cache_size         | 50                        |
| thread_concurrency        | 10                        |
| thread_handling           | one-thread-per-connection |
| thread_pool_idle_timeout  | 60                        |
| thread_pool_max_threads   | 500                       |
| thread_pool_oversubscribe | 3                         |
| thread_pool_size          | 8                         |
| thread_pool_stall_limit   | 500                       |
| thread_stack              | 294912                    |
+---------------------------+---------------------------+

显示全局状态如'threads_%';

+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Threads_cached    | 49    |
| Threads_connected | 1     |
| Threads_created   | 372   |
| Threads_running   | 1     |
+-------------------+-------+

显示类似‘key_%’的全局状态;

+------------------------+---------+
| Variable_name          | Value   |
+------------------------+---------+
| Key_blocks_not_flushed | 0       |
| Key_blocks_unused      | 1003901 |
| Key_blocks_used        | 3365    |
| Key_blocks_warm        | 0       |
| Key_read_requests      | 99176   |
| Key_reads              | 3052    |
| Key_write_requests     | 29353   |
| Key_writes             | 29347   |
+------------------------+---------+

显示全球状态如‘Q%’;

+-------------------------+-----------+
| Variable_name           | Value     |
+-------------------------+-----------+
| Qcache_free_blocks      | 961       |
| Qcache_free_memory      | 400828904 |
| Qcache_hits             | 1634009   |
| Qcache_inserts          | 1201887   |
| Qcache_lowmem_prunes    | 0         |
| Qcache_not_cached       | 59970     |
| Qcache_queries_in_cache | 1467      |
| Qcache_total_blocks     | 3926      |
| Queries                 | 5316596   |
| Questions               | 5187929   |
+-------------------------+-----------+

显示类似'_size'的全局变量;

Empty set

答案1

为了立即缓解压力,在查看参考手册后,请考虑

set global read_rnd_buffer_size=256K;  # from 16M per connection

这可以动态完成。
完成此操作后,每次登录将不需要 16M。为什么要读取 16M(即使是从 RAM 读取),而 256K 就足够了?发布其他请求项目后,我会提出更多建议。

----- 2017 11 04 ------------- 以下建议需要您在每天仅实施一项之前进行研究。有些建议可能会动态应用。建议的 cfg/ini 值遵循 [mysqld] 部分,可以修改、添加或删除。

max_connections=200 #from 40000 to support your 103 max_used_connections
max_user_connections=200 #from 10000 to be matched with max_connections
key_buffer_size  REMOVE for default of 64M.  less than 1% of 1200MB used

thread_cache_size=100 #from 50  to support your 103 max_used_connections - cap at 100 per V8
thread_concurrency=33 #from 10  for about 30% active
max_connect_errors=10 #from 5000, to better control hacker passwd guessing
innodb_print_all_deadlocks=1 # from OFF, if you ever have one, you need this data in error log
#### these are PER CONNECTION values driving your RAM footprint up the wall
#read_buffer_size  or REMOVE for default of 128K vs 4M RAM
#read_rnd_buffer_size or REMOVE for default of 256K vs 16M RAM
#join_buffer_size or REMOVE for default of 128K vs 128MB RAM 

使用 MySQLCalculator.com 将帮助您了解如果 40000 个并发连接能够成功(不太可能发生)则需要多少 RAM - 大约需要 6 TB 的 RAM。

如需对实施变更的结果进行进一步分析,请在正常运行 7 天后发布完整的文本结果

SHOW GLOBAL STATUS;
SHOW GLOBAL VARIABLES;
SHOW ENGINE INNODB STATUS;

并重新发布完整的my.cnf。

答案2

这似乎更像是客户端负载问题,而不是内存泄漏服务器问题。守护进程的线程正在占用大约一到两个核心。它们在忙什么?SHOW FULL PROCESSLIST 说了什么?

重启的作用远不止重置守护进程的状态。它摧毁了 587 个进程,这些进程可能与服务器有活动端口 3306(或 AF_UNIX)连接。它们在做什么?你对它们所做的满意吗?它们在重启时是否记录了让你不满意的致命错误?它们是否应该完成某些任务,然后断开连接并退出?

重新启动是一种快速修复,但看起来您想了解在重新启动之前的 48 小时内客户端负载如何越来越大。

相关内容