我遇到了一个令人困惑的问题,mysqld 偶尔会处于非常尴尬的状态,即 CPU 使用率非常高(几百个线程都在争夺 CPU 时间)并且内存快速累积(更不用说泄漏)。一旦这种情况开始发生,服务器在执行查询时就会变得非常慢,通常慢到足以触发等待它的应用程序超时。
最终,它会积累足够的内存,开始触发 Linux 内核 OOM-killer,此时整个系统已经因为不断的交换而变得非常慢,几乎无法使用,需要硬重置。为用户
设置更严格的内存分配ulimit
值mysql
已经解决了这一特定症状,但潜在问题仍然存在。
一旦 MySQL 服务器处于此状态,关闭命令后需要几分钟才能真正停止,这也使得重新启动它令人紧张。尽管我仍然更喜欢尝试彻底关闭它,以避免破坏复制。
一旦服务器处于这种状态,慢查询日志就毫无用处了每一个查询会被记录为慢查询,因此很难找出任何实际的违规者。
将 mysql 服务器升级到版本5.6.43-1debian8
已将问题从每隔几天发生一次减少到每隔几周发生一次。
SHOW FULL PROCESS LIST;
在正在发生的情况中我尚未能够获得。
运行该程序的机器有 12 核/24 线程,64 GiB 内存(其中大部分内存可用于 MySQL)和 32 GiB 的 HDD 交换。
这是我my.cnf
之后提出的所有建议(不确定这些更改是否已经解决了问题,我将随着时间的推移不断判断)。
#
# The MySQL database server configuration file.
#
# You can copy this to one of:
# - "/etc/mysql/my.cnf" to set global options,
# - "~/.my.cnf" to set user-specific options.
#
# One can use all long options that the program supports.
# Run program with --help to get a list of available options and with
# --print-defaults to see which it would actually understand and use.
#
# For explanations see
# http://dev.mysql.com/doc/mysql/en/server-system-variables.html
# This will be passed to all mysql clients
# It has been reported that passwords should be enclosed with ticks/quotes
# escpecially if they contain "#" chars...
# Remember to edit /etc/mysql/debian.cnf when changing the socket location.
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock
# default-character-set=latin1
# Here is entries for some specific programs
# The following values assume you have at least 32M ram
# This was formally known as [safe_mysqld]. Both versions are currently parsed.
[mysqld_safe]
log_error = /var/log/mysql/mysql_error.log
socket = /var/run/mysqld/mysqld.sock
nice = 0
[mysqld]
#
# * Basic Settings
#
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
#basedir = /usr/local
#ledir = /usr/local/bin
#datadir = /var/lib/mysql
#datadir = /ssd/mysql
datadir = /mnt/vg0_lv0/mysql/
tmpdir = /tmp
#language = /usr/local/share/mysql/english
#open-files-limit = 32000
sql_mode = NO_ENGINE_SUBSTITUTION
character-set-server=latin1
collation-server=latin1_swedish_ci
net_read_timeout = 120
skip-external-locking
skip-name-resolve
#
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
bind-address = ::
#
# * Fine Tuning
#
# mysqld might use { key_buffer + (read_buffer_size + read_rnd_buffer_size + sort_buffer_size + join_buffer_size) * max_threads } bytes of memory
#max_threads = 2048 # probably bound to max_connections
key_buffer = 1536M # global limit
key_buffer_size = 1536M # not sure which spelling is correct, use both
# as said above, these limits are per connection and NOT global
read_buffer_size = 16M
read_rnd_buffer_size = 16M
sort_buffer_size = 16M
myisam_sort_buffer_size = 16M
join_buffer_size = 20M
max_allowed_packet = 128M
preload_buffer_size = 384K
tmp_table_size = 192M
max_heap_table_size = 192M
thread_stack = 256K
thread_cache_size = 24
# This replaces the startup script and checks MyISAM tables if needed
# the first time they are touched
myisam-recover = BACKUP
max_connections = 500
table_open_cache = 3200
table_open_cache_instances = 2 # approx. half of cores used routinely
#table_definition_cache # automatically set to min(400+(table_open_cache/2),2000)
#thread_concurrency = 48
#
# * Query Cache Configuration
#
# Defaults: 1M, 16M
#query_cache_limit = 12M
#query_cache_size = 2048M
#query_cache_type = 1
#query_cache_min_res_unit = 32K
# Query Cache disabled due to lock contention <https://www.percona.com/blog/2012/09/05/write-contentions-on-the-query-cache/>
query_cache_size = 0
query_cache_type = 0
# query_prealloc_size = 64M
wait_timeout = 3600
interactive_timeout = 28800
#
# * Logging and Replication
#
# Both location gets rotated by the cronjob.
# Be aware that this log type is a performance killer.
# As of 5.1 you can enable the log at runtime!
#general_log_file = /var/log/mysql/mysql.log
#general_log = 1
log_error = /var/log/mysql/mysql_error.log
#
# Error logging goes to syslog due to /etc/mysql/conf.d/mysqld_safe_syslog.cnf.
#
# Here you can see queries with especially long duration
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time = 10
#log-queries-not-using-indexes = 1 # logs all queries that filter without an index, not just bad JOINs (spammy)
#log_slow_queries = /var/log/mysql/mysql-slow.log # old, invalid
#long_query_time = 2
#log-queries-not-using-indexes = 1
#
# The following can be used as easy to replay backup logs or for replication.
# note: if you are setting up a replication slave, see README.Debian about
# other settings you may need to change.
server-id = 3
log_bin = /mnt/vg0_lv0/mysql_binlog/mysql-bin.log
expire_logs_days = 31
max_binlog_size = 1073741824
#log_bin = /var/log/mysql/mysql-bin.log
#expire_logs_days = 31
#max_binlog_size = 8192M
#binlog_do_db = include_database_name
#binlog_ignore_db = include_database_name
#slave-net-timeout = 30
#master-retry-count = 99999
#slave-skip-errors=1062,1053
#slave-skip-errors=1062,1053,1050,1146,1051,1396,1060,1054
#
# * InnoDB
#
# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/.
# Read the manual for more InnoDB related options. There are many!
innodb_buffer_pool_size = 6144M
innodb_buffer_pool_instances = 6 # only >1 if each instance gets at least 1GiB, i_buffer_pool_size will be divided by this
innodb_log_file_size = 768M
innodb_file_per_table = on
innodb_ft_cache_size = 32M
innodb_ft_total_cache_size = 640M
innodb_additional_mem_pool = 32M
innodb_thread_concurrency = 0 # 0=autodetect
innodb_read_io_threads = 64 # 64=max
innodb_write_io_threads = 64
innodb_log_buffer_size = 8M
# * Security Features
#
# Read the manual, too, if you want chroot!
# chroot = /var/lib/mysql/
#
# For generating SSL certificates I recommend the OpenSSL GUI "tinyca".
#
# ssl-ca=/etc/mysql/cacert.pem
# ssl-cert=/etc/mysql/server-cert.pem
# ssl-key=/etc/mysql/server-key.pem
[mysqldump]
quick
quote-names
max_allowed_packet = 32M
[mysql]
#no-auto-rehash # faster start of mysql but no tab completition
[isamchk]
key_buffer = 16M
#
# * IMPORTANT: Additional settings that can override those from this file!
# The files must end with '.cnf', otherwise they'll be ignored.
#
!includedir /etc/mysql/conf.d/
根据此处的建议(已添加的评论已剪辑)得出的 mysqltuner 报告:
>> MySQLTuner 1.7.13 - Major Hayden <[email protected]>
>> Bug reports, feature requests, and downloads at http://mysqltuner.com/
>> Run with '--help' for additional options and output filtering
[--] Skipped version check for MySQLTuner script
[OK] Logged in using credentials passed on the command line
[OK] Currently running supported MySQL version 5.6.44-log
[OK] Operating on 64-bit architecture
-------- Log file Recommendations ------------------------------------------------------------------
[--] Log file: /var/log/mysql/mysql_error.log(2M)
[OK] Log file /var/log/mysql/mysql_error.log exists
[OK] Log file /var/log/mysql/mysql_error.log is readable.
[OK] Log file /var/log/mysql/mysql_error.log is not empty
[OK] Log file /var/log/mysql/mysql_error.log is smaller than 32 Mb
[!!] /var/log/mysql/mysql_error.log contains 2285 warning(s).
- Almost all of these are replication warnings about RAND()
[!!] /var/log/mysql/mysql_error.log contains 27 error(s).
- Almost all of these are "Logging to '/var/log/mysql/mysql_error.log'." and similar.
[--] 18 start(s) detected in /var/log/mysql/mysql_error.log
[--] 1) 2019-05-09 20:27:44 34546 [Note] /usr/sbin/mysqld: ready for connections.
[--] 2) 2019-05-09 20:23:44 8281 [Note] /usr/sbin/mysqld: ready for connections.
[--] 3) 2019-05-09 20:22:38 39057 [Note] /usr/sbin/mysqld: ready for connections.
[--] 4) 2019-05-09 20:01:56 43432 [Note] /usr/sbin/mysqld: ready for connections.
[--] 5) 2019-05-09 19:55:54 22899 [Note] /usr/sbin/mysqld: ready for connections.
[--] 6) 2019-05-09 19:48:51 33825 [Note] /usr/sbin/mysqld: ready for connections.
[--] 7) 2019-05-09 19:26:38 20046 [Note] /usr/sbin/mysqld: ready for connections.
[--] 8) 2019-05-01 14:28:48 1915 [Note] /usr/sbin/mysqld: ready for connections.
[--] 9) 2019-04-15 19:50:40 9309 [Note] /usr/sbin/mysqld: ready for connections.
[--] 10) 2019-04-15 19:40:28 43854 [Note] /usr/sbin/mysqld: ready for connections.
[--] 16 shutdown(s) detected in /var/log/mysql/mysql_error.log
[--] 1) 2019-05-09 20:27:20 8281 [Note] /usr/sbin/mysqld: Shutdown complete
[--] 2) 2019-05-09 20:23:20 39057 [Note] /usr/sbin/mysqld: Shutdown complete
[--] 3) 2019-05-09 20:22:12 43432 [Note] /usr/sbin/mysqld: Shutdown complete
[--] 4) 2019-05-09 20:01:32 22899 [Note] /usr/sbin/mysqld: Shutdown complete
[--] 5) 2019-05-09 19:55:29 33825 [Note] /usr/sbin/mysqld: Shutdown complete
[--] 6) 2019-05-09 19:48:21 20046 [Note] /usr/sbin/mysqld: Shutdown complete
[--] 7) 2019-05-09 19:26:06 1915 [Note] /usr/sbin/mysqld: Shutdown complete
[--] 8) 2019-05-01 14:28:01 9309 [Note] /usr/sbin/mysqld: Shutdown complete
[--] 9) 2019-04-15 19:50:16 43854 [Note] /usr/sbin/mysqld: Shutdown complete
[--] 10) 2019-04-15 19:40:02 7435 [Note] /usr/sbin/mysqld: Shutdown complete
-------- Storage Engine Statistics -----------------------------------------------------------------
[--] Status: +ARCHIVE +BLACKHOLE +CSV -FEDERATED +InnoDB +MEMORY +MRG_MYISAM +MyISAM +PERFORMANCE_SCHEMA
[--] Data in InnoDB tables: 25.4G (Tables: 9193)
[--] Data in MyISAM tables: 6.1G (Tables: 4007)
[OK] Total fragmented tables: 0
-------- Analysis Performance Metrics --------------------------------------------------------------
[--] innodb_stats_on_metadata: OFF
[OK] No stat updates during querying INFORMATION_SCHEMA.
-------- Security Recommendations ------------------------------------------------------------------
[OK] There are no anonymous accounts for any database users
[OK] All database users have passwords assigned
[!!] User '*****@%' has user name as password.
- A few more like these, removed for privacy.
[!!] User '*****@%' does not specify hostname restrictions.
- Several dozen more like these, removed for privacy. Restrictions are implemented via iptables.
[--] There are 618 basic passwords in the list.
[!!] User '*****@localhost' is using weak password: 123456 in a lower, upper or capitalize derivative version.
[!!] User '*****@%' is using weak password: test in a lower, upper or capitalize derivative version.
-------- CVE Security Recommendations --------------------------------------------------------------
[OK] NO SECURITY CVE FOUND FOR YOUR VERSION
-------- Performance Metrics -----------------------------------------------------------------------
[--] Up for: 2d 21h 36m 2s (112M q [450.481 qps], 1M conn, TX: 225G, RX: 21G)
[--] Reads / Writes: 95% / 5%
[--] Binary logging is enabled (GTID MODE: OFF)
[--] Physical Memory : 63.0G
[--] Max MySQL memory : 41.5G
[--] Other process memory: 4.9G
[--] Total buffers: 7.7G global + 68.2M per thread (500 max threads)
[--] P_S Max memory usage: 487M
[--] Galera GCache Max memory usage: 0B
[OK] Maximum reached memory usage: 11.6G (18.42% of installed RAM)
[OK] Maximum possible memory usage: 41.5G (65.92% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 0% (64K/112M)
[OK] Highest usage of available connections: 10% (51/500)
[OK] Aborted connections: 0.16% (2506/1587046)
[OK] Query cache is disabled by default due to mutex contention on multiprocessor machines.
[OK] Sorts requiring temporary tables: 0% (0 temp sorts / 4M sorts)
[!!] Joins performed without indexes: 1703245
[OK] Temporary tables created on disk: 6% (998K on disk / 15M total)
[OK] Thread cache hit rate: 99% (64 created / 1M connections)
[!!] Table cache hit rate: 0% (3K open / 405K opened)
[OK] Open file limit used: 4% (1K/24K)
[OK] Table locks acquired immediately: 99% (142M immediate / 142M locks)
[OK] Binlog cache memory access: 99.99% (4902582 Memory / 4902885 Total)
-------- Performance schema ------------------------------------------------------------------------
[--] Memory used by P_S: 487.9M
[--] Sys schema isn't installed.
-------- ThreadPool Metrics ------------------------------------------------------------------------
[--] ThreadPool stat is disabled.
-------- MyISAM Metrics ----------------------------------------------------------------------------
[!!] Key buffer used: 22.7% (365M used / 1B cache)
[OK] Key buffer size / total MyISAM indexes: 1.5G/1.4G
[OK] Read Key buffer hit rate: 97.7% (1B cached / 37M reads)
[!!] Write Key buffer hit rate: 15.9% (454K cached / 72K writes)
-------- InnoDB Metrics ----------------------------------------------------------------------------
[--] InnoDB is enabled.
[--] InnoDB Thread Concurrency: 0
[OK] InnoDB File per table is activated
[!!] InnoDB buffer pool / data size: 6.0G/25.4G
[OK] Ratio InnoDB log file size / InnoDB Buffer pool size: 768.0M * 2/6.0G should be equal 25%
[OK] InnoDB buffer pool instances: 6
[--] InnoDB Buffer Pool Chunk Size not used or defined in your version
[OK] InnoDB Read buffer efficiency: 100.00% (180155088749 hits/ 180158103035 total)
[!!] InnoDB Write Log efficiency: 37.14% (922821 hits/ 2484433 total)
[OK] InnoDB log waits: 0.00% (0 waits / 1561612 writes)
-------- AriaDB Metrics ----------------------------------------------------------------------------
[--] AriaDB is disabled.
-------- TokuDB Metrics ----------------------------------------------------------------------------
[--] TokuDB is disabled.
-------- XtraDB Metrics ----------------------------------------------------------------------------
[--] XtraDB is disabled.
-------- Galera Metrics ----------------------------------------------------------------------------
[--] Galera is disabled.
-------- Replication Metrics -----------------------------------------------------------------------
[--] Galera Synchronous replication: NO
[--] This server is acting as master for 1 server(s).
[--] Binlog format: STATEMENT
[--] XA support enabled: ON
[--] Semi synchronous replication Master: Not Activated
[--] Semi synchronous replication Slave: Not Activated
[--] No replication setup for this server or replication not started.
-------- Recommendations ---------------------------------------------------------------------------
General recommendations:
Control warning line(s) into /var/log/mysql/mysql_error.log file
Control error line(s) into /var/log/mysql/mysql_error.log file
Set up a Secure Password for user@host ( SET PASSWORD FOR 'user'@'SpecificDNSorIp' = PASSWORD('secure_password'); )
Restrict Host for user@% to user@SpecificDNSorIp
4 user(s) used basic or weak password.
Adjust your join queries to always utilize indexes
Increase table_open_cache gradually to avoid file descriptor limits
Read this before increasing table_open_cache over 64: [snip - bit.ly not allowed]
Read this before increasing for MariaDB https://mariadb.com/kb/en/library/optimizing-table_open_cache/
This is MyISAM only table_cache scalability problem, InnoDB not affected.
See more details here: https://bugs.mysql.com/bug.php?id=49177
This bug already fixed in MySQL 5.7.9 and newer MySQL versions.
Beware that open_files_limit (24000) variable
should be greater than table_open_cache (3200)
Consider installing Sys schema from https://github.com/mysql/mysql-sys
Variables to adjust:
join_buffer_size (> 20.0M, or always use indexes with JOINs)
table_open_cache (> 3200)
innodb_buffer_pool_size (>= 25.4G) if possible.
我可以采取什么步骤来尝试调试此问题虽然目前还没有发生? 我可以采取什么步骤来尝试解决该问题?
答案1
每秒速率 = RPS - 针对 my.cnf [mysqld] 部分的建议
key_cache_age_threshold=7200 # from 300 seconds to reduce key_reads RPS of 150
max_connections=128 # from 500 because max_used_connections < 60
innodb_io_capacity=1000 # from 200 to allow higher IOPS
read_rnd_buffer_size=256K # from 16M to reduce handler_read_rnd_next RPS of 36978
table_definition_cache=10000 # from 2000 to reduce opened_table_definitions RPHr of 6,340