我已经为此苦思冥想了一个星期了……
设置:Ubuntu 16.04 LAMP 堆栈,php 5.6(是的,我知道我需要升级它)。该服务器为一个组织运行许多站点,其中一个站点经常被公众使用。其他站点大多是用于狭义用途的子域。大多数站点都是开源应用程序。在 AWS T2.large(2 个处理器,8GB RAM)上。存储是 SSD。单个服务器同时托管 apache 和 MySQL。
这套系统运行了好几年,效果非常好。6 月份有大约 10 天的时间,服务器负载突然猛增(超过 100),我做了一些调整,但最终通过创建新的服务器实例解决了问题。归咎于硬件性能下降(?)。
大约 10 天前,这种情况又开始发生。这种模式很有趣。虽然不是完全一致的,但负载往往每 30 分钟就会激增一次。我没有看到与此相关的 cron 作业。负载激增到 16-18 左右。有时发生这种情况时,服务器会变得没有响应(这并不奇怪),但它看起来并不像众所周知的“楔入服务器”。当它没有响应时,负载会降低,直到远低于 0.5,并保持在那里而没有恢复。基本上,它死了。apache 重新启动并不能让它恢复,但 apache stop/apache start 可以让它恢复。当它处于“死机”状态时,top 显示 mysql 使用了大约 0.2% 的 CPU。当它处于重负载时,mysql 使用了大约 2 个 CPU 核心的 30-60%(单个核心 CPU 的 60-120%)。
除了不断增加数据之外,我想不出自此以来有什么变化。“大”数据库的大小约为 1GB。
我曾尝试在六月份为服务器创建一个新的实例,但这次没有帮助。
我曾尝试进行调整,但由于需要多次重启,很难积累良好的数据。
MYSQL已经运行了一天,下面是输出tuning-primer.sh
:
-- MYSQL PERFORMANCE TUNING PRIMER --
- By: Matthew Montgomery -
MySQL Version 5.7.27-0ubuntu0.16.04.1-log x86_64
Uptime = 1 days 5 hrs 33 min 50 sec
Avg. qps = 338
Total Questions = 36005313
Threads Connected = 1
Warning: Server has not been running for at least 48hrs.
It may not be safe to use these recommendations
To find out more information on how each of these
runtime variables effects performance visit:
http://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html
Visit http://www.mysql.com/products/enterprise/advisors.html
for info about MySQL's Enterprise Monitoring and Advisory Service
SLOW QUERIES
The slow query log is NOT enabled.
Current long_query_time = 10.000000 sec.
You have 210572 out of 36005334 that take longer than 10.000000 sec. to complete
Your long_query_time seems to be fine
BINARY UPDATE LOG
The binary update log is NOT enabled.
You will not be able to do point in time recovery
See http://dev.mysql.com/doc/refman/5.7/en/point-in-time-recovery.html
WORKER THREADS
Current thread_cache_size = 16
Current threads_cached = 15
Current threads_per_sec = 0
Historic threads_per_sec = 0
Your thread_cache_size is fine
MAX CONNECTIONS
Current max_connections = 300
Current threads_connected = 1
Historic max_used_connections = 41
The number of used connections is 13% of the configured maximum.
Your max_connections variable seems to be fine.
INNODB STATUS
Current InnoDB index space = 70 M
Current InnoDB data space = 195 M
Current InnoDB buffer pool free = 25 %
Current innodb_buffer_pool_size = 384 M
Depending on how much space your innodb indexes take up it may be safe
to increase this value to up to 2 / 3 of total system memory
MEMORY USAGE
Max Memory Ever Allocated : 1.75 G
Configured Max Per-thread Buffers : 2.58 G
Configured Max Global Buffers : 1.40 G
Configured Max Memory Limit : 3.98 G
Physical Memory : 7.79 G
Max memory limit seem to be within acceptable norms
KEY BUFFER
Current MyISAM index space = 853 M
Current key_buffer_size = 1.00 G
Key cache miss rate is 1 : 16883
Key buffer free ratio = 78 %
Your key_buffer_size seems to be too high.
Perhaps you can use these resources elsewhere
QUERY CACHE
Query cache is enabled
Current query_cache_size = 16 M
Current query_cache_used = 11 M
Current query_cache_limit = 1 M
Current Query cache Memory fill ratio = 72.74 %
Current query_cache_min_res_unit = 4 K
MySQL won't cache query results that are larger than query_cache_limit in size
SORT OPERATIONS
Current sort_buffer_size = 256 K
Current read_rnd_buffer_size = 256 K
Sort buffer seems to be fine
JOINS
Current join_buffer_size = 8.00 M
You have had 6245 queries where a join could not use an index properly
join_buffer_size >= 4 M
This is not advised
You should enable "log-queries-not-using-indexes"
Then look for non indexed joins in the slow query log.
OPEN FILES LIMIT
Current open_files_limit = 3910 files
The open_files_limit should typically be set to at least 2x-3x
that of table_cache if you have heavy MyISAM usage.
Your open_files_limit value seems to be fine
TABLE CACHE
Current table_open_cache = 1800 tables
Current table_definition_cache = 2200 tables
You have a total of 2189 tables
You have 1792 open tables.
Current table_cache hit rate is 2%
, while 99% of your table cache is in use
You should probably increase your table_cache
TEMP TABLES
Current max_heap_table_size = 384 M
Current tmp_table_size = 384 M
Of 238038 temp tables, 17% were created on disk
Created disk tmp tables ratio seems fine
TABLE SCANS
Current read_buffer_size = 128 K
Current table scan ratio = 125 : 1
read_buffer_size seems to be fine
TABLE LOCKING
Current Lock Wait ratio = 1 : 52
You may benefit from selective use of InnoDB.
If you have long running SELECT's against MyISAM tables and perform
frequent updates consider setting 'low_priority_updates=1'
If you have a high concurrency of inserts on Dynamic row-length tables
consider setting 'concurrent_insert=ALWAYS'.
Apache has been running for .5 hours. That won't yield useful recommendations but does show configuration data. Here's the output of apache2buddy.pl:
#####################################################################
apache2buddy.pl report for ip-xxx-xxx-xxx-xxx.ec2.internal (xxx.xxx.xxx.xxx)
#####################################################################
[ OK ] This script is being run as root.
[ OK ] The utility 'pmap' exists and is available for use: /usr/bin/pmap
[ OK ] The utility 'netstat' exists and is available for use: /bin/netstat
[ OK ] 'php' exists and is available for use: /usr/bin/php
[ OK ] The utility 'apachectl' exists and is available for use: /usr/sbin/apachectl
[ OK ] The 'python' binary exists and is available for use: /usr/bin/python
[ OK ] The port (port 80) is a valid port.
[ -- ] We are attempting to discover the operating system type and version number ...
[ -- ] Distro: Ubuntu
[ -- ] Version: 16.04
[ -- ] Codename: xenial
[ OK ] This distro is supported by apache2buddy.pl.
[ OK ] This distro version is supported by apache2buddy.pl.
[ -- ] Hostname: ip-xxx-xxx-xxx-xxx.ec2.internal
[ -- ] Primary IP: xxx.xxx.xxx.xxx
[ -- ] We are checking the service running on port 80...
[ -- ] The process listening on port 80 is /usr/sbin/apache2
[ -- ] The process running on port 80 is Apache/2.4.18 (Ubuntu).
[ -- ] The full path to the Apache config file is: /etc/apache2/apache2.conf
[ -- ] Apache is using prefork model.
[ -- ] pidfile setting is /var/run/apache2/apache2$SUFFIX.pid.
[ -- ] Actual pidfile is /var/run/apache2/apache2.pid.
[ -- ] Parent PID: 14833.
[ OK ] Memory usage of parent PID is less than 50MB: 9712 Kilobytes.
[ -- ] Apache has been running 0d 0h 08m 55s.
[ !! ] *** LOW UPTIME ***.
[ @@ ] The following recommendations may be misleading - apache has been restarted within the last 24 hours.
[ -- ] Your server has 7981 MB of PHYSICAL memory.
[ >> ] ServerLimit directive not found, assuming default values.
[ -- ] Your ServerLimit setting is 256.
[ -- ] Your MaxRequestWorkers setting is 19.
[ OK ] Current Apache Process Count is 16, including the parent PID.
[ -- ] Number of vhosts detected: 30.
[ -- ] |________ of which 23 are HTTP (specifically, port 80).
[ -- ] |________ of which 7 are HTTPS (specifically, port 443).
[ @@ ] Current Apache vHost Count is greater than maxrequestworkers, which is unusual, but can be valid in some scenarios.
[ -- ] Your MaxRequestsPerChild setting is 5000.
[ -- ] This server is NOT running Plesk.
[ -- ] This server is NOT running cPanel.
[ -- ] Virtualmin Version: 6.07.gpl
[ -- ] Webmin Version: 1.930
Use of uninitialized value $real_config in concatenation (.) or string at -
line 1212 (#1)
(W uninitialized) An undefined value was used as if it were already
defined. It was interpreted as a "" or a 0, but maybe it was a mistake.
To suppress this warning assign a defined value to your variables.
To help you figure out what was undefined, perl will try to tell you
the name of the variable (if any) that was undefined. In some cases
it cannot do this, so it also tells you what operation you used the
undefined value in. Note, however, that perl optimizes your program
and the operation displayed in the warning may not necessarily appear
literally in your program. For example, "that $foo" is usually
optimized into "that " . $foo, and the warning will refer to the
concatenation (.) operator, even though there is no . in
your program.
Use of uninitialized value $apache_proc_php in concatenation (.) or string at -
line 2366 (#1)
[ -- ] Your PHP Memory Limit (Per-Process) is .
Use of uninitialized value $apache_proc_php in string eq at - line 2367 (#1)
[ -- ] MySQL Detected => Using 1554.09 MB of memory.
[ OK ] No large log files were found in /var/log/apache2.
[ >> ] MaxClients has been hit recently (maximum of 5 results shown), consider the dates and times below:
[Tue Sep 17 02:01:26.796315 2019] [mpm_prefork:error] [pid 3161] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
[Tue Sep 17 02:35:48.069254 2019] [mpm_prefork:error] [pid 6394] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
[Tue Sep 17 02:56:25.712864 2019] [mpm_prefork:error] [pid 8685] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
[Tue Sep 17 03:32:59.609792 2019] [mpm_prefork:error] [pid 12435] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
[Tue Sep 17 03:52:05.692572 2019] [mpm_prefork:error] [pid 14833] AH00161: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
[ OK ] No PHP Fatal Errors were found.
Use of uninitialized value $_ in pattern match (m//) at - line 2172 (#1)
[ -- ] apache2 is currently using 932.67 MB of memory.
[ -- ] The smallest apache process is using 3.35 MB of memory
[ -- ] The average apache process is using 22.50 MB of memory
[ -- ] The largest apache process is using 29.73 MB of memory
[ OK ] Going by the average Apache process, Apache can potentially use 427.50 MB RAM:
Without considering services: 5.36 % of total installed RAM
Considering extra services: 6.66 % of remaining RAM
[ OK ] Going by the largest Apache process, Apache can potentially use 564.88 MB RAM:
Without considering services: 7.08 % of total installed RAM
Considering extra services: 8.79 % of remaining RAM
--------------------------------------------------------------------------------
### GENERAL FINDINGS & RECOMMENDATIONS ###
--------------------------------------------------------------------------------
Apache2buddy.pl report for server: ip-xxx-xxx-xxx-xxx.ec2.internal (xxx.xxx.xxx.xxx):
Settings considered for this report:
[ !! ] *** LOW UPTIME ***.
[ @@ ] The following recommendations may be misleading - apache has been restarted within the last 24 hours.
Your server's physical RAM: 7981 MB
Remaining Memory after other services considered: 6426 MB
Apache's MaxRequestWorkers directive: 19 <--------- Current Setting
Apache MPM Model: prefork
Largest Apache process (by memory): 29 MB
[ !! ] Vhost count exceeds MaxRequestWorkers limits!
[ !! ] Your MaxRequestWorkers setting is too low.
Your recommended MaxRequestWorkers setting is between 194 and 216. <------- Acceptable Range (10% of MAX)
Max potential memory usage: 564 MB
Percentage of TOTAL RAM allocated to Apache: 7.08 %
Percentage of REMAINING RAM allocated to Apache: 8.79 %
--------------------------------------------------------------------------------
A log file entry has been made in: /var/log/apache2buddy.log for future reference.
Last 5 entries:
2019/09/14 18:49:19 Uptime: "0d 01h 16m 17s" Model: "Prefork" Memory: "7981 MB" MaxRequestWorkers: "45" Recommended: "96" Smallest: "3.32 MB" Avg: "31.03 MB" Largest: "72.06 MB" Highest Pct Remaining RAM: "46.70%" (40.64% TOTAL RAM)
2019/09/15 01:53:51 Uptime: "0d 05h 02m 19s" Model: "Prefork" Memory: "7981 MB" MaxRequestWorkers: "45" Recommended: "71" Smallest: "3.32 MB" Avg: "53.25 MB" Largest: "96.15 MB" Highest Pct Remaining RAM: "63.15%" (54.22% TOTAL RAM)
2019/09/15 22:24:37 Uptime: "0d 0h 25m 22s" Model: "Prefork" Memory: "7981 MB" MaxRequestWorkers: "19" Recommended: "150" Smallest: "3.32 MB" Avg: "21.83 MB" Largest: "49.91 MB" Highest Pct Remaining RAM: "12.61%" (11.89% TOTAL RAM)
2019/09/16 05:47:19 Uptime: "0d 04h 24m 25s" Model: "Prefork" Memory: "7981 MB" MaxRequestWorkers: "19" Recommended: "96" Smallest: "8.30 MB" Avg: "43.31 MB" Largest: "69.24 MB" Highest Pct Remaining RAM: "19.68%" (16.49% TOTAL RAM)
2019/09/17 04:00:53 Uptime: "0d 0h 08m 55s" Model: "Prefork" Memory: "7981 MB" MaxRequestWorkers: "19" Recommended: "216" Smallest: "3.35 MB" Avg: "22.50 MB" Largest: "29.73 MB" Highest Pct Remaining RAM: "8.79%" (7.08% TOTAL RAM)