你能帮助我找出哪个 SQL 查询导致我的 MySQL 内存耗尽吗?
因此,在控制台中运行 dmesg 命令,我得到以下结果:
[root@dmzemf httpd]# dmesg -T | grep 'Out of memory'
[Tue Apr 16 11:21:19 2019] Out of memory: Kill process 7522 (mysqld) score 185 or sacrifice child
[Tue Apr 16 11:23:26 2019] Out of memory: Kill process 11628 (mysqld) score 108 or sacrifice child
[Tue Apr 16 11:25:36 2019] Out of memory: Kill process 11847 (mysqld) score 109 or sacrifice child
[Tue Apr 16 11:25:36 2019] Out of memory: Kill process 12010 (httpd) score 106 or sacrifice child
[Tue Apr 16 11:53:14 2019] Out of memory: Kill process 12070 (mysqld) score 116 or sacrifice child
[Tue Apr 16 11:53:14 2019] Out of memory: Kill process 12008 (httpd) score 99 or sacrifice child
[Tue Apr 16 12:04:01 2019] Out of memory: Kill process 13530 (mysqld) score 127 or sacrifice child
[Tue Apr 16 12:04:01 2019] Out of memory: Kill process 13554 (mysqld) score 128 or sacrifice child
[Tue Apr 16 12:04:01 2019] Out of memory: Kill process 13563 (mysqld) score 128 or sacrifice child
[Tue Apr 16 14:01:15 2019] Out of memory: Kill process 18465 (mysqld) score 117 or sacrifice child
[Tue Apr 16 14:01:15 2019] Out of memory: Kill process 18666 (httpd) score 116 or sacrifice child
[Tue Apr 16 14:07:07 2019] Out of memory: Kill process 20734 (mysqld) score 120 or sacrifice child
[Tue Apr 16 14:07:07 2019] Out of memory: Kill process 20683 (httpd) score 75 or sacrifice child
[Tue Apr 16 14:07:11 2019] Out of memory: Kill process 21072 (httpd) score 100 or sacrifice child
[root@dmzemf httpd]#
我还检查了文件 /var/log/mariadb.log:
[root@dmzemf mariadb]# tail mariadb.log
Version: '5.5.60-MariaDB' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server
190416 14:01:13 mysqld_safe Number of processes running now: 0
190416 14:01:13 mysqld_safe mysqld restarted
190416 14:01:13 InnoDB: The InnoDB memory heap is disabled
190416 14:01:13 InnoDB: Mutexes and rw_locks use GCC atomic builtins
190416 14:01:13 InnoDB: Compressed tables use zlib 1.2.7
190416 14:01:13 InnoDB: Using Linux native AIO
190416 14:01:13 [Note] /usr/libexec/mysqld (mysqld 5.5.60-MariaDB) starting as process 20734 ...
190416 14:01:13 InnoDB: Initializing buffer pool, size = 128.0M
190416 14:01:13 InnoDB: Completed initialization of buffer pool
190416 14:01:13 InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
190416 14:01:14 InnoDB: Waiting for the background threads to start
190416 14:01:15 Percona XtraDB (http://www.percona.com) 5.5.59-MariaDB-38.11 started; log sequence number 9910582483
190416 14:01:15 [Note] Plugin 'FEEDBACK' is disabled.
190416 14:01:15 [Note] Server socket created on IP: '0.0.0.0'.
190416 14:01:15 [Note] Event Scheduler: Loaded 0 events
190416 14:01:15 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.5.60-MariaDB' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server
190416 14:07:06 mysqld_safe Number of processes running now: 0
190416 14:07:06 mysqld_safe mysqld restarted
190416 14:07:07 [Note] /usr/libexec/mysqld (mysqld 5.5.60-MariaDB) starting as process 21142 ...
190416 14:07:08 InnoDB: The InnoDB memory heap is disabled
190416 14:07:08 InnoDB: Mutexes and rw_locks use GCC atomic builtins
190416 14:07:08 InnoDB: Compressed tables use zlib 1.2.7
190416 14:07:08 InnoDB: Using Linux native AIO
190416 14:07:08 InnoDB: Initializing buffer pool, size = 128.0M
190416 14:07:08 InnoDB: Completed initialization of buffer pool
190416 14:07:08 InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
190416 14:07:10 InnoDB: Waiting for the background threads to start
190416 14:07:11 Percona XtraDB (http://www.percona.com) 5.5.59-MariaDB-38.11 started; log sequence number 9913403499
190416 14:07:11 [Note] Plugin 'FEEDBACK' is disabled.
190416 14:07:11 [Note] Server socket created on IP: '0.0.0.0'.
190416 14:07:11 [Note] Event Scheduler: Loaded 0 events
190416 14:07:11 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.5.60-MariaDB' socket: '/var/lib/mysql/mysql.sock' port: 3306 MariaDB Server
文件 /var/log/httpd/error_log 返回以下内容:
[root@dmzemf httpd]# tail error_log
[Sun Apr 14 03:17:16.513647 2019] [mpm_prefork:notice] [pid 1501] AH00163: Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips PHP/5.4.16 configured -- resuming normal operations
[Sun Apr 14 03:17:16.513667 2019] [core:notice] [pid 1501] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Mon Apr 15 11:26:38.754999 2019] [mpm_prefork:notice] [pid 1501] AH00170: caught SIGWINCH, shutting down gracefully
[Mon Apr 15 11:26:39.863554 2019] [core:notice] [pid 2894] SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0
[Mon Apr 15 11:26:39.957750 2019] [mpm_prefork:notice] [pid 2894] AH00163: Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips PHP/5.4.16 configured -- resuming normal operations
[Mon Apr 15 11:26:39.957774 2019] [core:notice] [pid 2894] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Tue Apr 16 08:32:42.820348 2019] [mpm_prefork:notice] [pid 2894] AH00170: caught SIGWINCH, shutting down gracefully
[Tue Apr 16 08:32:43.949798 2019] [core:notice] [pid 1909] SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0
[Tue Apr 16 08:32:44.056357 2019] [mpm_prefork:notice] [pid 1909] AH00163: Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips PHP/5.4.16 configured -- resuming normal operations
[Tue Apr 16 08:32:44.056391 2019] [core:notice] [pid 1909] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Tue Apr 16 13:21:09.454338 2019] [mpm_prefork:notice] [pid 1909] AH00170: caught SIGWINCH, shutting down gracefully
[Tue Apr 16 13:21:10.655994 2019] [core:notice] [pid 18170] SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0
[Tue Apr 16 13:21:10.934372 2019] [mpm_prefork:notice] [pid 18170] AH00163: Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.2k-fips PHP/5.4.16 configured -- resuming normal operations
[Tue Apr 16 13:21:10.934395 2019] [core:notice] [pid 18170] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[root@dmzemf httpd]#
另外,也许可以帮助文件 /etc/my.cnf.d/server.cnf 的内容
#
# These groups are read by MariaDB server.
# Use it for options that only the server (but not clients) should see
#
# See the examples of server my.cnf files in /usr/share/mysql/
#
# this is read by the standalone daemon and embedded servers
[server]
# this is only for the mysqld standalone daemon
[mysqld]
max_allowed_packet = 128M
#innodb_buffer_pool_size = 16M
#innodb_log_file_size = 128M
#innodb_log_buffer_size = 8M
# this is only for embedded server
[embedded]
# This group is only read by MariaDB-5.5 servers.
# If you use the same .cnf file for MariaDB of different versions,
# use this group for options that older servers don't understand
[mysqld-5.5]
# These two groups are only read by MariaDB servers, not by MySQL.
# If you use the same .cnf file for MySQL and MariaDB,
# you can put MariaDB-only options here
[mariadb]
[mariadb-5.5]
当然,有一些 PHP 脚本会导致此问题,但我无法识别它们......您能帮帮我吗?
** 更新 **
嗯,我没有说该问题很可能是由处理超过 20 列的表的 PHP 应用程序引起的......
sar -u 1 10 命令的输出如下所示:
对于有效的页面:
[root@dmzemf ~]# sar -u 1 10
Linux 3.10.0-693.11.1.el7.x86_64 (dmzemf.domain.com) 04/18/2019 _x86_64_ (2 CPU)
09:24:20 AM CPU %user %nice %system %iowait %steal %idle
09:24:21 AM all 0.00 0.00 0.50 0.00 0.00 99.50
09:24:22 AM all 0.00 0.00 0.50 0.00 0.00 99.50
09:24:23 AM all 0.50 0.00 0.00 0.00 0.00 99.50
09:24:24 AM all 0.00 0.00 0.00 0.00 0.00 100.00
09:24:25 AM all 24.00 0.00 3.50 0.00 0.00 72.50
09:24:26 AM all 46.73 0.00 4.02 0.50 0.00 48.74
09:24:27 AM all 1.01 0.00 1.51 0.00 0.00 97.49
09:24:28 AM all 1.50 0.00 0.50 0.00 0.00 98.00
09:24:29 AM all 0.00 0.00 0.50 0.00 0.00 99.50
09:24:30 AM all 0.00 0.00 0.00 0.00 0.00 100.00
Average: all 7.37 0.00 1.10 0.05 0.00 91.48
[root@dmzemf ~]#
对于出现问题的页面:
[root@dmzemf ~]# sar -u 1 10
Linux 3.10.0-693.11.1.el7.x86_64 (dmzemf.domain.com) 04/18/2019 _x86_64_ (2 CPU)
09:26:03 AM CPU %user %nice %system %iowait %steal %idle
09:26:04 AM all 0.00 0.00 0.00 0.00 0.00 100.00
09:26:05 AM all 46.27 0.00 4.48 0.00 0.00 49.25
09:26:06 AM all 47.24 0.00 2.51 0.00 0.00 50.25
09:26:07 AM all 7.00 0.00 2.00 0.00 0.00 91.00
09:26:08 AM all 0.00 0.00 0.00 0.00 0.00 100.00
09:26:09 AM all 0.00 0.00 0.00 0.00 0.00 100.00
09:26:10 AM all 0.00 0.00 0.00 0.00 0.00 100.00
09:26:11 AM all 0.00 0.00 0.00 0.00 0.00 100.00
09:26:12 AM all 0.00 0.00 0.00 0.00 0.00 100.00
09:26:13 AM all 0.50 0.00 0.00 0.00 0.00 99.50
Average: all 10.12 0.00 0.90 0.00 0.00 88.98
[root@dmzemf ~]#
此外,top 命令的顶部如下所示:
Tasks: 180 total, 2 running, 178 sleeping, 0 stopped, 0 zombie
%Cpu(s): 27.0 us, 2.3 sy, 0.0 ni, 70.5 id, 0.0 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem : 3882072 total, 182316 free, 1998020 used, 1701736 buff/cache
KiB Swap: 2097148 total, 1780112 free, 317036 used. 1472304 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27141 apache 20 0 821852 473448 4424 R 49.2 12.2 0:03.45 httpd
28289 mysql 20 0 971048 131304 8272 S 10.0 3.4 0:01.06 mysqld
28531 emf 20 0 162040 2368 1592 R 0.3 0.1 0:00.09 top
1 root 20 0 201860 4860 2932 S 0.0 0.1 104:36.02 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:12.95 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:17.60 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root rt 0 0 0 0 S 0.0 0.0 0:02.86 migration/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root 20 0 0 0 0 S 0.0 0.0 118:06.00 rcu_sched
10 root rt 0 0 0 0 S 0.0 0.0 2:47.77 watchdog/0
11 root rt 0 0 0 0 S 0.0 0.0 3:00.40 watchdog/1
12 root rt 0 0 0 0 S 0.0 0.0 0:02.30 migration/1
13 root 20 0 0 0 0 S 0.0 0.0 0:20.28 ksoftirqd/1
15 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H
17 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kdevtmpfs
我希望这可以为我们提供线索,告诉我们在这种情况下我们能做什么......提前谢谢你!
答案1
解决这个问题的方法是设置内存分配限制,使得满载系统不能占用机器的所有内存。
问题是由于您的软件试图分配的内存超过可用内存量而引起的。如果您为数据库设置了最大内存,而数据库的使用率达到 100%,则数据库的速度会变慢,但不会导致系统出现 OOM。Apache 也是如此。
答案2
我认为你在服务器上没有足够的资源 http 和 mysql 有最高的 oom 分数,要检查哪些查询花费了很长时间和资源,请采取显示完整流程列表每 5 分钟输出一次,这将有助于识别查询。对于 http oom,请检查当时的连接数,因为我不确定您的工作负载,请也检查您的 mpm,从日志中我可以看到您正在使用 prefork,这可能不是您这种工作负载的最佳选择,请检查。