我有几个运行 Debian Stretch 的服务器,我不断遇到一个问题,即进程会遇到分段错误并停止。直到我手动运行,它才会恢复服务service apache2 restart
。我试图找出导致这种情况的原因,以便我可以保持服务器正常运行,但我一直无法做到。
该服务器正在运行两个 Wordpress 实例(一个是公共站点,另一个是用于内容目的的私人暂存站点)。两者都通过 Certbot 受到 Let's Encrypt 的保护(我之所以包括它,是因为[ssl:warn]
下面的错误日志中有)。当发生这种情况时,我们没有观察到任何内存或磁盘空间问题。这些服务器上的交换几乎从未使用过。
service apache2 status
以下是发生段错误后的输出:
# service apache2 status
● apache2.service - The Apache HTTP Server
Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Sun 2018-07-08 15:50:24 MST; 29min ago
Process: 11833 ExecStop=/usr/sbin/apachectl stop (code=exited, status=0/SUCCESS)
Process: 11828 ExecStart=/usr/sbin/apachectl start (code=exited, status=0/SUCCESS)
Main PID: 883 (code=exited, status=0/SUCCESS)
Jul 08 15:50:24 hostname systemd[1]: Starting The Apache HTTP Server...
Jul 08 15:50:24 hostname apachectl[11828]: httpd (pid 11770) already running
Jul 08 15:50:24 hostname systemd[1]: Started The Apache HTTP Server.
以下是输出/var/log/apache2/error.log
:
[Sat Jul 07 17:04:51.693795 2018] [core:notice] [pid 29385] AH00052: child pid 18866 exit signal Segmentation fault (11)
[Sat Jul 07 17:04:51.693918 2018] [mpm_prefork:notice] [pid 29385] AH00169: caught SIGTERM, shutting down
[Sat Jul 07 17:04:52.484310 2018] [ssl:warn] [pid 19421] AH01906: bb7f602e547898d78a02b844d49c34bc.4210997990497fe5b452e5c6c4250620.acme.invalid:443:0 server certificate is a C
A certificate (BasicConstraints: CA == TRUE !?)
/page/8/
[Sat Jul 07 17:04:51.693795 2018] [core:notice] [pid 29385] AH00052: child pid 18866 exit signal Segmentation fault (11)
[Sat Jul 07 17:04:51.693918 2018] [mpm_prefork:notice] [pid 29385] AH00169: caught SIGTERM, shutting down
[Sat Jul 07 17:04:52.484310 2018] [ssl:warn] [pid 19421] AH01906: bb7f602e547898d78a02b844d49c34bc.4210997990497fe5b452e5c6c4250620.acme.invalid:443:0 server certificate is a C
A certificate (BasicConstraints: CA == TRUE !?)
[Sat Jul 07 17:04:52.495766 2018] [ssl:warn] [pid 19422] AH01906: bb7f602e547898d78a02b844d49c34bc.4210997990497fe5b452e5c6c4250620.acme.invalid:443:0 server certificate is a C
A certificate (BasicConstraints: CA == TRUE !?)
[Sat Jul 07 17:04:52.498208 2018] [mpm_prefork:notice] [pid 19422] AH00163: Apache/2.4.25 (Debian) OpenSSL/1.0.2l configured -- resuming normal operations
[Sat Jul 07 17:04:52.498230 2018] [core:notice] [pid 19422] AH00094: Command line: '/usr/sbin/apache2'
[Sat Jul 07 17:04:58.754662 2018] [mpm_prefork:notice] [pid 19422] AH00171: Graceful restart requested, doing restart
[Sat Jul 07 17:04:58.766272 2018] [mpm_prefork:notice] [pid 19422] AH00163: Apache/2.4.25 (Debian) OpenSSL/1.0.2l configured -- resuming normal operations
[Sat Jul 07 17:04:58.766290 2018] [core:notice] [pid 19422] AH00094: Command line: '/usr/sbin/apache2'
[Sat Jul 07 17:05:00.039384 2018] [mpm_prefork:notice] [pid 19422] AH00171: Graceful restart requested, doing restart
AH00112: Warning: DocumentRoot [/var/lib/letsencrypt/tls_sni_01_page/] does not exist
[Sat Jul 07 17:05:00.050665 2018] [ssl:warn] [pid 19422] AH01906: 2af61f923209309052c60f342e6a0578.4287ae6d0b1c48707d1262e562b6250a.acme.invalid:443:0 server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Sat Jul 07 17:05:00.051519 2018] [mpm_prefork:notice] [pid 19422] AH00163: Apache/2.4.25 (Debian) OpenSSL/1.0.2l configured -- resuming normal operations
[Sat Jul 07 17:05:00.051528 2018] [core:notice] [pid 19422] AH00094: Command line: '/usr/sbin/apache2'
[Sat Jul 07 17:05:06.063638 2018] [core:error] [pid 19422] AH00546: no record of generation 0 of exiting child 19423
[Sat Jul 07 17:05:06.420374 2018] [mpm_prefork:notice] [pid 19422] AH00171: Graceful restart requested, doing restart
[Sat Jul 07 17:05:06.431243 2018] [mpm_prefork:notice] [pid 19422] AH00163: Apache/2.4.25 (Debian) OpenSSL/1.0.2l configured -- resuming normal operations
[Sat Jul 07 17:05:06.431264 2018] [core:notice] [pid 19422] AH00094: Command line: '/usr/sbin/apache2'
[Sat Jul 07 17:05:07.965690 2018] [mpm_prefork:notice] [pid 19422] AH00171: Graceful restart requested, doing restart
[Sat Jul 07 17:05:07.976624 2018] [mpm_prefork:notice] [pid 19422] AH00163: Apache/2.4.25 (Debian) OpenSSL/1.0.2l configured -- resuming normal operations
[Sat Jul 07 17:05:07.976636 2018] [core:notice] [pid 19422] AH00094: Command line: '/usr/sbin/apache2'
[Sat Jul 07 17:05:07.977526 2018] [core:error] [pid 19422] AH00546: no record of generation 0 of exiting child 19550
[Sat Jul 07 17:05:08.211152 2018] [core:notice] [pid 19422] AH00052: child pid 19531 exit signal Segmentation fault (11)
[Sat Jul 07 17:05:08.211291 2018] [mpm_prefork:notice] [pid 19422] AH00169: caught SIGTERM, shutting down
我们有用于上述日志的以下软件和硬件(我可以提供任何其他可能有用的东西):
- apache2 2.4.25-3+deb9u4
- Debian Stretch 9.4
- 带有 FPM 的 PHP 7.0.27-0+deb9u1
- mariadb 10.1.26-0+deb9u1
- 4 个 Intel(R) Xeon(R) CPU E5-2680 v2 核心 @ 2.80GHz
- 8G内存
- 512MB 交换
- 95G固态硬盘
答案1
该问题是由 Certbot 尝试更新证书引起的。如果我运行certbot renew
,我会遇到这些错误(我已对日志进行了一些清理,以删除域和 IP):
Encountered vhost ambiguity when trying to find a vhost for domain2.com but was unable to ask for user guidance in non-interactive mode. Certbot may need vhosts to be explicitly labelled with ServerName or ServerAlias directives.
Falling back to default vhost *:443...
Waiting for verification...
Cleaning up challenges
Attempting to renew cert (domain2.com) from /etc/letsencrypt/renewal/domain2.com.conf produced an unexpected error: Failed authorization procedure. domain2.com (tls-sni-01): urn:acme:error:unauthorized :: The client lacks sufficient authorization :: Incorrect validation certificate for tls-sni-01 challenge. Requested 942e8fc859beda1b41152fddc9579a1e.feafe6d59b7b25a33c08bca3c4be00e4.acme.invalid from 0.0.0.0:443. Received 2 certificate(s), first certificate had names "www.domain.com". Skipping.
-------------------------------------------------------------------------------
Processing /etc/letsencrypt/renewal/www.domain.com.conf
-------------------------------------------------------------------------------
Cert not yet due for renewal
All renewal attempts failed. The following certs could not be renewed:
/etc/letsencrypt/live/domain2.com/fullchain.pem (failure)
之后,跑步service apache2 status
导致
● apache2.service - The Apache HTTP Server
Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Sun 2018-07-08 18:13:00 MST; 6s ago
Process: 22401 ExecStop=/usr/sbin/apachectl stop (code=exited, status=0/SUCCESS)
Process: 22396 ExecStart=/usr/sbin/apachectl start (code=exited, status=0/SUCCESS)
Main PID: 13700 (code=exited, status=0/SUCCESS)
Jul 08 18:13:00 hostname systemd[1]: Starting The Apache HTTP Server...
Jul 08 18:13:00 hostname apachectl[22396]: httpd (pid 22323) already running
Jul 08 18:13:00 hostname systemd[1]: Started The Apache HTTP Server.
我启用了有问题的 vhost,重新启动了 Apache,然后重新运行certbot renew
,一切正常。它经常崩溃,因为 certbot 默认每天尝试更新两次。