We have the following monit configuration, which restarts Tomcat if it cannot connect to it:
check host Tomcat-Foo with address localhost
  stop program = "/usr/bin/systemctl stop tomcat.service"
  start program = "/usr/bin/systemctl start tomcat.service" with timeout 360 seconds
  if failed host localhost
    port 8081
    protocol http
    request "/foo/"
    for 3 times within 5 cycles
  then alert
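The problem is that startup takes a long time, and monit seems to keep checking in the meantime. So while Tomcat is still starting, monit apparently decides it is down "again" and triggers another restart, which turns into a restart loop.
Is there a simple way to make monit pause or disable the check until Tomcat has actually finished restarting?
Or should this configuration look completely different, so that the problem never arises in the first place?
Answer 1
Try this: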
check host Tomcat-Foo with address localhost every 2 cycles
...
With this, monit checks Tomcat-Foo only on every second cycle, which gives it more time to start. Adjust the number of cycles if you need more or less time.
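To pick a value for `every N cycles`, it helps to work out how much wall-clock time that actually buys. The sketch below assumes a poll interval of 60 seconds (the value given to `set daemon` in the monit control file); that number is an assumption, so substitute your own.

```shell
#!/bin/bash
# Assumption: monit polls every 60 seconds ("set daemon 60").
daemon_interval=60
# Value used in "every 2 cycles" from the answer above.
every_n_cycles=2

# Time between two consecutive checks of Tomcat-Foo.
check_interval=$(( daemon_interval * every_n_cycles ))
echo "Tomcat-Foo is checked every ${check_interval}s"
```

If Tomcat needs longer than that to come up, raise `every_n_cycles` until the interval comfortably exceeds the observed startup time.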
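Answer 2
Here is the "slightly" hacky solution we currently use. Basically, if Tomcat does not come up within the allowed cycles and is therefore restarted again (and again...), the `if N restarts` check runs a script that turns monitoring off for a while.
We also changed the monit configuration to target the Tomcat process, so it is no longer just a host check.
monit configuration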
check process Tomcat with pidfile /opt/tomcat/current/bin/catalina.pid
  stop program = "/usr/bin/systemctl stop tomcat.service"
  start program = "/usr/bin/systemctl start tomcat.service"
  if failed host localhost port 8081
    protocol http request "/productconfigurator/"
    for 3 times within 5 cycles
  then restart
  if 2 restarts within 3 cycles
  then exec "/etc/monit-wait.sh tomcat 5m"
monit-wait.sh
#!/bin/bash
# $1 = monit service name, $2 = pause duration in sleep(1) syntax (e.g. 5m)
monit unmonitor "$1"
sleep "$2"
monit monitor "$1"
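Not especially pretty, but at least it seems to work. Of course, an alternative might be to use this script as the action on failure, but yeah... Anyway, better suggestions are still welcome :)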
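If the wait script is called with a missing argument, it would run `monit unmonitor` with nothing useful and then fail in `sleep`. A slightly more defensive variant might look like the sketch below. The function name `pause_monitoring` and the `MONIT_BIN` override are hypothetical additions for illustration, not part of the original script.

```shell
#!/bin/bash
# Sketch only. MONIT_BIN is a hypothetical override, useful for dry runs
# (e.g. MONIT_BIN=echo prints the commands instead of running monit).
MONIT_BIN="${MONIT_BIN:-monit}"

# Suspend monitoring of one service for a while, then resume it.
pause_monitoring() {
    local service="$1" delay="$2"
    if [[ -z "$service" || -z "$delay" ]]; then
        echo "usage: pause_monitoring <service> <delay>" >&2
        return 2
    fi
    # Stop here if monit refuses, so we do not sleep for nothing.
    "$MONIT_BIN" unmonitor "$service" || return 1
    sleep "$delay"
    "$MONIT_BIN" monitor "$service"
}
```

Ending the script with `pause_monitoring "$1" "$2"` would keep the same command line as before, for example `/etc/monit-wait.sh tomcat 5m`.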
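Answer 3
Just add a timeout to the start program and a sleep to the start script. For some reason, appending "&& sleep 5m" to the start command does not work. It would be better to find a proper way to delay the start command.
Also note that if Apache sits in front of Tomcat, a host check will always succeed! That is why the http-check.sh below works by looking for a keyword in the response.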
/etc/monit/bin/tomcatstart.sh
#!/bin/bash
/usr/sbin/service tomcat8 start
sleep 5m
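The fixed `sleep 5m` always blocks for the full five minutes, even when Tomcat is ready sooner. One possible alternative, sketched here under assumptions, is to poll until a probe command succeeds or a deadline passes. The function name `wait_until`, the probe URL and the timings are hypothetical and not part of the original answer.

```shell
#!/bin/bash
# Sketch only: run a probe command repeatedly until it succeeds
# or until the timeout (in seconds) has elapsed.
wait_until() {
    local timeout="$1" interval="$2"
    shift 2
    # SECONDS is a bash builtin counting seconds since the shell started.
    local deadline=$(( SECONDS + timeout ))
    while (( SECONDS < deadline )); do
        if "$@"; then
            return 0    # probe succeeded, the service is up
        fi
        sleep "$interval"
    done
    return 1            # gave up after the timeout
}

# Possible use in a start script (URL is an assumption):
#   /usr/sbin/service tomcat8 start
#   wait_until 300 5 wget -q -O /dev/null http://localhost:8081/
```

The timeout passed to `wait_until` should stay below the `with timeout` value on the monit `start program` line, so that the script finishes before monit gives up on it.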
In /etc/monit/conf-enabled/tomcat8
check program http-check with path "/etc/monit/bin/http-check.sh"
  group tomcat8
  start program = "/etc/monit/bin/tomcatstart.sh" with timeout 450 seconds
  stop program = "/usr/sbin/service tomcat8 stop"
  if status != 0 for 2 times within 2 cycles
  then restart
/etc/monit/bin/http-check.sh
#!/bin/bash
# Fetch the page and succeed only if it contains the expected keyword.
RESULT="$(wget -qO- https://www.host.com)"
if [[ $RESULT == *"Contact"* ]]
then
  exit 0
else
  exit 1
fi
This works as expected: monit waits 5 minutes without making another attempt.
[EDT May 30 13:27:56] error : 'http-check' '/etc/monit/bin/http-check.sh' failed with exit status (1) -- no output
[EDT May 30 13:27:56] info : 'http-check' trying to restart
[EDT May 30 13:27:56] info : 'http-check' stop: /usr/sbin/service
[EDT May 30 13:27:56] info : 'http-check' start: /etc/monit/tomcatstart.sh
[EDT May 30 13:34:01] error : 'http-check' '/etc/monit/bin/http-check.sh' failed with exit status (1) -- no output
[EDT May 30 13:34:01] info : 'http-check' trying to restart
[EDT May 30 13:34:01] info : 'http-check' stop: /usr/sbin/service
[EDT May 30 13:34:02] info : 'http-check' start: /etc/monit/tomcatstart.sh
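The keyword test in http-check.sh can also be split into small functions, which makes it easy to bound how long the download may take. This is only a sketch: `fetch_page` and `contains_keyword` are hypothetical helper names, and the 10-second timeout with a single try is an assumed choice meant to keep a hung backend from stalling the check.

```shell
#!/bin/bash
# Sketch only: fetch_page and contains_keyword are hypothetical helper names.

# Download a page, giving up after 10 seconds and without retrying.
fetch_page() {
    wget -qO- --timeout=10 --tries=1 "$1"
}

# Succeed only if the body contains the keyword as a literal substring.
contains_keyword() {
    local body="$1" keyword="$2"
    [[ "$body" == *"$keyword"* ]]
}

# Example use inside a check script (URL and keyword taken from the answer):
#   contains_keyword "$(fetch_page https://www.host.com)" "Contact" || exit 1
```

Because the keyword is quoted inside the pattern, it is matched literally, so characters such as `*` or `?` in the keyword are not treated as wildcards.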