I'll try to be as clear as I can: my brain is about to explode, just like those exploding kittens.
Both machines run CentOS 7:
[[email protected]]# cat /proc/version
Linux version 3.10.0-693.11.6.el7.x86_64 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Thu Jan 4 01:06:37 UTC 2018
The latest NRPE from EPEL:
[[email protected]]# ./check_nrpe -H 192.168.10.2
NRPE v3.2.0
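I am trying to restart a service from the Nagios server so that I can set up an event handler. It all started with a pile of scripts, but I have now narrowed the problem down to this: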
[[email protected]]# ./check_nrpe -H 192.168.10.2 -c restart
NRPE: Unable to read output
[[email protected]]# ./check_nrpe -H 192.168.10.2 -c status
(... correct service status output ...)
Loaded: loaded (/usr/lib/systemd/system/cachefilesd.service
(... correct service status output ...)
So I can query the service status, but I cannot start or restart the service.
[[email protected]]# cat /etc/nagios/nrpe.conf
[...]
nrpe_user=nrpe
nrpe_group=nrpe
allowed_hosts=127.0.0.1,192.168.10.1
command[status]=/lib64/nagios/plugins/status.sh
command[restart]=/lib64/nagios/plugins/restart.sh
[...]
[[email protected]]# cat /lib64/nagios/plugins/status.sh
#!/bin/bash
sudo systemctl status cachefilesd
exit 0
and
[[email protected]]# cat /lib64/nagios/plugins/restart.sh
#!/bin/bash
sudo systemctl restart cachefilesd
exit 0
The sudo configuration:
[[email protected]]# cat /etc/sudoers
# Defaults specification
Defaults: nrpe !requiretty
Defaults: nagios !requiretty
nagios ALL = NOPASSWD: /sbin/service,/usr/bin/systemctl,/usr/sbin/service
nrpe ALL = NOPASSWD: /sbin/service,/usr/bin/systemctl,/usr/sbin/service
If I run:
[[email protected]]# sudo -u nrpe -H ./restart-cachefilesd.sh
everything works fine.
I enabled debugging in NRPE and got:
nrpe[5431]: Host address is in allowed_hosts
nrpe[5431]: Host 192.168.10.1 is asking for command 'restart' to be run...
nrpe[5431]: Running command: /lib64/nagios/plugins/restart.sh
nrpe[5432]: WARNING: my_system() seteuid(0): Operation not permitted
nrpe[5431]: Command completed with return code 0 and output:
nrpe[5431]: Return Code: 3, Output: NRPE: Unable to read output
nrpe[5431]: Connection from 192.168.10.1 closed.
I also tried strace, but the output was too much for me to make sense of...
Answer 1
You should not put sudo inside the script. Instead, put sudo in the command definition in the nrpe.cfg file:
command[status]=sudo /lib64/nagios/plugins/status.sh
instead of
command[status]=/lib64/nagios/plugins/status.sh
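A related point that may be worth checking, offered as an assumption rather than something this thread confirms: NRPE generally reports "NRPE: Unable to read output" when the command writes nothing to stdout, and `systemctl restart` prints nothing when it succeeds. The sketch below is a hypothetical helper (the name `run_and_report` and the message wording are made up for illustration) that runs a command, captures stdout and stderr, and always prints one status line with a Nagios-style exit code.

```shell
#!/bin/bash
# Hypothetical helper, not part of NRPE or the Nagios plugins package.
# Runs the given command and always prints one status line to stdout,
# so the caller never sees empty output.
run_and_report() {
    local label="$1"
    shift
    local output rc
    # Capture stdout and stderr together; declare 'local' separately so
    # that $? reflects the command itself and not the 'local' builtin.
    output=$("$@" 2>&1)
    rc=$?
    if [ "$rc" -eq 0 ]; then
        # Append the captured output only if there is any.
        echo "OK - ${label} succeeded${output:+: ${output}}"
        return 0
    fi
    echo "CRITICAL - ${label} failed (exit ${rc}): ${output}"
    return 2
}
```

With the command defined as suggested above (sudo in nrpe.cfg), restart.sh could then end with `run_and_report "cachefilesd restart" systemctl restart cachefilesd; exit $?` instead of an unconditional `exit 0`. Note that if the command line becomes `sudo /lib64/nagios/plugins/restart.sh`, the sudoers rule for nrpe would also have to list that script path, since the rules shown in the question only cover systemctl and service.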