I'm trying to use the Nagios NRPE plugin to talk to my server. I have a command definition that uses it in /etc/nagios/nrpe_local.cfg:
command[check_service]=/usr/lib/nagios/plugins/check_service -s $ARG1$
When I run the command manually in a terminal, it succeeds:
# /usr/lib/nagios/plugins/check_service -s bind9
OK: Service bind9 is running!
When I try to run it from my Nagios server, it complains that the command is not defined:
# /usr/lib/nagios/plugins/check_nrpe -H 10.32.10.3 -c check_service -a bind9
NRPE: Command 'check_service!bind9' not defined
Other check_nrpe commands work, so I don't think the problem is in the server's commands.cfg, but here is the definition anyway:
define command {
command_name check_nrpe
command_line /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$
}
The check looks like this on the server:
define service {
use local-service
host_name dc1,dc2
service_description BIND Service
check_command check_nrpe!check_service!bind9
}
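In the web interface it returns "CRITICAL: Service is not running!", which is not the case.
How can I get check_nrpe to accept an additional argument? I tried enabling dont_blame_nrpe, but that did not allow it to run either.
Edit: after turning on debugging and re-running the check, I get the following in syslog: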
Dec 19 09:01:56 dc1 nrpe[5586]: CONN_CHECK_PEER: checking if host is allowed: 10.32.10.12 port 33962
Dec 19 09:01:56 dc1 nrpe[5586]: Connection from 10.32.10.12 port 33962
Dec 19 09:01:56 dc1 nrpe[5586]: is_an_allowed_host (AF_INET): is host >10.32.10.12< an allowed host >10.32.10.12<
Dec 19 09:01:56 dc1 nrpe[5586]: is_an_allowed_host (AF_INET): is host >10.32.10.12< an allowed host >10.32.10.12<
Dec 19 09:01:56 dc1 nrpe[5586]: is_an_allowed_host (AF_INET): host is in allowed host list!
Dec 19 09:01:56 dc1 nrpe[5586]: Host address is in allowed_hosts
Dec 19 09:01:56 dc1 nrpe[5586]: Host 10.32.10.12 is asking for command 'check_service' to be run...
Dec 19 09:01:56 dc1 nrpe[5586]: Running command: /usr/lib/nagios/plugins/check_service -s
Dec 19 09:01:56 dc1 nrpe[5587]: WARNING: my_system() seteuid(0): Operation not permitted
Dec 19 09:01:56 dc1 nrpe[5586]: Command completed with return code 2 and output: CRITICAL: Service is not running!
Dec 19 09:01:56 dc1 nrpe[5586]: Return Code: 2, Output: CRITICAL: Service is not running!
Dec 19 09:01:56 dc1 nrpe[5586]: Connection from 10.32.10.12 closed.
I have verified that the group in /etc/systemd/system/multi-user.target.wants/nagios-nrpe-server.service matches the nrpe_group parameter in /etc/nagios/nrpe.cfg. The same user exists in /etc/group and /etc/passwd.
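Answer 1
- Make sure you restart the nrpe daemon after enabling dont_blame_nrpe.
- Change the debug directive in nrpe.cfg to 1 and restart the daemon. After that, you should get useful debugging information in the log.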
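A rough sketch of how the two settings above could be checked before restarting. It assumes the config is at /etc/nagios/nrpe.cfg (the path from the question) and that settings are plain key=value lines. nrpe_setting is a hypothetical helper name, not something NRPE ships, and the CFG variable is only there so that a different file (for example nrpe_local.cfg) can be inspected instead.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY from an NRPE-style config file.
# Commented-out lines are ignored; if KEY appears several times, the last
# uncommented occurrence in the file is printed.
# Usage: nrpe_setting FILE KEY
nrpe_setting() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Path taken from the question; override with CFG=/path/to/file if needed.
CFG=${CFG:-/etc/nagios/nrpe.cfg}

if [ -r "$CFG" ]; then
    echo "dont_blame_nrpe=$(nrpe_setting "$CFG" dont_blame_nrpe)"
    echo "debug=$(nrpe_setting "$CFG" debug)"
fi

# After editing the file, restart the daemon so the change takes effect
# (unit name taken from the systemd path in the question):
#   systemctl restart nagios-nrpe-server
```

If either value is missing or 0, edit the file, restart the daemon, and re-run the check while watching syslog.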
Answer 2
The problem is that the Debian nagios-nrpe-server package is not built with --enable-command-args, which is required in order to use dont_blame_nrpe.
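Separately from the build option, it may be worth noting that the check_nrpe definition shown in the question only passes $ARG1$, so the bind9 in check_nrpe!check_service!bind9 is never sent to the agent. That would be consistent with the debug line that ends in "-s" with nothing after it. A minimal sketch of a server-side definition that forwards a second argument with -a (the same flag used in the manual test in the question) is below. The name check_nrpe_args is made up for illustration, and the agent still has to accept arguments for this to work.

```
# Hypothetical command name; forwards the second "!" field to the agent via -a.
define command {
    command_name check_nrpe_args
    command_line /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$ -a $ARG2$
}

# Same service as in the question, pointed at the argument-forwarding command.
define service {
    use local-service
    host_name dc1,dc2
    service_description BIND Service
    check_command check_nrpe_args!check_service!bind9
}
```

Keeping the original check_nrpe definition alongside this one means existing argument-less checks do not need to change.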