I am trying to monitor a web server by searching a page for a particular string over HTTP. The command is defined in command.cfg as follows:
# 'check_http-mysite' command definition
define command {
command_name check_http-mysite
command_line /usr/lib/nagios/plugins/check_http -H mysite.example.com -s "Some text"
}
# 'notify-host-by-sms' command definition
define command {
command_name notify-host-by-sms
command_line /usr/bin/send_sms $CONTACTPAGER$ "Nagios - $NOTIFICATIONTYPE$ :Host$HOSTALIAS$ is $HOSTSTATE$ ($OUTPUT$)"
}
# 'notify-service-by-sms' command definition
define command {
command_name notify-service-by-sms
command_line /usr/bin/send_sms $CONTACTPAGER$ "Nagios - $NOTIFICATIONTYPE$: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ ($OUTPUT$)"
}
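Before looking at notifications, it can help to run the check by hand and read its exit status, since Nagios derives the service state from it. The following is only a minimal sketch, assuming the standard plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, anything else treated here as UNKNOWN); `state_name` is a hypothetical helper, not part of Nagios.

```shell
#!/bin/sh
# Hypothetical helper: map a Nagios plugin exit code to its state name.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;   # 3 and any unexpected code
    esac
}

# Plugin path and arguments are copied from the check_http-mysite definition.
PLUGIN=/usr/lib/nagios/plugins/check_http
if [ -x "$PLUGIN" ]; then
    "$PLUGIN" -H mysite.example.com -s "Some text"
    rc=$?   # save the plugin's exit status before running anything else
    echo "State: $(state_name "$rc")"
else
    echo "check_http not found at $PLUGIN"
fi
```

If the string is missing from the page, the plugin should exit with 2 and the sketch prints `State: CRITICAL`, which matches what the web interface reports.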
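Now, if nagios does not find "Some text" on the home page of mysite.example.com, it should notify the contact by SMS through the Clickatell HTTP API. I have a script for this that I have tested and found to work fine.
Whenever I change the command definition to search for a string that is not on the page and restart nagios, the web interface shows that the string was not found. What I don't understand is why no notification is sent, even though I have defined the host, hostgroup, contact, contactgroup, service and so on. What am I missing? Through the CGI web interface I can see that notifications are defined and enabled, yet I receive neither email nor SMS notifications on hard state changes. Here are my definitions.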
Host configuration file
define host {
use generic-host
host_name HAL
alias IBM-1
address xxx.xxx.xxx.xxx
check_command check_http-mysite
}
hostgroups_nagios2.cfg
# my website
define hostgroup{
hostgroup_name my-servers
alias All My Servers
members HAL
}
contacts_nagios2.cfg
define contact {
contact_name colin
alias Colin Y
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r,f,s
host_notification_options d,u,r,f,s
service_notification_commands notify-service-by-email,notify-service-by-sms
host_notification_commands notify-host-by-email,notify-host-by-sms
email [email protected]
pager +254xxxxxxxxx
}
define contactgroup{
contactgroup_name site_admin
alias Site Administrator
members colin
}
services_nagios2.cfg
# check for particular string in page via http
define service {
hostgroup_name my-servers
service_description STRING CHECK
check_command check_http-mysite
use generic-service
notification_interval 0 ; set > 0 if you want to be renotified
contacts colin
contact_groups site_admin
}
Can anyone tell me where I am going wrong?
Below are the definitions of generic-host and generic-service.
generic-service_nagios2.cfg
# generic service template definition
define service{
name generic-service ; The 'name' of this service template
active_checks_enabled 1 ; Active service checks are enabled
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
obsess_over_service 1 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
failure_prediction_enabled 1 ; Failure prediction is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
notification_interval 0 ; Only send notifications on status change by default.
is_volatile 0
check_period 24x7
normal_check_interval 5
retry_check_interval 1
max_check_attempts 4
notification_period 24x7
notification_options w,u,c,r
contact_groups site_admin
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
generic-host_nagios2.cfg
define host{
name generic-host ; The name of this host template
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
failure_prediction_enabled 1 ; Failure prediction is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
max_check_attempts 10
notification_interval 0
notification_period 24x7
notification_options d,u,r
contact_groups site_admin
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
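One way to narrow down a problem like this is to check whether Nagios attempted a notification at all. Nagios writes a `HOST NOTIFICATION:` or `SERVICE NOTIFICATION:` entry to its log each time it runs a notification command. The sketch below filters for those entries; the log path is an assumption (the `log_file` setting in nagios.cfg gives the real one) and `show_notifications` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print the notification entries found in a Nagios log file.
show_notifications() {
    grep -E '(HOST|SERVICE) NOTIFICATION:' "$1"
}

# Assumed log location; override with NAGIOS_LOG=/path/to/nagios.log
NAGIOS_LOG=${NAGIOS_LOG:-/var/log/nagios2/nagios.log}
if [ -r "$NAGIOS_LOG" ]; then
    show_notifications "$NAGIOS_LOG" || echo "No notification entries in $NAGIOS_LOG"
else
    echo "Cannot read $NAGIOS_LOG"
fi
```

If entries for the contact and notification command appear, Nagios did run the command, and the fault is more likely in the command or the script it calls than in the object definitions. If nothing appears, the definitions (states, periods, contacts) are the better place to look.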
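Answer 1
I figured it out. The configuration was actually fine. The problem was that nagios runs the SMS script as the user "nagios", and that user did not have permission to write to the script's log file in /tmp/. None of the blogs I read about setting up nagios notifications via SMS explained this. I had to work it out myself, and it nearly made my head spin.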
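A permission problem of this kind can be reproduced by checking, as the nagios user, whether the log file is writable. The following is a rough sketch only: the log file name is an assumption (the answer says only that the log lives in /tmp/), and `can_append` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the current user could append to the given log file.
can_append() {
    log_file=$1
    if [ -e "$log_file" ]; then
        # Existing file: write permission is needed on the file itself.
        [ -w "$log_file" ]
    else
        # Missing file: write permission is needed on the parent directory.
        [ -w "$(dirname "$log_file")" ]
    fi
}

# Assumed log file name; replace it with the file the SMS script really writes.
LOG_FILE=${LOG_FILE:-/tmp/send_sms.log}
if can_append "$LOG_FILE"; then
    echo "OK: $(id -un) can write to $LOG_FILE"
else
    echo "FAIL: $(id -un) cannot write to $LOG_FILE"
fi
```

Saved as a script and run with `sudo -u nagios sh <script>`, this reports the result for the same user Nagios uses. A typical cause is a log file created earlier by root or another user during manual testing, which the nagios user then cannot open for writing.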