Nagios notification definitions


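I am trying to monitor a web server in a particular way: I want to search for a specific string on a page over HTTP. The command is defined in command.cfg as follows: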

# 'check_http-mysite' command definition
define command {
        command_name check_http-mysite
        command_line /usr/lib/nagios/plugins/check_http -H mysite.example.com -s "Some text"
}

# 'notify-host-by-sms' command definition
define command {
        command_name  notify-host-by-sms 
        command_line  /usr/bin/send_sms $CONTACTPAGER$ "Nagios - $NOTIFICATIONTYPE$ :Host$HOSTALIAS$ is $HOSTSTATE$ ($OUTPUT$)"
}
# 'notify-service-by-sms' command definition
define command {
        command_name  notify-service-by-sms 
        command_line  /usr/bin/send_sms $CONTACTPAGER$ "Nagios - $NOTIFICATIONTYPE$: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ ($OUTPUT$)"
}
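
As a rough illustration of what the `-s "Some text"` option in check_http-mysite asks for, here is a minimal Python sketch of a string check that follows the usual plugin exit-code convention (0 = OK, 2 = CRITICAL). The function names and the status messages are made up for this example; this is not how check_http itself is implemented.

```python
import urllib.request
from typing import Tuple

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_string(body: str, expected: str) -> Tuple[int, str]:
    """Return (exit_code, status_line) for a page body and an expected string."""
    if expected in body:
        return OK, "HTTP OK - string '%s' found" % expected
    return CRITICAL, "HTTP CRITICAL - string '%s' not found" % expected


def fetch_and_check(url: str, expected: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Fetch `url` and run check_string on it; network errors count as CRITICAL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except OSError as exc:  # URLError and HTTPError are subclasses of OSError
        return CRITICAL, "HTTP CRITICAL - %s" % exc
    return check_string(body, expected)
```

A wrapper script would print the status line and exit with the returned code, which is what Nagios reads to decide the service state.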

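Now, if Nagios does not find "Some text" on the home page of mysite.example.com, it should notify the contact by SMS through the Clickatell HTTP API. I have a script for this, which I have tested and found to work well.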

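Whenever I change the command definition to search for a string that is not on the page and restart Nagios, the web interface shows that the string was not found. What I don't understand is why no notification is sent, even though I have defined the host, host group, contact, contact group, service and so on. What am I missing? My definitions are below. Through the CGI web interface I can see that notifications are defined and enabled, yet I receive neither email nor SMS notifications on hard state changes.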

Host configuration file

define host {
        use                     generic-host
        host_name               HAL
        alias                   IBM-1
        address                 xxx.xxx.xxx.xxx
        check_command           check_http-mysite     
}

hostgroups_nagios2.cfg

# my website
define hostgroup{
       hostgroup_name  my-servers
       alias           All My Servers
       members         HAL 
}

contacts_nagios2.cfg

define contact {
        contact_name                    colin   
        alias                           Colin Y
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r,f,s
        host_notification_options       d,u,r,f,s
        service_notification_commands   notify-service-by-email,notify-service-by-sms
        host_notification_commands      notify-host-by-email,notify-host-by-sms
        email                           [email protected]
        pager                           +254xxxxxxxxx
}

define contactgroup{
        contactgroup_name   site_admin 
        alias               Site Administrator
        members             colin 
}

services_nagios2.cfg

# check for particular string in page via http 
define service {
        hostgroup_name                  my-servers
        service_description             STRING CHECK
        check_command                   check_http-mysite
        use                             generic-service
        notification_interval           0 ; set > 0 if you want to be renotified
        contacts                        colin
        contact_groups                  site_admin
}

Can anyone tell me where I am going wrong?

Here are the definitions of generic-host and generic-service:

generic-service_nagios2.cfg

# generic service template definition
define service{
        name                            generic-service ; The 'name' of this service template
        active_checks_enabled           1       ; Active service checks are enabled
        passive_checks_enabled          1       ; Passive service checks are enabled/accepted
        parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1       ; We should obsess over this service (if necessary)
        check_freshness                 0       ; Default is to NOT check service 'freshness'
        notifications_enabled           1       ; Service notifications are enabled
        event_handler_enabled           1       ; Service event handler is enabled
        flap_detection_enabled          1       ; Flap detection is enabled
        failure_prediction_enabled      1       ; Failure prediction is enabled
        process_perf_data               1       ; Process performance data
        retain_status_information       1       ; Retain status information across program restarts
        retain_nonstatus_information    1       ; Retain non-status information across program restarts
        notification_interval           0       ; Only send notifications on status change by default.
        is_volatile                     0
        check_period                    24x7
        normal_check_interval           5
        retry_check_interval            1
        max_check_attempts              4
        notification_period             24x7
        notification_options            w,u,c,r
        contact_groups                  site_admin
        register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
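
Since notifications are tied to hard state changes, it may help to see when a failing check becomes hard under the `max_check_attempts 4` setting of this template. The following is a deliberately simplified Python model, not Nagios's actual state machine: it assumes a problem turns HARD once `max_check_attempts` consecutive non-OK results have been seen. The function name is invented for the example.

```python
from typing import List, Optional


def first_hard_problem(results: List[str], max_check_attempts: int) -> Optional[int]:
    """Return the index of the check at which a problem first becomes HARD.

    Simplified model: the problem is HARD once `max_check_attempts`
    consecutive non-OK results have been seen. Returns None if that
    never happens in `results`.
    """
    consecutive = 0
    for i, state in enumerate(results):
        # An OK result resets the attempt counter; anything else extends the run.
        consecutive = consecutive + 1 if state != "OK" else 0
        if consecutive >= max_check_attempts:
            return i
    return None
```

Under this model the first failing check and the three retries that follow it (spaced by `retry_check_interval`) are all needed before the state is hard, so no notification would be expected on the very first failure.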

generic-host_nagios2.cfg

define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1       ; Host notifications are enabled
        event_handler_enabled           1       ; Host event handler is enabled
        flap_detection_enabled          1       ; Flap detection is enabled
        failure_prediction_enabled      1       ; Failure prediction is enabled
        process_perf_data               1       ; Process performance data
        retain_status_information       1       ; Retain status information across program restarts
        retain_nonstatus_information    1       ; Retain non-status information across program restarts
        max_check_attempts              10
        notification_interval           0
        notification_period             24x7
        notification_options            d,u,r
        contact_groups                  site_admin
        register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}

Answer 1

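I figured it out. The configuration was actually fine. The problem was that Nagios runs the SMS script as the user "nagios", and that user did not have permission to write to the script's log file in /tmp/. None of the blogs I read about setting up Nagios notifications via SMS explain this. I had to work it out myself, and it nearly drove me crazy.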
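
For anyone hitting the same thing, a quick way to confirm it is to test whether the log file is writable while running as the "nagios" user. Below is a minimal Python sketch; the function name is invented for the example, and any log path you pass in (say `/tmp/send_sms.log`) is hypothetical, since the real file name depends on the SMS script.

```python
import os


def can_append(path: str) -> bool:
    """Return True if the current user could open `path` for appending."""
    if os.path.exists(path):
        # Existing file: the file itself must be writable by this user.
        return os.access(path, os.W_OK)
    # Missing file: creating it needs write and search permission on the directory.
    parent = os.path.dirname(path) or "."
    return os.access(parent, os.W_OK | os.X_OK)
```

Run it under the same account Nagios uses (for example with `sudo -u nagios`) and call `can_append` on the script's log path. One plausible way to end up in this state is a log file first created while testing the script by hand as another user: /tmp itself is world-writable, but the file then belongs to that user and "nagios" cannot append to it.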
