Nagios check_log: reset to OK after notification

Fail2ban is installed on one of my servers so that I can ban every IP that tries to log in.

I'm using the Nagios check_log plugin to pick up the latest line of fail2ban.log. It compares the log against fail2ban.log.old and reports a WARNING state when it detects the word "ban".
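
For context, a check_log command of this kind is typically wired up along the following lines. This is only a sketch: the command name check_fail2ban_log and both file paths are placeholders chosen for illustration, not values copied from my configuration. The -F, -O and -q options are the plugin's log file, old-log copy and search pattern.

```
# Hypothetical command definition; the name and paths are assumptions.
define command{
        command_name    check_fail2ban_log
        # -F: log file to scan, -O: copy kept from the previous run, -q: pattern to look for
        command_line    $USER1$/check_log -F /var/log/fail2ban.log -O /var/log/fail2ban.log.old -q "ban"
        }
```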

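It works, but I can't get it to alert only once and then reset to the OK state. It keeps sending me emails (and showing WARNING) about the same banned IP, which is no longer banned by now.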

I created a service named log-service on the Nagios server for this purpose, with these settings:

passive_checks_enabled 0
is_volatile 1
max_check_attempts 1
retry_check_interval 2

It does check the service every 2 minutes, but it always reports the WARNING state (I have collected 19 emails so far).

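Can anyone tell me what is going wrong? If my approach is not the best one, is there a better way? I don't think it is necessary to post any logs, but I will if you ask.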

Edit: service definition (re-edited)

define service{
        name                            log-service
        active_checks_enabled           1       ; Active service checks are enabled
        passive_checks_enabled          0       ; Passive service checks are disabled
        parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1       ; We should obsess over this service (if necessary)
        check_freshness                 0       ; Default is to NOT check service 'freshness'
        notifications_enabled           1       ; Service notifications are enabled
        event_handler_enabled           1       ; Service event handler is enabled
        flap_detection_enabled          0       ; Flap detection is disabled
        process_perf_data               1       ; Process performance data
        retain_status_information       1       ; Retain status information across program restarts
        retain_nonstatus_information    1       ; Retain non-status information across program restarts
        is_volatile                     1       ; The service is volatile
        check_period                    24x7    ; The service can be checked at any time of the day
        max_check_attempts              1       ; Re-check the service up to 1 time in order to determine its final (hard) state
        check_interval                  2       ; Check the service every 2 minutes under normal conditions
        retry_interval                  2       ; Re-check the service every two minutes until a hard state can be determined
        contact_groups                  admins  ; Notifications get sent out to everyone in the 'admins' group
        notification_options            w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events
        notification_interval           0       ; Do not re-notify about the same service problem (0 disables re-notification)
        notification_period             24x7    ; Notifications can be sent out at any time
        register                        0
        }
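
Because the definition above has register set to 0, it is only a template; an actual service has to inherit from it. A minimal sketch of such a service follows. The host name, service description and check command name are placeholders for illustration, not values from my real configuration.

```
# Hypothetical service built on the log-service template; all values below are assumptions.
define service{
        use                     log-service         ; Inherit every setting from the template
        host_name               server1             ; Placeholder host running Fail2ban
        service_description     Fail2ban bans       ; Placeholder description
        check_command           check_fail2ban_log  ; Placeholder command that runs check_log
        }
```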
