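Nagios is not sending email. The log shows it raising alerts, but no email goes out. Any suggestions for debugging this?
/var/log/maillog shows no log entries. Sending an email manually from the command line does reach my inbox.
Logs and config: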
[1549711074] SERVICE FLAPPING ALERT: host001;Disk Space - /boot;STARTED; Service appears to have started flapping (21.6% change >= 20.0% threshold)
[1549711074] SERVICE FLAPPING ALERT: host001;Disk Space Warn Only - /boot;STARTED; Service appears to have started flapping (21.6% change >= 20.0% threshold)
[1549711194] SERVICE ALERT: host001;Disk Space - /boot;CRITICAL;SOFT;1;/boot: 100%used(98MB/99MB) (>90%) : CRITICAL
[1549711194] SERVICE ALERT: host001;Disk Space Warn Only - /boot;CRITICAL;SOFT;1;/boot: 100%used(98MB/99MB) (>90%) : CRITICAL
[1549711254] SERVICE ALERT: host001;Disk Space - /boot;CRITICAL;SOFT;2;/boot: 100%used(98MB/99MB) (>90%) : CRITICAL
[1549711254] SERVICE ALERT: host001;Disk Space Warn Only - /boot;CRITICAL;SOFT;2;/boot: 100%used(98MB/99MB) (>90%) : CRITICAL
[1549711314] SERVICE ALERT: host001;Disk Space - /boot;CRITICAL;HARD;3;/boot: 100%used(98MB/99MB) (>90%) : CRITICAL
[1549711314] SERVICE ALERT: host001;Disk Space Warn Only - /boot;CRITICAL;HARD;3;/boot: 100%used(98MB/99MB) (>90%) : CRITICAL
[1549711387] Caught SIGTERM, shutting down...
[1549711387] Successfully shutdown... (PID=28697)
[1549711387] Warning: aggregate_status_updates directive ignored. All status file updates are now aggregated.
[1549711387] Nagios 3.0.6 starting... (PID=29699)
[1549711387] Local time is Sat Feb 09 03:23:07 PST 2019
[1549711387] LOG VERSION: 2.0
[1549711387] Finished daemonizing... (New PID=29700)
[1549711387] SERVICE FLAPPING ALERT: host001;Disk Space - /boot;STARTED; Service appears to have started flapping (27.3% change >= 20.0% threshold)
[1549711387] SERVICE FLAPPING ALERT: host001;Disk Space Warn Only - /boot;STARTED; Service appears to have started flapping (27.3% change >= 20.0% threshold)
[1549712107] SERVICE ALERT: mysql-db03;eth0 status;UNKNOWN;SOFT;1;ERROR: No snmp response from 10.49.64.62 (alarm)
[1549712107] SERVICE ALERT: mysql-db03;eth1 status;UNKNOWN;HARD;3;ERROR: No snmp response from 10.49.64.62 (alarm)
[1549712157] SERVICE ALERT: mysql-db03;eth0 status;OK;SOFT;2;OK: Interface eth0 (index 2) is up.
[1549712277] SERVICE ALERT: mysql-db03;eth1 status;CRITICAL;HARD;3;CRITICAL: Interface eth1 (index 3) is administratively down.
[1549712277] SERVICE NOTIFICATION: rt;mysql-db03;eth1 status;CRITICAL;ngmail;CRITICAL: Interface eth1 (index 3) is administratively down.
[1549712292] SERVICE NOTIFICATION: 724_shift11;mysql-db03;eth1 status;CRITICAL;ngmail;CRITICAL: Interface eth1 (index 3) is administratively down.
[1549712307] SERVICE NOTIFICATION: skytel1;mysql-db03;eth1 status;CRITICAL;ngmail;CRITICAL: Interface eth1 (index 3) is administratively down.
[1549712322] SERVICE NOTIFICATION: skytel2;mysql-db03;eth1 status;CRITICAL;ngmail;CRITICAL: Interface eth1 (index 3) is administratively down.
[1549712337] SERVICE NOTIFICATION: skytel4;mysql-db03;eth1 status;CRITICAL;ngmail;CRITICAL: Interface eth1 (index 3) is administratively down.
[1549712352] SERVICE NOTIFICATION: skytel6;mysql-db03;eth1 status;CRITICAL;ngmail;CRITICAL: Interface eth1 (index 3) is administratively down.
[1549712367] SERVICE NOTIFICATION: skytel7;mysql-db03;eth1 status;CRITICAL;ngmail;CRITICAL: Interface eth1 (index 3) is administratively down.
[1549712382] SERVICE NOTIFICATION: pubfolders;mysql-db03;eth1 status;CRITICAL;notify-by-email;CRITICAL: Interface eth1 (index 3) is administratively down.
And the notification configuration:
# 'notify-by-email' command definition
define command{
command_name notify-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nNotification Number : $NOTIFICATIONNUMBER$\nProblem Duration: $SERVICEDURATION$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $SHORTDATETIME$\n\nAdditional Info:\n$SERVICEOUTPUT$\n\n" | /bin/mail -r $ADMINEMAIL$ -s "**$NOTIFICATIONTYPE$ alert #$NOTIFICATIONNUMBER$ - $HOSTALIAS$:$SERVICEDESC$ is $SERVICESTATE$**" $CONTACTEMAIL$
}
# 'host-notify-by-email' command definition
define command{
command_name host-notify-by-email
command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nNotification Number : $NOTIFICATIONNUMBER$\nProblem Duration: $HOSTDURATION$\n\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nDate/Time: $SHORTDATETIME$\n\nAdditional Info: \n$HOSTOUTPUT$\n\n" | /bin/mail -r $ADMINEMAIL$ -s "HOST DOWN alert #$NOTIFICATIONNUMBER$ - $HOSTNAME$ is $HOSTSTATE$" $CONTACTEMAIL$
}
contact.cfg
define contact{
contact_name ops
alias Ops Email
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,u,r
service_notification_commands notify-by-email
host_notification_commands host-notify-by-email
email [email protected]
}
Answer 1
When I stopped the nrpe service, it did send an alert. It looks like contact.cfg is set up to send only Down, Unreachable, and Recovery alerts.
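When checking which contacts are configured to receive which alerts, it can help to see who Nagios actually tried to notify and through which command. The sketch below is only an illustration: the function name summarize_notifications is made up here, and it assumes the log uses the same SERVICE NOTIFICATION line layout as the excerpt in the question (contact;host;service;state;command;output).

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): reads Nagios log lines on stdin
# and prints one line per contact/command pair with the number of
# SERVICE NOTIFICATION entries seen for it.
summarize_notifications() {
    awk -F';' '/ SERVICE NOTIFICATION: / {
        # $1 looks like "[1549712277] SERVICE NOTIFICATION: rt";
        # the contact name is the part after ": ".
        split($1, head, ": ")
        # $5 is the notification command used for this contact.
        count[head[2] " via " $5]++
    }
    END { for (k in count) print count[k], k }' | sort -k2
}

# Example call, assuming the log lives at this path (adjust for your install):
#   summarize_notifications < /var/log/nagios/nagios.log
```

A contact that never shows up in this summary was never handed to a notification command, which points at the contact's notification options or contact group membership rather than at the mail setup.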