I have set up a Munin server with alerts and tested it. I configured the disk usage alerts as follows:
df._dev_mapper_centos_root.warning 90
df._dev_md126p2.warning 90
df._dev_md126p1.warning 90
df._dev_mapper_centos_home.warning 90
I received the alerts above by email (for testing, I kept the values low):
> sha :: Server2 :: Disk usage in percent
> WARNINGs: /boot is 33.48 (outside range [:33]), / is 17.95 (outside range [:17]), /boot/efi is 4.73 (outside range [:4]).
>
> sha :: Server1 :: Disk usage in percent
> OKs: /boot is 33.48, / is 17.95, /boot/efi is 4.73
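The problem I'm facing now is that I'm getting disk latency alerts, but I can't find any value to change for them. Here are a few of the alerts Munin triggered: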
> sha :: Server1 :: Disk latency per device :: Average latency
> for /dev/centos/swap
> WARNINGs: Write IO Wait time is 4.89 (outside range [0:3]).
>
> sha :: Server1 :: Disk latency per device :: Average latency
> for /dev/centos/home
> WARNINGs: Write IO Wait time is 10.64 (outside range [0:3]).
Although "Disk latency per device" graphs exist for this server, when I telnet to the node, the plugin list contains nothing that provides these values:
telnet 192.168.10.252 4949
Trying 192.168.10.252...
Connected to 192.168.10.252.
Escape character is '^]'.
# munin node at localhost.localdomain
list
acpi cpu df df_inode entropy exim_mailqueue forks fw_conntrack
fw_forwarded_local fw_packets hddtemp_smartctl if_enp2s0 if_err_enp2s0
interrupts irqstats load memory netstat open_files open_inodes
postfix_mailqueue proc_pri processes swap threads uptime users vmstat
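I'm not sure I have explained this properly, and I apologize if this is a silly question. I just want to stop these alerts completely or set the thresholds high. I hope to get some help here.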
Answer 1
This is probably the diskstats_latency plugin. Try the following:
diskstats_latency.centos_home.avgwrwait.warning 0:15
diskstats_latency.centos_home.avgrdwait.warning 0:15
diskstats_latency.centos_swap.avgwrwait.warning 0:15
diskstats_latency.centos_swap.avgrdwait.warning 0:15
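Note that this covers both write (avgwrwait) and read (avgrdwait) latency.
I set the range to 0:15, which should suppress nearly all of these warnings, as you wanted.
Don't forget to restart the Munin daemon: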
systemctl restart munin-node
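To see why the 0:15 range silences the alerts quoted in the question, here is a minimal Python sketch of how a min:max threshold can be read. It is not Munin's own code: the helper names `parse_range` and `in_range` are hypothetical, and treating a bare number such as 90 as an upper bound is an assumption based on the `[:33]` range shown in the question's email.

```python
def parse_range(spec):
    """Split a threshold such as '0:15', ':33' or '90' into (low, high)."""
    if ":" in spec:
        low, high = spec.split(":", 1)
    else:
        # Assumption: a bare number is an upper bound, matching the
        # '[:33]' range reported in the question's test email.
        low, high = "", spec
    # An empty side means "no limit" on that side.
    return (float(low) if low else None, float(high) if high else None)


def in_range(value, spec):
    """Return True when value lies inside the range, i.e. no warning."""
    low, high = parse_range(spec)
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


# Write IO wait values taken from the alert emails in the question.
for value in (4.89, 10.64):
    print(value, "0:3 ->", in_range(value, "0:3"),
          "| 0:15 ->", in_range(value, "0:15"))
```

Both reported values fall outside 0:3 but inside 0:15, so with the wider range neither would raise a warning. A latency above 15 would still trigger one.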