How can I get my HP server to email me when a drive fails?

Question 1

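It depends somewhat on the operating system you run on the server, but in general you can get alerts from both the HP ProLiant server and the Smart Array RAID controller.

The full list of drivers and supported software for DL380 G5 systems is listed here.

SNMP and a monitoring solution are the best approach... but you can augment that with a few HP tools. HP provides HP Systems Insight Manager, which is available for download and also ships with the server. It is ideal for a collection of servers. If you are looking for one-off alerts without building a management or monitoring infrastructure, you can simply install the HP Management Agents (a.k.a. the ProLiant Support Pack).

For standalone Linux systems, I have the agents send traps by email. I usually go with the defaults or a customized package, then edit /opt/hp/hp-snmp-agents/cma.conf and change the trapemail line to point to the recipient address: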

########################################################################
# trapemail is used for configuring email command(s) which will be
# executed whenever a SNMP trap is generated.
# Multiple trapemail lines are allowed.
# Note: any command that reads standard input can be used. For example:
#             trapemail /usr/bin/logger
#       will log trap messages into system log (/var/log/messages).
########################################################################
trapemail /bin/mail -s 'HP Insight Management Agents Trap Alarm' [email protected]
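Since the comment block says any command that reads standard input can be used, the trap text can also be passed through a small filter before it is mailed. The following is only a sketch under that assumption: `format_trap` is a hypothetical helper name, not something shipped with the HP agents, and it simply prepends the host name and a timestamp to whatever arrives on standard input.

```shell
#!/bin/sh
# Hypothetical helper (not part of the HP agents): reads the trap text from
# standard input, prints a short header, then passes the text through as-is.
format_trap() {
    # uname -n prints the node (host) name
    printf 'Host: %s\n' "$(uname -n)"
    printf 'Received: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')"
    printf '%s\n' '----'
    # copy standard input to standard output unchanged
    cat
}
```

A wrapper script built around this function could end with a line such as `format_trap | /bin/mail -s 'HP trap' root`, and the trapemail line would then point at that wrapper instead of calling /bin/mail directly. The wrapper path and the subject line are placeholders to adapt.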

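If you are running Linux and do not want to install the full HP management suite, you can script around the cciss_vol_status utility to query controller/disk status. See also: Installing HP agents on OpenFiler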

Question 2

Take a look at HP Insight Manager

https://www.hpe.com/us/en/product-catalog/detail/pip.489496.html#

I believe it should work with your server.

Question 3

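I use the lightweight program that @ewwite mentioned in his answer: cciss_vol_status

If you follow the bundled INSTALL instructions, it will be placed at /usr/local/bin/cciss_vol_status.

Here is a wrapper script I use to grep the output of cciss_vol_status and send an email if any array has a status of FAILED.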

#!/bin/bash
#
# Check status of RAID volumes on HP Smart Array controllers.  Send an email
# alert if any volumes have a FAILED status.
#
status=$(/usr/local/bin/cciss_vol_status /dev/sd*)

# email lock file; its modification time records when the last alert was sent
lockfile=/tmp/raid.check.hp.smartarray.lock
# minimum time between emails (minutes)
_notification_freq=59
_host=$(hostname)
# To: email
_toemail=root

if echo "${status}" | grep -q FAILED
then
    # send only if no notification has gone out in the last X minutes:
    # either the lock file does not exist yet, or it is older than the limit
    if [ ! -f "${lockfile}" ] || [ -n "$(find "${lockfile}" -mmin +"${_notification_freq}")" ]
    then
        echo "${status}" | /bin/mail -s "System Alert! RAID failure on ${_host}" "${_toemail}"

        # update lock file mod time
        /bin/touch "${lockfile}"
    fi
fi

Call the wrapper script from cron. I run the check every two minutes:

*/2 * * * * /usr/local/bin/raid.check.hp.smartarray.sh
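Because cron runs the check every two minutes, the lock file is what keeps a failed array from producing an email on every run. That rate limit can be pulled out into a small function and exercised on its own, without a failed array. This is only a sketch: `should_notify` is a hypothetical name, and it assumes a missing lock file means no alert has been sent yet. It relies on `find -mmin`, the same option the wrapper script uses.

```shell
#!/bin/sh
# should_notify LOCKFILE MINUTES   (hypothetical helper name)
# Succeeds (exit status 0) when LOCKFILE does not exist, or when its
# modification time is more than MINUTES minutes in the past.
should_notify() {
    lock=$1
    minutes=$2
    # no lock file yet: nothing has been sent, so notify
    [ ! -f "$lock" ] && return 0
    # find prints the path only if the file is older than the limit
    [ -n "$(find "$lock" -mmin +"$minutes")" ]
}
```

In the wrapper it would be used as `if should_notify "$lockfile" 59; then ... fi`, with the lock file touched after each email that is sent.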

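We do use HP Systems Insight Manager to check that our HP machines are up and running, but that's about it. I found the Linux agents to be overkill for us since we have other monitoring solutions, so the script above serves its specific purpose well.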

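Update

Just a troubleshooting tip in case you run into this. The script came in very handy this morning, when I received an email about an array failure:

Cache dirty limit reached

The device went read-only and was no longer visible in /proc/partitions. I rebooted the server and saw these messages at boot:

Logical drive(s) disabled due to possible data loss. Select "F1" to continue with logical drive(s) disabled. Select "F2" to accept data loss and re-enable logical drive(s)

I chose F2, and the RAID came up fine and was mounted at boot.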
