Script that sends an email alert only when a process's state changes


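The script below checks whether the MSTRSvr process is running. The problem I'm facing is that if I schedule a crontab entry to run this script every hour, it sends an email alert every hour saying "MSTRSvr is running", which I don't want. I want the script to send an alert only when the server stops or starts.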

#!/bin/ksh
hos=$(hostname)

curr_Dt=$(date +"%Y-%m-%d %H:%M:%S")

var=$(ps -ef | grep -i '[/]MSTRSvr')

if [ -z "$var" ]
then

    echo "ALERT TIME : $curr_Dt" >>wa.txt
    echo "SERVER NAME : $hos" >>wa.txt
    echo "\n \n" >>wa.txt
    echo " MSTRSvr is not running on $hos Please check for possible impact " >>wa.txt
    echo "\n \n" >>wa.txt

    mail -s "MSTRSvr process ALERT" [email protected] <wa.txt

else

    echo "MSTRSvr is running" >>mi.txt

    mail -s "MSTRSvr process ALERT" [email protected] <mi.txt

fi

rm wa.txt 2>ni.txt
rm mi.txt 2>ni.txt

Answer 1

#-----------------------------------------------------------------------
#!/bin/ksh

hos=$(hostname)
curr_Dt=$(date +"%Y-%m-%d %H:%M:%S")

# Get the process ID of MSTRSvr. The [/] bracket pattern already keeps grep from matching itself.
ProcessPID=$(ps -ef | grep -i '[/]MSTRSvr' | awk '{print $2}')

if [[ -z ${ProcessPID} ]]; then
    # There is no PID, Not running!
    echo "ALERT TIME : $curr_Dt" >>wa.txt
    echo "SERVER NAME : $hos" >>wa.txt
    echo "\n \n" >>wa.txt
    echo " MSTRSvr is not running on $hos Please check for possible impact " >>wa.txt
    echo "\n \n" >>wa.txt
    mail -s "MSTRSvr process ALERT" [email protected] <wa.txt
else
    # The process is running; check whether the last recorded PID is still alive.
    # You can also compare /tmp/MSTRSvr.pid with ${ProcessPID}.
    kill -0 `cat /tmp/MSTRSvr.pid` > /dev/null 2>&1
    if [[ $? -ne 0 ]]; then
       # The recorded PID is no longer alive, so the process was restarted.
       echo "MSTRSvr was restarted." >>mi.txt
       # Update the tempfile with current running PID.
       echo ${ProcessPID}>/tmp/MSTRSvr.pid
       mail -s "MSTRSvr process ALERT" [email protected] <mi.txt
    fi
fi

rm wa.txt 2>ni.txt
rm mi.txt 2>ni.txt
#---------------------------------------------------------------------

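Before running this script for the first time, create the /tmp/MSTRSvr.pid file and put "999999999" (an arbitrary number) in it. The check under "else" will then fail, and you will receive an email saying "MSTRSvr was restarted." Ignore that first email and carry on.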

So at every interval the script looks up the current PID and checks it against the last known PID.
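
The comment in the script mentions comparing /tmp/MSTRSvr.pid with ${ProcessPID} instead of probing the old PID with kill -0. A minimal sketch of that variant follows, assuming MSTRSvr runs as a single process. The pid_changed function and the PID_FILE variable are names made up for illustration and are not part of the answer above. A direct comparison avoids two cases where kill -0 can mislead: the old PID having been reused by an unrelated process, and the monitoring user lacking permission to signal the process.

```shell
# Hypothetical location of the recorded PID; override by exporting PID_FILE.
PID_FILE="${PID_FILE:-/tmp/MSTRSvr.pid}"

# pid_changed CURRENT_PID
# Returns 0 (success) only when CURRENT_PID differs from the recorded PID,
# and records the new PID in that case. Returns 1 when nothing changed.
pid_changed() {
    current_pid=$1
    recorded_pid=""
    if [ -r "$PID_FILE" ]; then
        recorded_pid=$(cat "$PID_FILE")
    fi
    if [ "$current_pid" = "$recorded_pid" ]; then
        return 1
    fi
    echo "$current_pid" > "$PID_FILE"
    return 0
}

# Example use inside the "else" branch (left commented out here):
# if pid_changed "$ProcessPID"; then
#     echo "MSTRSvr was restarted." >>mi.txt
# fi
```

With this variant a missing PID file simply counts as a change, so the file would not need to be seeded before the first run. The syntax is plain POSIX shell, which should also run under ksh.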

Answer 2

Add a test against the server's last known state:

#!/bin/ksh
hos=$(hostname)

curr_Dt=$(date +"%Y-%m-%d %H:%M:%S")

var=$(ps -ef | grep -i '[/]MSTRSvr')

if [ -z "$var" ]
then
    echo "ALERT TIME : $curr_Dt" >>wa.txt
    echo "SERVER NAME : $hos" >>wa.txt
    echo "\n \n" >>wa.txt
    echo " MSTRSvr is not running on $hos Please check for possible impact " >>wa.txt
    echo "\n \n" >>wa.txt

    echo "stopped" > "filewithlaststate.txt"

    mail -s "MSTRSvr process ALERT" [email protected] <wa.txt

else

    if [ "$(cat "filewithlaststate.txt")" != "running" ]
    then 
         echo "MSTRSvr is running" >>mi.txt

         echo "running" > "filewithlaststate.txt"

         mail -s "MSTRSvr process ALERT" [email protected] <mi.txt
    fi

fi

rm wa.txt 2>ni.txt
rm mi.txt 2>ni.txt
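
Note that the script above only suppresses the repeated "running" mail; while the process is down it still sends the "not running" alert on every run. One way to alert on transitions in both directions is to funnel both cases through a single state check. The following is a rough sketch under that assumption, not a tested drop-in replacement: current_state, state_changed, STATE_FILE and recipient are hypothetical names introduced for illustration.

```shell
# Hypothetical location of the last recorded state; override by exporting STATE_FILE.
STATE_FILE="${STATE_FILE:-/tmp/MSTRSvr.state}"

# Print "running" if a matching process exists, otherwise "stopped".
current_state() {
    if ps -ef | grep -i '[/]MSTRSvr' >/dev/null; then
        echo running
    else
        echo stopped
    fi
}

# state_changed NEW_STATE
# Returns 0 (success) only when NEW_STATE differs from the recorded state,
# and records the new state in that case. Returns 1 when nothing changed.
state_changed() {
    new_state=$1
    last_state=""
    if [ -r "$STATE_FILE" ]; then
        last_state=$(cat "$STATE_FILE")
    fi
    if [ "$new_state" = "$last_state" ]; then
        return 1
    fi
    echo "$new_state" > "$STATE_FILE"
    return 0
}

# Example use in the cron script (left commented out; "$recipient" is a placeholder):
# state=$(current_state)
# if state_changed "$state"; then
#     echo "MSTRSvr is now $state on $(hostname)" |
#         mail -s "MSTRSvr process ALERT" "$recipient"
# fi
```

Because the state file is only rewritten when the state differs, an hourly cron run produces one mail when the server stops and one when it comes back. The very first run has no state file, so it counts as a change and sends one initial mail.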
