How do you automate performance monitoring of Unix servers and send a report? And how do you achieve the same with a single script when you have many servers of mixed flavors such as Linux, SunOS and AIX?

Here I'm going to share a shell script whose job is to email a status report for all running servers (Linux and Solaris). The script ping-tests each server and gathers the remaining details one by one. It reports CPU utilization (1-, 5- and 15-minute load, CPU usr, CPU sys and the total number of running processes), memory utilization (total, used and free RAM; total, used and free swap) and the disk utilization of each server you need to watch. One great capability of this script is that disk utilization monitoring can be limited to the locations/paths you are interested in.
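
For readers who want to see how the load figures mentioned above can be pulled out of an `uptime` line, here is a minimal sketch. The function names `load_averages` and `exceeds` are made up for this illustration and are not part of the monitoring script; the sketch also assumes a locale that writes decimals with a dot.

```shell
#!/bin/bash
# load_averages: read one `uptime` line on stdin, print "ONE FIVE FIFTEEN".
# Handles both "load average:" and "load averages:" spellings.
load_averages() {
    sed -e 's/.*load averages*: *//' -e 's/,/ /g' | awk '{ print $1, $2, $3 }'
}

# exceeds LIMIT VALUE: succeeds when VALUE >= LIMIT.
# awk does the floating-point comparison, so bc is not required.
exceeds() {
    awk -v limit="$1" -v value="$2" 'BEGIN { exit !(value >= limit) }'
}

# Usage on the local machine:
#   read -r one five fifteen < <(uptime | load_averages)
#   exceeds 15.0 "$five" && echo "5-minute load is high: $five"
```

Using awk for the comparison avoids a string comparison, where "9.9" would wrongly sort above "15.0".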

The emailed report will be in the format below:

[Image: report email]



How to run the script?

We only need one exception file and one shell script file in the same execution directory. If you are only interested in monitoring specific paths or directories, put the paths that you do not want monitored into the exception file. To collect report data from all servers, passwordless SSH key login must be set up on every server.
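
Before the first run it helps to confirm that key-based login really works on every host. The sketch below is one way to check it; `check_key_login` and the `SSH_BIN` variable are hypothetical names for this example, and it assumes an OpenSSH client on the monitoring host.

```shell
#!/bin/bash
# SSH_BIN is an assumption for this sketch; override it to use another ssh client.
SSH_BIN="${SSH_BIN:-ssh}"

# check_key_login USER HOST: report whether non-interactive SSH login works.
check_key_login() {
    local user="$1" host="$2"
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true >/dev/null 2>&1; then
        echo "$host: key login OK"
    else
        echo "$host: key login FAILED"
        return 1
    fi
}

# One-time setup on the monitoring host (run by hand):
#   ssh-keygen -t rsa -b 4096
#   ssh-copy-id u5355680@abc.middlewaretimes.com
# Then verify every host from the script's list:
#   for h in $Q_HOST; do check_key_login "$USR" "$h"; done
```

A host that still asks for a password would otherwise make the monitoring script hang at the first remote command.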

The exception file (exception.txt) lists the paths you are not interested in monitoring, in this format:

/etc/svc/volatile
/system/object
/etc/mnttab
/system/contract
/proc
/unicenter
/var/run
/tmp
/dev/fd
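
The script compares each mount point against the whole line of the exception file, so a path only counts as excluded on an exact match. A compact sketch of the same idea, using a hypothetical helper named `is_excepted` and a throwaway demo file:

```shell
#!/bin/bash
# is_excepted FILE MOUNT: succeeds when MOUNT appears as a whole line in FILE.
is_excepted() {
    local file="$1" mount="$2"
    # Drop comment lines, then look for an exact, fixed-string, whole-line match.
    grep -v '^#' "$file" | grep -Fxq -- "$mount"
}

# Demo with a temporary exceptions file (contents are made up for illustration).
demo_file="$(mktemp)"
printf '%s\n' '# paths to skip' '/tmp' '/var/run' > "$demo_file"
for m in /tmp /var /var/run; do
    if is_excepted "$demo_file" "$m"; then
        echo "$m: skipped"
    else
        echo "$m: reported"
    fi
done
rm -f "$demo_file"
```

Note that /var is still reported even though /var/run is listed, because the match is on the full line and not on a prefix.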

monitoring_script.sh

This Unix server monitoring shell script is for your reference only; it can be modified and optimized with other available commands. It has been tested in different environments and works there, but it may need some modification to run in your environment.
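
As one example of such a modification, the disk usage columns can be read from the fixed POSIX layout of `df -kP`, which keeps every filesystem on a single line. This is only a sketch: `parse_df` is a hypothetical name, it assumes the remote `df` accepts the `-P` flag, and it does not handle mount points that contain spaces.

```shell
#!/bin/bash
# parse_df: read `df -kP` output on stdin, print "PERCENT MOUNT" per filesystem.
parse_df() {
    # NR > 1 skips the header line.
    # In the POSIX layout $5 is the capacity (e.g. 42%) and $6 the mount point.
    awk 'NR > 1 { sub(/%$/, "", $5); print $5, $6 }'
}

# Local usage:
#   df -kP | parse_df
# Remote usage with the script's variables:
#   ssh "$USR@$host" df -kP | parse_df > "$WORKFILE"
```

The output has the same "value mount" shape that the script's FS_Value function reads from its work file.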


#!/bin/bash

initVar() {
#List Of hosts to be monitored
export  Q_HOST="abc.middlewaretimes.com xyz.middlewaretimes.com mole.middlewaretimes.com mule.middlewaretimes.com"

# SSH USER
export USR="u5355680"

# -- Show warning if server load is above the limit
export LOAD_WARN=15.0    ##### CPU load more than
export MEM_WARN=1    ###### Free memory less than 1 GB
export SWAP_WARN=90   ######### Swap usage more than 90%

#export NETVAL=30    ########check after 30sec if a machine is down
export LOGDEST="/home/u5355680/sysRun.log"   ###Log Dest
export MFILE=/home/u5355680/info.html
export MYNETINFO="System Info for Mobistar Servers"

####MAIL####
export TOLIST="u5355680@mail.middlewaretimes.be"
export FROM="u5355680@mail.middlewaretimes.be"
export SUB="Status of middlewaretimes Servers"

# font colours
export GREEN='<font color="#008000">'
export RED='<font color="#ff0000">'
export NOC='</font>'
export LSTART='<ul><li>'
export LEND='</li></ul>'

# Local path to ssh and other bins
export SSH="/usr/bin/ssh"
export PING="/bin/ping"
export NOW="$(date)"


echo "started lookup @ $NOW"       > "$LOGDEST"
echo "server lookup list  $Q_HOST" >> "$LOGDEST"

#################### FS INIT #############
WORKFILE="df.work" # Holds filesystem data
>$WORKFILE              # Initialize to empty
THISHOST=`hostname` # Hostname of this machine
FSMAX="0"              # Max. FS percentage value
NEW_MAX="100"
EXCEPTIONS="exception.txt" # Paths to leave out of the disk report
####################Init_done#############
export NOW="$(date)"
}
writeHead(){
echo '<HTML><HEAD><TITLE>Network Status</TITLE></HEAD>
<BODY alink="#0066ff" bgcolor="#ffffff" link="#0000ff" text="#000000" vlink="#0033ff">'  > $MFILE
echo '<CENTER><H1>'  >> $MFILE
echo "$MYNETINFO</H1>"  >> $MFILE
echo "Generated on $NOW"  >> $MFILE
echo '</CENTER>'   >> $MFILE

}
writeTable(){
echo '<TABLE WIDTH=100% BORDER=1 BORDERCOLOR="#000000" CELLPADDING=4 CELLSPACING=4 FRAME=HSIDES RULES=NONE>'  >> $MFILE
echo '<TR VALIGN=TOP>'  >> $MFILE
echo "<th>HostName</th>" >> $MFILE
echo "<th>Ping &amp; Uptime</th>" >> $MFILE
#echo "<th>Uptime</th>" >> $MFILE
echo "<th>CPU &amp; Total Processes</th>" >> $MFILE
echo "<th>FileSystem</th>" >> $MFILE
#echo "<th>Total Processes</th>" >> $MFILE
echo "<th>Ram + Swap</th>" >> $MFILE
echo "</tr>" >> $MFILE

}
check_exceptions (){

# set -x # Uncomment to debug this function

# Define a data file

DATA_EXCEPTIONS="dfdata.out"

# Ignore any line that begins with a pound sign, #

grep -v "^#" "$EXCEPTIONS" > "$DATA_EXCEPTIONS"

while read -r FSNAME # Read one excepted path per line
do
        if [[ "$FSNAME" = "$FSMOUNT" ]] # Is this mount point in the exceptions list?
        then
                    return 0 # Found in the file - return 0
        fi

done < "$DATA_EXCEPTIONS" # Feed the loop from the data file

return 1 # Not found in File
}
load_LINUX_FS_data(){

#df -k | tail -n +2 | egrep -v '/cdrom|dev|copper|VNAS|home' | awk '{print $4, $5}' | sed '/^\s*$/d' > $WORKFILE
echo "df -k | tail -n +2 | egrep -v '/cdrom|dev|copper|VNAS|home' " |$_CMD "sudo su -" |awk '{print $4, $5}' | sed '/^\s*$/d' > $WORKFILE

}
load_Solaris_FS_data(){

#df -k | tail +2 | egrep -v '/dev/fd|/etc/mnttab|/proc|/cdrom'  | awk '{print $1, $5, $6}' > $WORKFILE
#$_CMD "df -k | tail +2 | egrep -v '/dev/fd|/etc/mnttab|/proc|/cdrom'" | cut -d':' -f2 |awk '{print $5, $6}' > $WORKFILE
#echo "df -k | tail +2 | egrep -v '/dev/fd|/etc/mnttab|/proc|/cdrom' " |$_CMD "sudo su -" | cut -d':' -f2|awk '{print $5, $6}'  > $WORKFILE
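# No trailing "|" in the pattern: an empty alternative can match every line, and -v would then drop them all
$_CMD "df -k | tail +2 | egrep -v '/dev/fd|/etc/mnttab|/proc|/cdrom|denied'" | cut -d':' -f2 |awk '{print $5, $6}' > $WORKFILE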

}
FS_Value(){
OUTFILE="df.outfile" # Output display file
>$OUTFILE # Initialize to empty
while read FSVALUE FSMOUNT
do
     # Strip out the % sign if it exists
     FSVALUE=$(echo $FSVALUE | sed s/\%//g) # Remove the % sign
     if [[ -s $EXCEPTIONS ]] # Do we have a non-empty file?
     then # Found it!

        # Look for the current $FSMOUNT value in the file
        # using the check_exceptions function defined above.

        check_exceptions
        if [ $? != 0 ] # Not in the exceptions file, so report it
             then
                  echo "<li>$FSMOUNT is <b>${FSVALUE}%</b></li>" >> $OUTFILE
        fi
     else # No exceptions file use the script default
             echo "<li>$FSMOUNT is <b>${FSVALUE}%</b></li>" >> $OUTFILE
     fi
done < $WORKFILE # Feed the while loop from the bottom...
}
reportLinux(){
_CMD="$SSH $USR@$host"
rhostname="$($_CMD hostname)"
ruptime="$($_CMD uptime)"
if echo "$ruptime" | grep -qE "min|day"; then
x=$(echo $ruptime | awk '{ print $3 $4}')
else
x=$(echo $ruptime | sed s/,//g| awk '{ print $3 " (hh:mm)"}')
fi
ruptime="$x"
rload="$($_CMD top -n 1 -b  -u $USR | head -10 | grep Cpu |  sed 's/  / /g'  |cut -d " " -f2 | cut -d "%" -f1 )"
rusage="$($_CMD top -n 1 -b  -u $USR | head -10 | grep Cpu |  sed 's/  / /g'  |cut -d " " -f5 | cut -d "%" -f1 )"
y="$(echo "$rload >= $LOAD_WARN" | bc)" > /dev/null 2>&1
[ "$y" == "1" ] && rload="$RED $rload (High) $NOC" || rload="$GREEN $rload (Ok) $NOC"
rclock="$($_CMD date +"%r")"
rtotalprocess="$($_CMD ps axue | grep -vE "^USER|grep|ps" | wc -l)"
rusedram="$($_CMD free -gto | grep Mem: | awk '{ print $3 }')"
rfreeram="$($_CMD free -gto | grep Mem: | awk '{ print $4 }')"
y="$(echo "$rfreeram <=$MEM_WARN" | bc)" > /dev/null 2>&1
[ "$y" == "1" ] && rfreeram="$RED $rfreeram (Low) $NOC" || rfreeram="$GREEN $rfreeram (Ok) $NOC"
rtotalram="$($_CMD free -gto | grep Mem: | awk '{ print $2 }')"
rusedswap="$($_CMD free -t | awk '/Swap:/ {printf("%.2f\n", $3/$2*100)}')"
y="$(echo "$rusedswap >=$SWAP_WARN" | bc)"
[ "$y" == "1" ] && rusedswap="$RED $rusedswap % (High) $NOC" || rusedswap="$GREEN $rusedswap % (Ok) $NOC"
# didl="$($_CMD /usr/bin/sar | grep Av | tail -1  | sed 's/    / /g' | sed 's/   / /g' |  sed 's/  / /g' | cut -d " " -f8)"
# davg="$($_CMD /usr/bin/sar | grep Av | tail -1  | sed 's/    / /g' | sed 's/   / /g' |  sed 's/  / /g' | cut -d " " -f3)"
load_LINUX_FS_data
FS_Value
# sed '/^$/d' $OUTFILE > $OUTFILE
$PING -c1 $host>/dev/null
if [ "$?" != "0" ] ; then
rping="$RED Failed $NOC"
echo "<tr>"   >> $MFILE
echo "<td><b>$RED $host $NOC</b></td>" >> $MFILE
echo "<td width="18%">" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>Ping status: $rping</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
# echo "<td>FAILED TO CONNECT</td>" >> $MFILE
echo "<td width="20%">" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>FAILED TO CONNECT</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
echo "<td width="40%">" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>FAILED TO CONNECT</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
echo "<td width=22%>" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>FAILED TO CONNECT</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
echo "</tr>" >> $MFILE

else
rping="$GREEN Ok $NOC"
echo "<tr>"   >> $MFILE
echo "<td><b>$GREEN $host $NOC</b></td>" >> $MFILE
echo "<td width="18%">" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>Ping status: $rping</li>" >> $MFILE
echo "        <li>Uptime : $ruptime</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
# echo "<td>$ruptime</td>" >> $MFILE
echo "<td width="18%">" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>Cpu Load : $rload  </li>" >> $MFILE
echo "        <li>Cpu Idle : $rusage%</li>" >> $MFILE
echo "        <li>Tot. Process : $rtotalprocess</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
echo "<td width="40%">" >> $MFILE
echo "      <ul>" >> $MFILE
cat $OUTFILE >>$MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
# echo "<td width=3%>$rtotalprocess</td>" >> $MFILE
echo "<td width="24"%>" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>Used RAM : $rusedram GB </li>" >> $MFILE
echo "        <li>Used SWAP  : $rusedswap</li>" >> $MFILE
echo "        <li>Free RAM : $rfreeram GB</li>"  >> $MFILE
echo "        <li>Total RAM : $rtotalram GB</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
echo "</tr>" >> $MFILE
fi
}
reportUnix(){
_CMD="$SSH $USR@$host"
rhostname="$($_CMD hostname)"
ruptime="$($_CMD uptime)"
if echo "$ruptime" | grep -qE "min|day"; then
x=$(echo $ruptime | awk '{ print $3 $4}')
else
x=$(echo $ruptime | sed s/,//g| awk '{ print $3 " (hh:mm)"}')
fi
ruptime="$x"
rload="$($_CMD top -n 1 -b  -u $USR | head -10 | grep "load averages" | awk '{print $3}' | sed s'/.$//' )"
rusage="$($_CMD top -n 1 -b  -u $USR | head -10 | grep "CPU states" | awk '{print $3}' | sed s'/.$//' )"
y="$(echo "$rload >= $LOAD_WARN" | bc)" > /dev/null 2>&1
[ "$y" == "1" ] && rload="$RED $rload (High) $NOC" || rload="$GREEN $rload (Ok) $NOC"
rclock="$($_CMD date +"%r")"
rtotalprocess="$($_CMD top -n 1 -b  -u $USR | head -10 | grep processes | awk '{print $1}' )"
# Available memory
rtotalram="$($_CMD top -n 1 -b  -u $USR | head -10 | grep Memory | awk '{print $2}' | sed s'/.$//')"
# rtotalram="$($_CMD `echo "scale=2; $memory/1024" | bc -l` )"
# rtotalram="$(`echo "scale=2; $memory/1024" | bc -l` )"

# Free memory
pagesize="$($_CMD pagesize )"
kb_pagesize=`echo "scale=5; $pagesize/1024" | bc -l`
sar_freemem="$($_CMD sar -r 1 1 | tail -1 | awk 'BEGIN {FS=" "} {print $2}' )"
rfreeram=`echo "scale=2; $kb_pagesize*$sar_freemem/1024/1024" | bc -l`

# Used Memory
rusedram=`echo "scale=2; $rtotalram-$rfreeram" | bc -l`

# c="$($_CMD free -gto | grep Mem: | awk '{ print $4 }')"
y="$(echo "$rfreeram <=$MEM_WARN" | bc)" > /dev/null 2>&1
[ "$y" == "1" ] && rfreeram="$RED $rfreeram (Low) $NOC" || rfreeram="$GREEN $rfreeram (Ok) $NOC"
# rusedswap="$($_CMD free -t | awk '/Swap:/ {printf("%.2f\n", $3/$2*100)}')"
swapinfo="$($_CMD "swap -s" )"
usedswap=`echo $swapinfo | awk '{print $9}' | sed 's/k//'`
rusedswap=`echo "scale=5; $usedswap/1024/1024"| bc -l`
swapavail=`echo $swapinfo | awk '{print $11}' | sed 's/k//'`
rswapavail=`echo "scale=5; $swapavail/1024/1024"| bc -l`
swaptotal=`echo $rusedswap+$rswapavail | bc`
swapusedpercent=`echo "scale=5; ($rusedswap/$swaptotal)*100" | bc -l`
echo "Swap utilization is at $swapusedpercent %"

# Compare the used-swap percentage (not the GB figure) with the threshold
y="$(echo "$swapusedpercent >= $SWAP_WARN" | bc)"
[ "$y" == "1" ] && rusedswap="$RED $rusedswap (High) $NOC" || rusedswap="$GREEN $rusedswap (Ok) $NOC"
# didl="$($_CMD /usr/bin/sar | grep Av | tail -1  | sed 's/    / /g' | sed 's/   / /g' |  sed 's/  / /g' | cut -d " " -f8)"
# davg="$($_CMD /usr/bin/sar | grep Av | tail -1  | sed 's/    / /g' | sed 's/   / /g' |  sed 's/  / /g' | cut -d " " -f3)"
load_Solaris_FS_data
FS_Value
$PING -c1 $host>/dev/null
if [ "$?" != "0" ] ; then
rping="$RED Failed $NOC"
echo "<tr>"   >> $MFILE
echo "<td><b>$RED $host $NOC</b></td>" >> $MFILE
echo "<td width="18%">" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>Ping status: $rping</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
# echo "<td>FAILED TO CONNECT</td>" >> $MFILE
echo "<td width="20%">" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>FAILED TO CONNECT</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
echo "<td width="40%">" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>FAILED TO CONNECT</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
echo "<td width="22%">" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>FAILED TO CONNECT</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
echo "</tr>" >> $MFILE

else
rping="$GREEN Ok $NOC"
echo "<tr>"   >> $MFILE
echo "<td><b>$GREEN $host $NOC</b></td>" >> $MFILE
echo "<td width="18%">" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>Ping status: $rping</li>" >> $MFILE
echo "        <li>Uptime : $ruptime</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
# echo "<td>$ruptime</td>" >> $MFILE
echo "<td width="18%">" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>Cpu Load : $rload  </li>" >> $MFILE
echo "        <li>Cpu Idle : $rusage%</li>" >> $MFILE
echo "        <li>Tot. Process : $rtotalprocess</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
echo "<td width="40%">" >> $MFILE
echo "      <ul>" >> $MFILE
cat $OUTFILE >>$MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
# echo "<td width=3%>$rtotalprocess</td>" >> $MFILE
echo "<td width="24%">" >> $MFILE
echo "      <ul>" >> $MFILE
echo "        <li>Used RAM : $rusedram GB </li>" >> $MFILE
echo "        <li>Free RAM : $rfreeram GB</li>"  >> $MFILE
echo "        <li>Total RAM : $rtotalram GB</li>" >> $MFILE
echo "        <li>Used SWAP  : $rusedswap GB</li>" >> $MFILE
echo "        <li>Free SWAP  : $rswapavail GB</li>" >> $MFILE
echo "        <li>Total SWAP  : $swaptotal GB</li>" >> $MFILE
echo "      </ul>" >> $MFILE
echo "    </td>" >> $MFILE
echo "</tr>" >> $MFILE
fi
}
main() {
writeHead
writeTable
for host in $Q_HOST
do
_CMD="$SSH $USR@$host"
runame="$($_CMD uname)"
# This case statement calls the report function that matches the remote Unix flavor
case $runame in
AIX|OpenBSD|Linux)
reportLinux
;;
SunOS)
reportUnix
;;
 *)
echo -e "\nERROR: Unsupported Operating System - $runame"
echo -e "\n\t. . .EXITING. . .\n"
exit 1
esac

done
echo "</table>" >> $MFILE
echo "</BODY></HTML>" >> $MFILE

# Flag the run as failed if any host reported a failed ping
if grep -q "Ping status:.*Failed" "$MFILE"
then
    SFLAG=1
else
    SFLAG=0
fi

if [[ "$SFLAG" -eq "0" ]] ; then

echo "All servers are running Fine on $(date)" >> "$LOGDEST"

fi
}
initVar
main

mail -s "$(echo -e "$SUB\nContent-Type: text/html")" $TOLIST < $MFILE
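
The mail line above passes the Content-Type header through the subject, which depends on how the local `mail` command treats it. If that does not work in your environment, one alternative is to build the headers yourself and hand the message to sendmail. This is a sketch only: `build_html_mail` is a hypothetical helper, and the sendmail path in the usage comment is an assumption about the monitoring host.

```shell
#!/bin/bash
# build_html_mail FROM TO SUBJECT FILE: print a complete message with HTML headers.
build_html_mail() {
    local from="$1" to="$2" subject="$3" body_file="$4"
    printf 'From: %s\n' "$from"
    printf 'To: %s\n' "$to"
    printf 'Subject: %s\n' "$subject"
    printf 'MIME-Version: 1.0\n'
    printf 'Content-Type: text/html; charset=UTF-8\n'
    printf '\n'              # blank line separates headers from body
    cat "$body_file"
}

# Usage with the script's variables (sendmail -t reads recipients from the headers):
#   build_html_mail "$FROM" "$TOLIST" "$SUB" "$MFILE" | /usr/sbin/sendmail -t
```

If TOLIST holds several addresses, separate them with commas so that they form a valid To header.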


So try this in your environment and let me know if you have any questions.
