Sunday, May 6, 2012

bash: simple script to send alerts if service is down

Here's a simple shell script I cooked up to alert me if my REST service went down for any reason. This was never meant for a production system (where we have nagios and other robust alerting mechanisms) but for an integration/sandbox environment. An external team (subcontractor) was integrating with our REST service deployed on this sandbox and we couldn't afford to have downtime, or the team's productivity would suffer.

It worked like a charm, took just 30 mins to write, test and deploy. Since the external REST service was deployed on a public IP [otherwise how could the external team reach it - we couldn't allow them to VPN in], this script could potentially run from any machine anywhere.

Saving this script here since it was so simple and useful.


#!/bin/bash
#
# sandbox_sanity_check.sh: a simple utility to alert relevant folks if our integration (i.e. sandbox-ext) Trybe Service goes down
# sandbox-ext is in the /etc/hosts file
targetbox=sandbox-ext
# who should we send alert mails to?
recipients="ambar@xyz.com, rakesh@xyz.com, sandeep@xyz.com"
while [ 1 ] 
do
    logfile=sandbox.`date '+%A'`.log

    curl --silent "http://${targetbox}/trybe/v1/config/TEST_067e6162-3b6f.2L_20k_60k?uid=%7B%22aid%22:%22889835751ebf3e49%22%7D&api_key=shared_key&api_nonce=8nk9pbnhacfvgc&api_ts=1333042920376&channel_id=1&api_sig=aba00fdd0058e00111b286c6356f2a70" | grep "trialConfig"

    if [ $? -eq 0 ] 
    then
        echo "[INFO] [`date '+%d_%m_%Y_%H-%M-%S'`] getConfig succeeded " | tee -a ${logfile} ; echo;  
    else
        echo "[ERROR] [`date '+%d_%m_%Y_%H-%M-%S'`] getConfig FAILED... here is the curl output:" | tee -a ${logfile} ; echo; 
        curl -v "http://${targetbox}/trybe/v1/config/TEST_067e6162-3b6f.2L_20k_60k?uid=%7B%22aid%22:%22889835751ebf3e49%22%7D&api_key=shared_key&api_nonce=8nk9pbnhacfvgc&api_ts=1333042920376&channel_id=1&api_sig=aba00fdd0058e00111b286c6356f2a70" >> ${logfile} 2>&1
        echo | tee -a ${logfile};
# send alert email to $recipients using good ol' mutt
        mutt -s "[sandbox checker]: getConfig FAILED" ${recipients}  < /var/local/sandbox.mail.message 
    fi  

    sleep 30
done



No comments: