This script monitors GoldenGate lag and reports it whenever it reaches the pre-defined LAG threshold set inside the script.
It's highly recommended to deploy this script on all replication servers (source and destination) in order to detect lag on all processes (Extract, Pump, and Replicat).
This script is not designed to monitor the replicated data inside the tables; it relies entirely on the native GoldenGate GGSCI console.
This script should be executed/scheduled by the GoldenGate installation owner OS user.
How it works:
First, download the script:
https://www.dropbox.com/s/l4dqzicviuaawt6/goldengate_lag_mon.sh?dl=0
Second, adjust the following parameters:
MAIL_LIST="youremail@yourcompany.com"
Replace "youremail@yourcompany.com" with your own e-mail address.
# ###########################################
# Mandatory Parameters To Be Set By The User:
# ###########################################
ORACLE_HOME= # ORACLE_HOME path of the database where GoldenGate is running against.
GG_HOME= # GoldenGate Installation Home path. e.g. GG_HOME=/goldengate/gghome
Please note that ORACLE_HOME and GG_HOME must be set by YOU. If you leave them unset, the script will automatically try to guess the right values, but the guess will be inaccurate most of the time.
# ################
# Script Settings:
# ################
# LAG THRESHOLD in minutes: [If reached an e-mail alert will be sent. Default 10 minutes]
LAG_IN_MINUTES=10
Here you define the LAG threshold in minutes (10 minutes by default). If the lag reaches 10 minutes, the script will send you an e-mail.
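To illustrate the threshold check, here is a minimal sketch of comparing a lag value in HH:MM:SS form against LAG_IN_MINUTES. The helper names lag_to_minutes and lag_exceeded and the sample lag values are made up for illustration; the actual script may implement the comparison differently.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): convert a lag
# value in HH:MM:SS format into whole minutes (seconds are ignored).
lag_to_minutes() {
    local hh mm ss
    IFS=: read -r hh mm ss <<< "$1"
    # 10# forces base 10 so values such as "08" are not parsed as octal.
    echo $(( 10#$hh * 60 + 10#$mm ))
}

LAG_IN_MINUTES=10

# Hypothetical helper: succeeds (exit 0) when the lag has reached the threshold.
lag_exceeded() {
    [ "$(lag_to_minutes "$1")" -ge "${LAG_IN_MINUTES}" ]
}

# Try it with a few made-up lag values:
for sample in 00:00:08 00:10:00 01:02:03; do
    if lag_exceeded "${sample}"; then
        echo "${sample} -> ALERT"
    else
        echo "${sample} -> ok"
    fi
done
```

With the default threshold of 10, the first sample stays below the limit and the other two would raise an alert.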
# Excluded Specific PROCESSES NAME:
# e.g. If you want to exclude two replicate processes with names REP_11 and REP_12 from being reported then add them to below parameter as shown:
# EXL_PROC_NAME="DONOTREMOVE|REP_11|REP_12"
EXL_PROC_NAME="DONOTREMOVE"
If you want to exclude specific (Extract, Pump, or Replicat) processes, for example a process you use only for testing the replication, add them to the above parameter as shown in the commented example.
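As a rough illustration of how a pipe-separated exclusion list behaves, the sketch below filters some made-up "info all" style lines with grep -Ev. The sample lines, the process names, and the filter_excluded helper are assumptions for the example only.

```shell
#!/bin/bash
# Made-up sample lines in the style of GGSCI "info all" output.
SAMPLE_OUTPUT='EXTRACT     RUNNING     EXT_01      00:00:02      00:00:05
EXTRACT     RUNNING     PMP_01      00:00:00      00:00:03
REPLICAT    RUNNING     REP_11      00:45:00      00:00:07
REPLICAT    RUNNING     REP_12      00:00:00      00:00:01'

EXL_PROC_NAME="DONOTREMOVE|REP_11|REP_12"

# Hypothetical helper: grep -E treats "|" as alternation,
# and -v drops every line that matches one of the names.
filter_excluded() {
    grep -Ev "${EXL_PROC_NAME}"
}

# Only the EXT_01 and PMP_01 lines remain.
echo "${SAMPLE_OUTPUT}" | filter_excluded
```

Keep in mind that this kind of pattern matches substrings, so a short name such as REP_1 would also exclude REP_11 and REP_12.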
DISCLAIMER: THIS SCRIPT IS DISTRIBUTED IN THE HOPE THAT IT WILL BE USEFUL BUT WITHOUT ANY WARRANTY. IT IS PROVIDED “AS IS”.
GitHub version:
I have two GoldenGate installations on the same server. Does the script work for both installations, i.e. two different GoldenGate homes?
It looks like I missed your comment, my apologies for that.
Actually, I don't have a test environment that matches your scenario, but as an easy approach I recommend scheduling two copies of the script in the crontab, with each copy pointing to a different GoldenGate home.
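One possible way to prepare the two copies is sketched below. This is untested against a real two-home setup; the make_copy helper, the file names, and the GoldenGate home paths are placeholders, and it assumes the script sets GG_HOME on a single line that starts with "GG_HOME=".

```shell
#!/bin/bash
# make_copy SRC DEST GG_HOME_PATH   (hypothetical helper)
# Writes DEST as a copy of SRC with the GG_HOME= line set to GG_HOME_PATH.
make_copy() {
    sed "s|^GG_HOME=.*|GG_HOME=$3|" "$1" > "$2" && chmod +x "$2"
}

# Demo on a stand-in file so the sketch runs without the real script.
WORK_DIR=$(mktemp -d)
printf 'ORACLE_HOME=\nGG_HOME=\n' > "${WORK_DIR}/goldengate_lag_mon.sh"

make_copy "${WORK_DIR}/goldengate_lag_mon.sh" "${WORK_DIR}/lag_mon_gg1.sh" /goldengate/gghome1
make_copy "${WORK_DIR}/goldengate_lag_mon.sh" "${WORK_DIR}/lag_mon_gg2.sh" /goldengate/gghome2

# Show the GG_HOME line of each copy.
grep '^GG_HOME=' "${WORK_DIR}"/lag_mon_gg*.sh

# Example crontab entries (Linux cron syntax), one per copy:
# */5 * * * * /path/to/lag_mon_gg1.sh
# */5 * * * * /path/to/lag_mon_gg2.sh
```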
Assalamu alaikum Abdel, I tried your script but it is not working.
The script runs fine on Linux but not on Solaris. Please advise. On Solaris I am not receiving an email when a Replicat is stopped.
Actually, the script is designed for Linux; I have never tested it on other platforms.
Would you mind posting the error or problem you are getting?
Hi Mohammoud, I am not good at scripting. I tried your script and it works on Linux, but the problem is that I am getting 3 mails at the same time and the exclude parameter is not working. By default it sends details of all the processes. I want to exclude some process names, and the alert should fire only when the lag is greater than 30 minutes. Can you check what I am missing in this script? Suggestion: adding a header for the process and status columns would make the output easier to understand.
=======
Script I modified from your script:
===================================
[oracle] taxqn1pporadb08:cat gglag.sh
#!/bin/bash
set -x
MAIL_LIST="svedachalamsundaram@corelogic.com"
SERVER_NAME=`uname -n`
export SERVER_NAME
# ###########################################
# Mandatory Parameters To Be Set By The User:
# ###########################################
ORACLE_HOME=/apps/oracle/product/11.2.0.4/db_1 # ORACLE_HOME path of the database where GoldenGate is running against.
GG_HOME=/ora_backup/ggate/12.2 # GoldenGate Installation Home path. e.g. GG_HOME=/goldengate/gghome
# ################
# Script Settings:
# ################
# LAG THRESHOLD in minutes: [If reached an e-mail alert will be sent. Default 10 minutes]
LAG_IN_MINUTES=2
# Excluded Specific PROCESSES NAME:
# e.g. If you want to exclude two replicate processes with names REP_11 and REP_12 from being reported then add them to below parameter as shown:
# EXL_PROC_NAME="DONOTREMOVE|REP_11|REP_12"
EXL_PROC_NAME="DONOTREMOVE|EPASAUD|RPASAUD1|RPASAUD2|RPASAUD3|RPASAUD4|RPASAUD5|RPASAUD6|RPASAUD7|RPASAUD8|RPASAUD9"
#EXL_PROC_NAME="DONOTREMOVE|RCLGL|RLASP1|RLASP2"
# ###############
# VARIABLES:
# ###############
LOG_DIRECTORY=/export/home/oracle/dbascripts/dba # Log Location
LAG=$LAG_IN_MINUTES
#LAG=$((LAG_IN_MINUTES * 100))
export LAG
export EXL_PROC_NAME=$EXL_PROC_NAME
export LD_LIBRARY_PATH=${ORACLE_HOME}/lib
echo LD_LIBRARY_PATH is: $LD_LIBRARY_PATH
# ################################################
# Checking the LAG status from Goldengate Console:
# ################################################
for GREP_SERVICE in EXTRACT REPLICAT
do
export GREP_SERVICE
export LOG_DIR=${LOG_DIRECTORY}
export LOG_FILE=${LOG_DIR}/${GREP_SERVICE}_lag_mon.log
# Identify lagging operation name:
case ${GREP_SERVICE} in
"REPLICAT") LAST_COL_OPNAME="RECEIVING"
export LAST_COL_OPNAME
BFR_LAST_COL_OPNAME="APPLYING"
export BFR_LAST_COL_OPNAME
;;
"EXTRACT") LAST_COL_OPNAME="SENDING"
export LAST_COL_OPNAME
BFR_LAST_COL_OPNAME="EXTRACTING"
export BFR_LAST_COL_OPNAME
;;
esac
$GG_HOME/ggsci << EOF |grep "${GREP_SERVICE}" > ${LOG_FILE}
info all
exit
EOF
# ################################
# Email Notification if LAG Found:
# ################################
for i in `cat ${LOG_FILE}|egrep -v ${EXL_PROC_NAME}|awk '{print $NF}'|sed -e 's/://g'`
do
if [ $i -ge ${LAG} ]
then
mail -s "Goldengate LAG detected in ${LAST_COL_OPNAME} TRAIL FILES on Server [${SERVER_NAME}]" ${MAIL_LIST} < ${LOG_FILE}
#echo "Goldengate LAG detected in ${LAST_COL_OPNAME} TRAIL FILES on Server [${SERVER_NAME}]"
fi
done
done
# #############
# END OF SCRIPT
###############
For a lag greater than 30 minutes, set this parameter:
LAG_IN_MINUTES=31
As for excluding processes from being reported, I can see you are doing it right. Keep in mind, though, that excluded processes will not trigger the alert if they are lagging or OFF, but they will still be seen in the e-mail body:
EXL_PROC_NAME="DONOTREMOVE|EPASAUD|RPASAUD1|RPASAUD2|RPASAUD3|RPASAUD4|RPASAUD5|RPASAUD6|RPASAUD7|RPASAUD8|RPASAUD9"
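If you would rather keep the excluded processes out of the e-mail body as well, and have a header line as suggested, something along these lines could work. This is only a sketch: build_report is a hypothetical helper, the demo log lines are made up, and the header labels are the column names that GGSCI prints for "info all".

```shell
#!/bin/bash
# build_report LOG_FILE EXCLUDE_PATTERN   (hypothetical helper)
# Prints a column header, then the log lines that do not match the pattern.
build_report() {
    printf '%-12s%-12s%-12s%-14s%s\n' \
        "Program" "Status" "Group" "Lag at Chkpt" "Time Since Chkpt"
    # "|| true" keeps the function successful when every line is excluded.
    grep -Ev "$2" "$1" || true
}

# Demo with a made-up log file:
DEMO_LOG=$(mktemp)
cat > "${DEMO_LOG}" << 'EOF'
EXTRACT     RUNNING     EXT_01      00:00:02      00:00:05
REPLICAT    RUNNING     REP_11      00:45:00      00:00:07
EOF

build_report "${DEMO_LOG}" "DONOTREMOVE|REP_11"

# In the monitoring script, the report could be piped to mail, for example:
# build_report "${LOG_FILE}" "${EXL_PROC_NAME}" | mail -s "subject" "${MAIL_LIST}"
```

Note also that in the modified script posted above, the mail command sits inside the per-process loop, so every process over the threshold sends its own e-mail. Building the report once and mailing it after the loop would be one way to get a single message.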