Here is an update of the MULTI_PROC.sh script.
This script parallelizes the launch of scripts.
Note that it was designed to run on Solaris only.
Remember to change the paths to the various binaries if you use another OS.
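The script hardcodes Solaris locations such as /bin/ps, /usr/bin/bc, /usr/bin/perl and /usr/dt/bin/dtterm. One possible way to ease porting, sketched here as an assumption and not as part of MULTI_PROC.sh, is a small helper (the name find_bin is hypothetical) that keeps the Solaris path when it exists and otherwise falls back to a PATH lookup:

```shell
#!/bin/sh
# Hypothetical helper, not part of MULTI_PROC.sh.
# Prints the preferred absolute path if it is executable,
# otherwise whatever "command -v" finds for the same name in PATH.
find_bin () {
   preferred=$1
   if [ -x "$preferred" ]
   then
      echo "$preferred"
   else
      command -v "${preferred##*/}"
   fi
}

# Example: resolve ps once, then use $PS_BIN instead of the literal /bin/ps.
PS_BIN=$(find_bin /bin/ps) || echo "WARNING: ps not found" >&2
```

The same call could be repeated for bc, perl and ping; the options each binary accepts may still differ from one OS to another, so the commands themselves should be checked as well.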
Here is the parallelization script:
#!/bin/ksh
# vim:tabstop=3:syntax=sh:foldmethod=marker
#
# ---------
## Program : if_error
## Version : 1.6
## Purpose : parallelizes the launch of a script through a dtterm window
##           and analyzes the return code.
## Author  : Cedrick GAILLARD
## Email   : mobidyc @ gmail.com
##----------------------------------------------------------------------------
# 0.1 00-mar-09 - Creation
# 0.9 17-apr-09 - added a limit on the number of processes run in parallel.
#               - added the ability to parallelize a program given on the
#                 command line.
#               - a component of the script given on the command line can be
#                 turned into a variable.
#               - the ARGUMENT_TO_WORK variable can be left unexported.
#               - added the usage and some examples.
#               - added error checks.
#               - now keeps a log of everything that happens for the
#                 argument concerned in case of error.
#               - added an error summary at the end of the program.
# 1.0 24-apr-09 - removed the nagios downtime, moved to the stop script.
# 1.1 11-jun-09 - changed an error message
# 1.2 18-jun-09 - if the limit is not specified, it defaults to 10
#               - added a parser that looks for the RVAL entries in the
#                 script to run and displays them before launching it.
#               - changed the directory where the .RVAL files are stored
# 1.3 09-sep-09 - added the full path to the ps binary.
# 1.5 13-sep-10 - now uses the current shell in the background to do the
#                 work, no need to export a DISPLAY anymore.
#                 added the --dtterm argument to use dtterm.
# 1.6 09-mar-11 - changed the display check when dtterm is not used.
#
#-----------------------------------------------------------------------------

Usage () { #{{{
cat <<EOF
Usage:
${0} (-h | --help) (--dtterm) -e PROGRAM (-l LIMIT) (-n) [-f FILE ] [Argument1 ( ... ArgumentX ... )]

   -h            Display usage
   --help        Display usage
   -e PROGRAM    Parallelize the program PROGRAM
   -l LIMIT      Limit the parallelization to this value
   -n            Do not export the \$ARGUMENT_TO_WORK variable to PROGRAM
   --dtterm      Use dtterm windows instead of the terminal
   -x VAR_NAME   Substitute each occurrence of VAR_NAME with the current argument
   -f FILE       FILE contains an argument list

PROGRAM must be a single program, optionally followed by args.
in that case, you must quote it.

Utility:
${0} will run multiple forks of PROGRAM.
there will be as many forks as arguments, each fork can use the
\$ARGUMENT_TO_WORK variable, which is the current argument.

The first use of this script was to parallelize a script which takes a
server name as argument, establishes a connection to this server and runs
some commands, with a complete log and a summary in case of errors.
but this script can serve many other needs.

Example:
6 forks and a maximum of 4 simultaneous runs of xterm, each xterm will
have the current argument in its title
# ${0} -l 4 -e "xterm -title FOOBAR -e sleep 2" -x FOOBAR A B C D E F

Example:
# SERVER_LIST="machineA machineB machineB.dev.par.emea.cib"
# ${0} -l 2 -e "ssh -o StrictHostKeyChecking=no FOOBAR uname -n" -x FOOBAR \$SERVER_LIST
machineB : in error
machineA : finished
machineB.dev.par.emea.cib : done!

Error summary :
machineB : Error, Rval = 255 - see the file /var/tmp/machineB.255.RVAL

# cat machineB.RVAL
ssh: machineB: node name or service name not known
255

Example:
using --dtterm lets you interact with each telnet session
SERVER_LIST="machineA machineB machineB.dev.par.emea.cib"
# ${0} --dtterm -e "telnet TOTOTO-lc.adm" -x TOTOTO -l 4 \$SERVER_LIST
EOF
} #}}}

#SIZE="-geometry 100x10"
export SIZE=

[ -z "$1" ] && { #{{{
   echo "${0} (-h | --help) (--dtterm) -e PROGRAM (-l LIMIT) (-n) [-f FILE ] [Argument1 ( ... ArgumentX ... )]"
   echo
   echo "to mass run a script on multiple servers :"
   echo "$0 -l 10 -e '${0%/*}/stop_server.sh' -f /tmp/server_liste.txt"
   echo "$0 -l 10 -e '${0%/*}/stop_server.sh' Server1 [ Server2 ... ServerX ]"
   echo "$0 -l 10 -e '${0%/*}/restart_server.sh' Serveur1 [ Serveur2 ... ServeurX ]"
   echo "$0 -e '${0%/*}/restart_server.sh' Serveur1 [ Serveur2 ... ServeurX ]"
   echo "$0 -e '${0%/*}/mass_downtime.sh' Serveur1 [ Serveur2 ... ServeurX ]"
   echo
   echo "To display detailed help :"
   echo "$0 -h"
} #}}}

# Print the error summary, clean up the temporary directory and exit.
stop_prg () { #{{{
   [ -n "$SERV_ERR" ] && {
      echo
      echo "Errors detected :"
      for SERV_IN_ERR in $SERV_ERR
      do
         # each entry has the form name:return_code
         Serv="${SERV_IN_ERR%:*}"
         Rval="${SERV_IN_ERR#*:}"
         if [ "$Rval" = "BAD" ]
         then
            echo "$Serv : Unknown error"
         elif [ -n "$(echo $Rval |sed 's/[0-9]//g')" ]
         then
            echo "$Serv : look at the ${TMPRVAL}/${Serv}.RVAL file"
         else
            echo "$Serv : ERROR: Rval=$Rval - look at the ${TMPRVAL}/${Serv}.${Rval}.RVAL file"
         fi
      done
   }

   if [ "$DEBUG" = "true" ]
   then
      echo "Logs: $TMPDIR"
   else
      rm -rf "$TMPDIR"
   fi
   exit
} #}}}

trap stop_prg 1 2 3 11 13 15

DEBUG=

echo -- " --> Analysing arguments"
while [ "$#" -ne "0" ] #{{{
do
   case $1 in
      -e)
         COMMANDS=$2
         shift 2
         ;;
      -h|--help)
         Usage
         exit
         ;;
      -l)
         LIMIT=$2
         shift 2
         ;;
      -n)
         VAR_EXPORT=false
         shift
         ;;
      --dtterm)
         V_WINTERM=true
         shift
         ;;
      -x)
         VAR_TO_USE=$2
         shift 2
         ;;
      -f)
         ARGUMENTS="$ARGUMENTS $(< "$2")"
         shift 2
         ;;
      *)
         ARGUMENTS="$ARGUMENTS $1"
         shift
         ;;
   esac
done #}}}
echo -- " --> End of analysing arguments"

RACINE="$(cd ${0%/*} && pwd)" || {
   echo "ERROR: can't get my root directory"
   exit 1
}

[ -z "$DISPLAY" -a "$V_WINTERM" = "true" ] && {
   echo "ERROR : your DISPLAY is not set"
   exit 1
}

DTTERM="/usr/dt/bin/dtterm"
V_DATE="$(date +%Y%m%d-%Hh%Ms%S)"

cd $RACINE

# default simultaneous limit
[ -z "$LIMIT" -o "$LIMIT" -lt "1" ] && LIMIT="10"

TMPDIR="/var/tmp/${0##*/}.${RANDOM}.${$}"
[ -d "${TMPDIR}" ] && {
   echo "ERROR : The temporary folder already exists : ${TMPDIR}"
   exit 1
}
mkdir -p "${TMPDIR}"

TMPRVAL="/var/tmp"
[ -d "${TMPRVAL}" ] || mkdir -p "${TMPRVAL}"

# if PROGRAM has defined RVAL in header, we display them
egrep "^# RVAL - [0-9]*[0-9] - " "$(echo ${COMMANDS} |awk '{print $1}')" 2>/dev/null
echo

[ -z "$COMMANDS" ] && {
   echo "no command found to execute!"
   exit 1
}

# Start one job in the background for the argument given in $1.
run_thread () { #{{{
   ARGUMENT_TO_WORK=$1
   [ "$VAR_EXPORT" != "false" ] && export ARGUMENT_TO_WORK

   if [ -n "$VAR_TO_USE" ]
   then
      COMMAND="$(echo "$COMMANDS" |sed "s@$VAR_TO_USE@$ARGUMENT_TO_WORK@g")"
   else
      COMMAND="$COMMANDS"
   fi

   # wrapper script: runs the command, then prints its return code on the last line
   echo "#! /bin/ksh" > "${TMPDIR}/${ARGUMENT_TO_WORK}.sh"
   echo "$COMMAND" >> "${TMPDIR}/${ARGUMENT_TO_WORK}.sh"
   echo "echo \$?" >> "${TMPDIR}/${ARGUMENT_TO_WORK}.sh"
   echo "exit" >> "${TMPDIR}/${ARGUMENT_TO_WORK}.sh"
   chmod +x "${TMPDIR}/${ARGUMENT_TO_WORK}.sh"

   cp /dev/null ${TMPRVAL}/${ARGUMENT_TO_WORK}.RVAL

   if [ "$V_WINTERM" = "true" ]
   then
      ${DTTERM} -kshMode $SIZE \
         -title ${ARGUMENT_TO_WORK} \
         -l -lf ${TMPRVAL}/${ARGUMENT_TO_WORK}.RVAL -geometry 100x10 \
         -e ${TMPDIR}/${ARGUMENT_TO_WORK}.sh &
   else
      (exec ${TMPDIR}/${ARGUMENT_TO_WORK}.sh ${ARGUMENT_TO_WORK}) > ${TMPRVAL}/${ARGUMENT_TO_WORK}.RVAL 2>&1 &
   fi

   PIDS="$PIDS $ARGUMENT_TO_WORK:$!"
   CURRENT="$(( $CURRENT + 1 ))"
} #}}}

# Collect finished jobs and start a pending argument in place of each one.
check_thread () { #{{{
   SHIFT_TO_DO=0
   for PID in ${PIDS}
   do
      # If the PID is still running, go next
      [ -n "$(/bin/ps -e -o pid |grep "^[ ]*${PID#*:}$")" ] && continue

      # from here, the process is finished, we can restart another
      [ -n "$1" ] && {
         run_thread $1
         shift 1
         SHIFT_TO_DO=$(( $SHIFT_TO_DO + 1 ))
      }

      INDEX="$(( $INDEX + 1 ))"
      CURRENT="$(( $CURRENT - 1 ))"

      # the return code is the last line of the log
      RVAL="$(tail -1 ${TMPRVAL}/${PID%:*}.RVAL)"
      # strip a possible trailing carriage return from the dtterm log
      [ "$V_WINTERM" = "true" ] && RVAL="$(echo "$RVAL" |awk '{print $1}' |tr -d '\015')"
      if [ "$RVAL" = "0" ]
      then
         echo "${PID%:*} (${INDEX}/${NBR_ARGS} - $(echo "${INDEX} ${OPER}" |/usr/bin/bc)%) : finished"
         rm -f ${TMPRVAL}/${PID%:*}.RVAL
      else
         mv "${TMPRVAL}/${PID%:*}.RVAL" "${TMPRVAL}/${PID%:*}.${RVAL}.RVAL"
         echo "${PID%:*} (${INDEX}/${NBR_ARGS} - $(echo "${INDEX} ${OPER}" |/usr/bin/bc)%) : ERROR: Rval=$RVAL - take a look at the ${TMPRVAL}/${PID%:*}.${RVAL}.RVAL file"
         SERV_ERR="$SERV_ERR ${PID%:*}:${RVAL}"
      fi
      PIDS="$(echo "${PIDS}" |sed -e "s@\<${PID}\>@@" -e 's@  @ @g')"
   done
} #}}}

set -- $ARGUMENTS
NBR_ARGS=${#}

# operation applied to INDEX to get the percentage of completion
if [ "$NBR_ARGS" -eq "100" ]
then
   OPER=""
elif [ "$NBR_ARGS" -lt "100" ]
then
   OPER="* $(echo "scale=2 ; 100 / ${NBR_ARGS}" |bc -l)"
else
   OPER="/ $(echo "scale=2 ; ${NBR_ARGS} / 100" |bc -l)"
fi

INDEX=0
CURRENT=0
LOOP=0

echo -- " --> We have ${NBR_ARGS} Args to run"
echo -- " --> LIMIT defined to ${LIMIT}"

while [ : ]
do
   # start a new job only while fewer than LIMIT jobs are running
   [ "${#}" -ne "0" -a "$CURRENT" -lt "$LIMIT" ] && {
      run_thread $1
      shift 1
   }

   NBR_PIDS=$(echo "${PIDS}" |wc -w)
   [ "${NBR_PIDS}" -eq "0" ] && break

   # sleep 0.1 second
   [ "${#}" -eq "0" ] && /usr/bin/perl -e 'select(undef,undef,undef,.1)'
   [ "$CURRENT" -eq "${LIMIT}" ] && /usr/bin/perl -e 'select(undef,undef,undef,.1)'

   [ "${#}" -gt "${LIMIT}" -a "${NBR_PIDS}" -lt "$(echo "${LIMIT} / 2" |/usr/bin/bc)" ] && {
      while [ "${#}" -ne "0" -a "$CURRENT" -lt "$LIMIT" ]
      do
         run_thread $1
         shift 1
      done
   }

   check_thread ${*}
   [ "$SHIFT_TO_DO" -gt "0" ] && shift $SHIFT_TO_DO
done

stop_prg
And here is an example of a script to pass as an argument to MULTI_PROC.sh.
It shuts down a Sun, Linux or HP-UX server:
#!/bin/ksh
# this program is meant to be launched by the multi_remote.sh program
#
# return codes:
# RVAL - 150 - I didn't received the server name
# RVAL - 200 - you must modify the nagios downtime end time
# RVAL - 170 - the first disk sync is failed
# RVAL - 172 - the second disk sync is failed
# RVAL - 212 - ping HS

[ -z "$ARGUMENT_TO_WORK" ] && {
   echo "ERROR : the ARGUMENT_TO_WORK variable is not set"
   exit 150
}

# Solaris ping syntax: the trailing 2 is the timeout in seconds
/usr/sbin/ping $ARGUMENT_TO_WORK 2 >/dev/null 2>&1
if [ "$?" = "0" ]
then
   /usr/bin/ssh -q -o PreferredAuthentications=publickey -o StrictHostKeyChecking=no -n ${ARGUMENT_TO_WORK} "PATH=\$PATH:/bin:/sbin:/usr/bin:/usr/sbin
      export PATH
      case \`uname -s\` in
         SunOS)
            eeprom \"auto-boot?\"=true || exit 4
            sync || exit 170
            sync || exit 172
            init 5 2>&1 || exit 3
            ;;
         Linux)
            shutdown -h now 2>&1 || exit 3
            ;;
         HP-UX)
            cd /
            setboot -b on
            /usr/sbin/shutdown -h -y now |while read line
            do
               echo \"\${line}\"
               [ -n \"\$(echo \"x\${line}\" |grep \"xTransition to run-level 0 is complete\")\" ] && exit 0
            done
            ;;
         *)
            echo \"\`uname -s\` not supported\"
            exit 1
            ;;
      esac
   "
   REMOTE_VAL=$?
   exit $REMOTE_VAL
else
   echo "ERROR: can't ping $ARGUMENT_TO_WORK"
   exit 212
fi
How it works:
# ./MULTI_PROC.sh -e './stop_server.sh' -f /tmp/server_liste.txt
The ‘stop_server.sh’ script is launched once for each server listed in the /tmp/server_liste.txt file.
On each launch, stop_server.sh receives the current argument (the server name) in the ARGUMENT_TO_WORK variable.
stop_server.sh is therefore simply a script that connects over ssh to the server $ARGUMENT_TO_WORK and shuts it down.
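The launch mechanism can be pictured with the following simplified sketch. It is not how MULTI_PROC.sh is implemented (the real script polls ps and refills free slots one by one); here jobs are started in batches and each batch is awaited as a whole. The names run_limited, WORKER and demo_job are hypothetical.

```shell
#!/bin/sh
# Simplified sketch, not the MULTI_PROC.sh implementation: jobs are started in
# batches of at most $limit and the whole batch is awaited before the next one.
# WORKER is a hypothetical variable holding the command (or function) to run.
run_limited () {
   limit=$1
   shift
   running=0
   for arg in "$@"
   do
      # Each job gets its own copy of ARGUMENT_TO_WORK in its environment.
      ( ARGUMENT_TO_WORK=$arg; export ARGUMENT_TO_WORK; $WORKER ) &
      running=$(( running + 1 ))
      if [ "$running" -ge "$limit" ]
      then
         wait
         running=0
      fi
   done
   wait
}

# Demo worker standing in for stop_server.sh.
demo_job () { echo "processing $ARGUMENT_TO_WORK"; }
WORKER=demo_job
run_limited 2 machineA machineB machineC
```

The order of the output lines is not guaranteed, since the jobs of a batch run concurrently.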
It is advisable to handle errors in the parallelized script by exiting with a predefined error number, because MULTI_PROC.sh handles return codes and keeps the full log of the script when the return code is not zero.
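As an illustration, a worker could be laid out roughly as follows. This is a minimal sketch under assumptions: the code 160 and the REMOTE_CMD placeholder are hypothetical, only the "# RVAL - code - text" header format and the code 150 come from the scripts above.

```shell
#!/bin/sh
# Return codes, in the "# RVAL - code - text" form that MULTI_PROC.sh greps for:
# RVAL - 150 - the ARGUMENT_TO_WORK variable is not set
# RVAL - 160 - the command run against the server failed (hypothetical code)

# REMOTE_CMD is a hypothetical placeholder for the real work (for example an
# ssh call); it defaults to "true" so that the sketch runs anywhere.
worker () {
   if [ -z "$ARGUMENT_TO_WORK" ]
   then
      echo "ERROR : the ARGUMENT_TO_WORK variable is not set"
      return 150
   fi
   ${REMOTE_CMD:-true} "$ARGUMENT_TO_WORK" || return 160
   return 0
}

# Run in subshells here so that both outcomes can be shown; a real worker
# script would simply end with: worker; exit $?
rc=0; ( ARGUMENT_TO_WORK=machineA; worker ) || rc=$?
echo "machineA -> $rc"
rc=0; ( ARGUMENT_TO_WORK=; worker ) || rc=$?
echo "(unset) -> $rc"
```

Giving each failure its own code makes the final summary readable without opening every .RVAL file.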
Throughout the run, each time a thread finishes, it is reported on screen along with the progress of the overall processing.
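The script derives the percentage by precomputing a per-item factor with bc at scale=2 (100 / 3 gives 33.33), which is why a 3-server run ends at 99.99%. An alternative sketch, not the author's method, computes the value directly from the counters with integer arithmetic; pct is a hypothetical helper name.

```shell
#!/bin/sh
# Alternative sketch (not the author's bc-based method): percentage with two
# decimals using integer arithmetic only. pct is a hypothetical helper name.
pct () {
   done_count=$1
   total=$2
   # refuse a zero or negative total instead of dividing by zero
   [ "$total" -gt 0 ] || return 1
   hundredths=$(( done_count * 10000 / total ))
   printf '%d.%02d\n' $(( hundredths / 100 )) $(( hundredths % 100 ))
}

pct 1 3   # prints 33.33
pct 3 3   # prints 100.00
```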
Each time an error is detected, it is displayed and logged.
Once all jobs have finished, a summary of every error encountered is displayed.
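The summary relies on a list of name:code entries that is split with shell parameter expansion. A minimal sketch of that idea, assuming numeric return codes and the /var/tmp log directory used by the script (summarize is a hypothetical function name):

```shell
#!/bin/sh
# Sketch of the summary step: each argument is a "name:code" pair, split with
# ${var%:*} (everything before the last colon) and ${var#*:} (everything
# after the first colon). summarize is a hypothetical function name.
summarize () {
   for entry in "$@"
   do
      name=${entry%:*}
      code=${entry#*:}
      echo "$name : ERROR: Rval=$code - look at the /var/tmp/$name.$code.RVAL file"
   done
}

summarize machine_1:212 machine_3:212
```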
Sample output:
# ./MULTI_PROCESS.dev.sh -e './test_ssh_cnx.sh' -f /tmp/list_of_3_servers.txt
-- --> Analysing arguments
-- --> End of analysing arguments
# RVAL - 150 - I didn't received the server name
# RVAL - 212 - ping HS

-- --> We have 3 Args to run
-- --> LIMIT defined to 10
machine_1 (1/3 - 33.33%) : ERROR: Rval=212 - take a look at the /var/tmp/machine_1.212.RVAL file
machine_3 (2/3 - 66.66%) : ERROR: Rval=212 - take a look at the /var/tmp/machine_3.212.RVAL file
machine_2 (3/3 - 99.99%) : finished

Errors detected :
machine_1 : ERROR: Rval=212 - look at the /var/tmp/machine_1.212.RVAL file
machine_3 : ERROR: Rval=212 - look at the /var/tmp/machine_3.212.RVAL file
#