I have a bash script that runs via cron every night. It's a software package repository mirroring process, so it has the potential to run for a long time. I don't want two of them running at the same time, and I want to know if one is running for more than 24 hours.
So the first thing the script does is to check for the presence of a lock file in a particular location. If the lock file exists, that means that the previous instance didn't complete, so it spits out an error message (which will land in my email) and quits. If the script does not find the lock file (which is the normal condition), it creates the lock file (via touch) on the very next line. The very last thing the script does is to remove the lock file.
if [ -f /var/lock/subsys/mirror ] ; then echo 'previous instance running--quitting' exit fi touch /var/lock/subsys/mirror # long-running mirroring process... rm -f /var/lock/subsys/mirror
We have scheduled downtime from time to time to do operating system upgrades and other maintenance. I typically create the lock file manually at the beginning of maintenance, because I don't want my local repository changing while I'm running updates. And then I manually remove the lock file at the end of maintenance.
We did maintenance last night, and the server that runs this mirroring process was having some kind of problem. We ended up rebooting the server (which, in this case, resolved the problem we were experiencing).
Turns out that part of the reboot process on Red Hat (and CentOS) is to clear out the /var/lock/subsys directory, which is where I'd been putting the lock file for the mirroring process. Oops.
Fortunately, the reboot happened after the cron job, and so the cron job didn't run unexpectedly. But that was just a matter of lucky timing.
So today I'm moving the lock file to /var/run. And I may as well make it work like the other files in that directory. So instead of
touch /var/lock/subsys/mirror
I'll do
echo $$ > /var/run/mirror.pid
No comments:
Post a Comment