
Namely, I need to rotate backup folders. I have many machines on a schedule to rsync to a single backup machine. I schedule the backups to begin in the late evening, and schedule a folder rotation (making folder day0 become day1, starting with the oldest) in the late morning the next day, giving the backups roughly 10 hours to complete. Rather than relying on that assumption, I'd like to be assured that all backups have completed before allowing the rotation to begin, because if I rotate the folders while a backup is in progress, my backup is inaccurate.

This would be trivial for a single machine, but for several, I'm hoping someone knows the best method... I can think of a few but would prefer not to have to 'experiment' on running systems:

Have each backup create a completion stamp, and run the rotation script every few minutes after a certain time, checking that it hasn't already run successfully and that all stamps are current (newer than the last rotation's stamp)?

Have each backup mv its previous rsync destination to an in-progress folder, rsync, then mv it back to day0, so the rotation simply skips that backup if it isn't complete?

Just live with potentially inaccurate backups?
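The second option might be sketched like this; the paths are hypothetical, and `cp -a` stands in for the real rsync invocation so the sketch is self-contained:

```shell
#!/bin/sh
# Sketch of the "in-progress" idea: sync into a name the rotation ignores,
# and only rename to day0 on success. SRC and DEST_ROOT are assumptions.
SRC="${SRC:-srcdata}"
DEST_ROOT="${DEST_ROOT:-backups/host1}"
mkdir -p "$SRC" "$DEST_ROOT"
: > "$SRC/example.txt"                      # demo payload

# park the previous day0 under a name the rotation script skips
if [ -d "$DEST_ROOT/day0" ]; then
  mv "$DEST_ROOT/day0" "$DEST_ROOT/in-progress"
else
  mkdir "$DEST_ROOT/in-progress"
fi

# really: rsync -a "$SRC/" "$DEST_ROOT/in-progress/"
if cp -a "$SRC/." "$DEST_ROOT/in-progress/"; then
  mv "$DEST_ROOT/in-progress" "$DEST_ROOT/day0"  # visible as day0 only when complete
fi
```

If the sync dies partway, the half-copied tree stays parked as in-progress and never shows up as day0, so the rotation can safely ignore it.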

  • Possibly better to change the folder naming convention. Rather than day0, day1 ... dayN, where the number N is "days ago", maybe better to have a day_YYYYMMDD format (e.g. dest="day_$(date +%Y%m%d)"). You could possibly do both by using symlinks day0, day1 ... dayN that point to the date-based folder names, and changing the symlinks instead of the actual directories. – JeffG Mar 16 '12 at 18:16
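A minimal sketch of that symlink scheme (the backup root and the seven-day depth here are assumptions):

```shell
#!/bin/sh
# Create today's date-named folder and shift the day0..day6 symlinks so the
# rotation only re-points links instead of renaming directories.
BACKUP_ROOT="${BACKUP_ROOT:-backups}"
today="day_$(date +%Y%m%d)"
mkdir -p "$BACKUP_ROOT/$today"

# dayN takes over day(N-1)'s target, oldest first
for n in 6 5 4 3 2 1; do
  prev=$((n - 1))
  if [ -L "$BACKUP_ROOT/day$prev" ]; then
    ln -sfn "$(readlink "$BACKUP_ROOT/day$prev")" "$BACKUP_ROOT/day$n"
  fi
done
ln -sfn "$today" "$BACKUP_ROOT/day0"       # newest backup is always day0
```

Because re-pointing a symlink is atomic, an in-flight rsync writing into its date-named folder is never disturbed by the rotation itself.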

4 Answers

5

I would probably write a completion file to the central server that includes the date and hostname; you could use this:

#!/bin/bash
# when each backup completes, write a completion file on the central server
# ($HOSTNAME and the date expand on the client, which is what we want):
ssh user@central-server "touch /path/to/completion-files/$HOSTNAME-$(date +%F).complete"

And on the central server:

#!/bin/bash
# on the central server, run this before attempting folder rotation
for h in host1 host2 host3; do   # your list of hosts
  if [[ -e "/path/to/completion-files/$h-$(date +%F).complete" ]]; then
    : # do your thing
  fi
done
adaptr
  • Yep. This is basically the right way. Write the second step as a poller that just starts checking every 5 minutes, after 10 hours, whether all the completes are in place. If you want to fancy it up a bit more, have your primary processes write "$HOST.HOLD" files that they rename to "$HOST.complete" files if all is good. Then your poller can block on the holds and move forward when all holds are released. – Mark Mar 16 '12 at 21:21
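That poller could look something like this; the stamp directory, host list, and rotatefolders.sh name are all assumptions:

```shell
#!/bin/sh
# Poller sketch: proceed only when every host's HOLD file is gone and its
# completion stamp exists. Run it from cron every few minutes.
STAMP_DIR="${STAMP_DIR:-stamps}"
HOSTS="web1 web2 db1"                # assumed host list

all_backups_done() {
  for h in $HOSTS; do
    if [ -e "$STAMP_DIR/$h.HOLD" ] || [ ! -e "$STAMP_DIR/$h.complete" ]; then
      return 1                       # this host is still holding or never finished
    fi
  done
  return 0
}

if all_backups_done; then
  echo "all backups complete - rotating"
  # ~jdoe/bin/rotatefolders.sh       # hypothetical rotation script
else
  echo "still waiting"
fi
```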
1

To attack the more general case, which is "What's the best way to schedule a command to run, assuring a previous set of commands have completed?":

You run the first command and test its success (exit code); on success, you schedule the follow-up command using the system scheduler (typically at). E.g.:

#!/bin/bash
rsync "${opts[@]}" "$source" "$dest"
if [[ $? -eq 0 ]]; then
  at now + 10 hours <<<"~jdoe/bin/rotatefolders.sh"
fi

I assumed here that your folder rotation script is named rotatefolders.sh and lives in bin in jdoe's home folder.

JeffG
0

Since you are using rsync, I assume you run it through an ssh tunnel. If so, you have ssh in place as well.

Instead of a busy-wait loop, it is better to use signalling when the rsync job has finished.

The "signal" could trigger the logrotate on the backup server system – either by starting the logrotate directly, or by starting it indirectly (via sudo or an ssh key).

Nils
  • example code? can you provide an example? – JeffG Mar 17 '12 at 00:38
  • `rsync -e ssh -auHS $SOURCEDIR ${TARGET_USER}@${BACKUP_SERVER}:$TARGETDIR && ssh -i $SPECIAL_ID $BACKUP_SERVER` On the backup server, a connection with the special ssh id could trigger (via **~/.ssh/authorized_keys** command="sudo /usr/sbin/logrotate $CFGFILE") a logrotate of the directory. – Nils Mar 17 '12 at 21:13
  • Ah.. So the "signal" isn't actually a signal -- It's an SSH forced command. – JeffG Mar 18 '12 at 18:10
  • @JeffG - either that or anything else. One could work with real signals as well - with a running post-backup-process that waits for a certain signal and then looks at a specific state-file... – Nils Mar 18 '12 at 20:12
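Spelled out, the forced-command setup from the comments might look like this configuration fragment (the key material, restriction options, and logrotate config path are assumptions):

```
# On the backup server, in ~/.ssh/authorized_keys for the special-purpose key:
command="sudo /usr/sbin/logrotate /etc/logrotate.d/backup-dirs",no-pty,no-port-forwarding ssh-rsa AAAA... backup-trigger
```

Any connection authenticated with that key runs only the forced command, so the client's `ssh -i $SPECIAL_ID $BACKUP_SERVER` after a successful rsync acts as the trigger without granting a shell.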
-1

You can use a PID file:

#!/bin/bash
PID_FILE="/path/to/pid.pid"
if [ -f "$PID_FILE" ]; then
    OLD_PID=$(cat "$PID_FILE")
    # kill -0 tests for a live process without the false matches of ps|grep
    if kill -0 "$OLD_PID" 2>/dev/null; then
        echo "WARNING: program already running"
        exit 0
    else
        echo "PID file exists but program is not running. Overriding PID file"
    fi
fi
echo $$ > "$PID_FILE"
trap 'rm -f "$PID_FILE"; exit' INT TERM EXIT

If both jobs share the same PID file, you can assure that B won't run until A finishes, and vice versa.

ghm1014