
I have a script that runs every hour and, without using a database, I would like to check whether a given run of that script is: (1) currently running; (2) already completed; or (3) not yet run. If it's #3, I run it; otherwise I skip it. What would be a good way to track this outside of a database? For example, I was thinking:

MyApplicationFolder/
  script.py
  proc/
    running/
      $pid_$integrity_field
    completed/
       $integrity_field

In this way, when I run the script I could:

  • check to see if it's currently running (if integrity_field in running/*). If it is, I can grab the process ID (in case I need to send a signal to it).
  • check to see if it's already completed (if integrity_field in completed/*).
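
The two checks above could be sketched in Python roughly as follows. This is only an illustration of the proposed layout: the function name `job_state` and the returned state strings are made up here, and the marker names are assumed to follow the `$pid_$integrity_field` pattern from the tree above.

```python
from pathlib import Path


def job_state(proc_dir, integrity_field):
    """Return (state, pid) for one run, based only on marker files."""
    proc = Path(proc_dir)
    # A marker in completed/ means this run has already finished.
    if (proc / "completed" / integrity_field).exists():
        return "completed", None
    running = proc / "running"
    if running.is_dir():
        for marker in running.iterdir():
            # Marker names are assumed to look like "<pid>_<integrity_field>".
            pid, _, field = marker.name.partition("_")
            if field == integrity_field and pid.isdigit():
                return "running", int(pid)
    return "not_run", None
```

A marker left in `running/` by a process that crashed would still be reported as running, so the returned pid would need to be checked separately before trusting it.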

Is this something like a standard approach to doing this, or is there a better or more standardized way (again, without using a DB to track it)?

1 Answer


(1) currently running; (2) already completed; or (3) not yet run.

If your scheduler is reliable enough (see cron(8) and systemd.timer(5)) and your script is fast enough, I don't think you need to check whether it has already run in the current hour, unless of course the script could run for more than one hour.

Is this something like a standard approach to doing this

You can use lockfile(1) to do that, or py-filelock if you would rather handle the locking inside your Python script than in the shell command that launches it.
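
For illustration, the core idea behind a lock file can also be sketched with only the Python standard library. This is neither lockfile(1) nor the py-filelock API, just a minimal stand-in; the names `try_lock` and `unlock` are hypothetical, and the approach assumes a local filesystem where exclusive creation is atomic.

```python
import os


def try_lock(path):
    """Try to create the lock file atomically; return True on success."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one
        # process can win the race to create it.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        # Record the owner's pid so it can be signalled later if needed.
        f.write(str(os.getpid()))
    return True


def unlock(path):
    """Release the lock by deleting the lock file."""
    os.remove(path)
```

Writing the pid into the lock file covers the "grab the process ID" requirement from the question without encoding it in the file name.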

(3) not yet run

By default, lockfile(1) retries acquiring the lock every 8 seconds until it succeeds, so if your script hasn't run for the current hour it can wait for the lock, run, and then delete the lock file.
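
A rough Python equivalent of that wait, run, delete sequence is sketched below, with a completion marker added so a finished run is not repeated. The function `run_once`, its parameters, and its return values are hypothetical; only the 8-second default sleep mirrors lockfile(1).

```python
import os
import time


def run_once(lock_path, done_path, job, sleep_seconds=8, max_retries=-1):
    """Wait for the lock, then run job() unless done_path already exists."""
    attempts = 0
    while True:
        try:
            # Exclusive creation: fails if another run holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
            os.close(fd)
            break
        except FileExistsError:
            # max_retries < 0 means retry forever.
            if max_retries >= 0 and attempts >= max_retries:
                return "busy"
            attempts += 1
            time.sleep(sleep_seconds)
    try:
        # Check under the lock, so a run that finished while we waited counts.
        if os.path.exists(done_path):
            return "skipped"
        job()
        open(done_path, "w").close()  # mark this run as completed
        return "ran"
    finally:
        os.remove(lock_path)  # always release the lock, even if job() raised
```

Keying both paths on the hour, for example with `time.strftime("%Y%m%d-%H")`, gives one lock file and one completion marker per hourly run. If `job()` raises, no completion marker is written, so a later invocation will try again.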

mforsetti
  • thanks for this, a few questions: (1) it is a database process so could take over an hour (or even occasionally get a deadlock and never finish, in which case it'd need to be manually killed); (2) how to track when an item has finished though? The `lockfile` cron approach seems to be good to run every `*interval*`, but how would we keep track of when a script has completed or not? – samuelbrody1249 May 03 '20 at 19:40
  • (1) Okay, but please note that `lockfile(1)` creates its lock file with 444 permissions, and if the script dies without that lock file being deleted, later `lockfile(1)` calls will wait on it indefinitely. (2) If you do something like `LF=$(date +%Y%m%d-%H); lockfile /tmp/script-$LF.lock; python /path/to/script.py; rm -f /tmp/script-$LF.lock && touch /tmp/script-$LF.done`, then you can simply run `[ -f "/tmp/script-$(date +%Y%m%d-%H).lock" ] && echo "running"` to check whether your process is still running, and `[ -f "/tmp/script-$(date +%Y%m%d-%H).done" ] && echo "done"` to check whether it is done. – mforsetti May 04 '20 at 06:34