I have a directory with some files in it (mostly images, plus a JSON file). I want to process those files, possibly overwriting some of them and possibly creating new files from some other source. An error can occur at any point in this process.

How do I ensure that, if an error occurs while I'm processing one of the files, the directory doesn't end up in an inconsistent state? I only want to change the contents of the directory if everything went well.

Should I create a temporary directory with `tempfile.mkdtemp`, do my update there inside a `try`, swap the existing directory with the temporary one, and then delete the temporary directory in the `finally` if it still exists?

I'm using Python (Django).
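
A minimal sketch of the approach described above, to make the question concrete. The helper name `staged_directory` and the `.staging-` prefix are hypothetical, and the sketch assumes the scratch directory can live next to the target so that both renames stay on one filesystem. The swap is two renames, so there is a brief window in which the target path does not exist; this is not an atomic swap.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def staged_directory(target):
    """Yield a scratch copy of `target`; swap it in only if the block succeeds."""
    target = os.path.abspath(target)
    parent = os.path.dirname(target)
    # Create the scratch area beside the target so os.rename never has to
    # cross a filesystem boundary.
    staging = tempfile.mkdtemp(dir=parent, prefix=".staging-")
    try:
        work = os.path.join(staging, "work")
        shutil.copytree(target, work)
        yield work  # the caller edits files under `work`
        # Only reached if the caller's block raised nothing.
        backup = os.path.join(staging, "old")
        os.rename(target, backup)  # move the old directory aside
        try:
            os.rename(work, target)  # move the new directory into place
        except OSError:
            os.rename(backup, target)  # roll back to the old directory
            raise
    finally:
        # Removes the scratch copy on failure, or the old contents on success.
        shutil.rmtree(staging, ignore_errors=True)
```

Used as `with staged_directory(path) as work: ...`, any exception raised inside the block leaves the original directory untouched.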

Boris Verkhovskiy
  • Working in a temporary directory seems like the most straightforward solution. There are packages like [AcidFS](https://docs.pylonsproject.org/projects/acidfs/en/latest/) that implement transaction semantics for filesystem operations, but unless it's something you do many times or you have more complicated use cases I'd probably avoid the additional library and stick to the simple solution. – jdehesa Mar 03 '20 at 16:53
  • @jdehesa this is something I'm doing on the server (Django) concurrently, a few times a minute. My directory contains binary files (images) and AcidFS uses git as a backend, so it sounds like it's not what I need. – Boris Verkhovskiy Mar 03 '20 at 17:03
  • After reading more (https://rcrowley.org/2010/01/06/things-unix-can-do-atomically.html and https://stackoverflow.com/questions/307437/moving-a-directory-atomically), it sounds like I need to use symlinks instead of a directory and `mkdtemp` if I want updates to actually be atomic. – Boris Verkhovskiy Mar 03 '20 at 17:05
  • Yes, if the last directory name change needs to be atomic too, a symlink may do. You could also do your own rudimentary "transaction" system where, before moving dirs, you write the names of the dirs to move to a hidden file or something, delete old dir, rename new dir and delete transaction file. If you fail at some point in between you should be able to recover from there (or simply have a "lock" file during the transaction and if it exists on startup delete the tmp dir) – jdehesa Mar 03 '20 at 17:11
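
The symlink idea from the comments could be sketched roughly as follows. This assumes a POSIX filesystem and that the public path is already a symlink (or does not exist yet) rather than a real directory. The names `publish_version`, `build`, and the `version-` prefix are hypothetical. Each update builds a fresh directory, then renames a new symlink over the old one, which replaces it in a single step on POSIX.

```python
import os
import shutil
import tempfile


def publish_version(link_path, build):
    """Build a new version directory, then repoint the symlink at it.

    Returns the previously published directory (or None) so the caller can
    delete it once no reader is still using it.
    """
    link_path = os.path.abspath(link_path)
    parent = os.path.dirname(link_path)
    previous = os.path.realpath(link_path) if os.path.islink(link_path) else None
    # mkdtemp makes the directory accessible only to the creating user;
    # adjust permissions if another process has to read it.
    new_dir = tempfile.mkdtemp(dir=parent, prefix="version-")
    tmp_link = new_dir + ".link"
    try:
        build(new_dir)  # caller-supplied function fills new_dir; may raise
        os.symlink(new_dir, tmp_link)
        # Renaming one symlink over another swaps the target in one step.
        os.replace(tmp_link, link_path)
    except BaseException:
        # Nothing was published: drop the half-built version.
        if os.path.islink(tmp_link):
            os.unlink(tmp_link)
        shutil.rmtree(new_dir, ignore_errors=True)
        raise
    return previous
```

Readers always open files through `link_path`, so they see either the old version or the new one. Cleaning up old version directories is left to the caller here, since a concurrent request may still be reading from them.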

0 Answers