0

OK, I have a large folder with millions of binary files. It is possible that these files are altered by a remote process, and I need to know when that happens, BUT… I don't want to store a second copy of these files (inside a repo), and I cannot modify or alter these files (e.g. moving them and leaving a symlink in their place, à la git-annex).

So, ideally this tool would track the files via their metadata, i.e. their size, modification date, and a file hash. Then, when running the equivalent of git status, I could see which files have been modified (i.e. their metadata no longer matches). I could then stage and commit those changes, thus updating the metadata in the repo.
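To make that concrete, here is a rough sketch in Python of the workflow I have in mind. The manifest.json filename and the choice of SHA-256 are just placeholders of mine, not anything an existing tool provides:

```python
import hashlib
import json
import os

MANIFEST = "manifest.json"  # placeholder location for the stored metadata


def file_hash(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large binaries don't need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root):
    """Walk the tree and record size, mtime, and hash for every file."""
    meta = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            meta[path] = {
                "size": st.st_size,
                "mtime": st.st_mtime,
                "sha256": file_hash(path),
            }
    return meta


def status(root):
    """Equivalent of 'git status': list files whose metadata no longer matches."""
    with open(MANIFEST) as f:
        old = json.load(f)
    return [p for p, m in snapshot(root).items() if old.get(p) != m]


def commit(root):
    """Equivalent of 'git commit': overwrite the stored metadata with the current state."""
    with open(MANIFEST, "w") as f:
        json.dump(snapshot(root), f)
```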

Obviously, I would not be able to restore or rollback anything from this repo.

Is this possible?

Git-Annex seemed pretty close, but I can't have the originals moved and replaced with symlinks. I stumbled upon git-annex's "Direct Mode", which SEEMINGLY does what I want, but I'm concerned that it might be deprecated. Is that the best tool for this?

jaydisc
  • This sounds much more like a task for a database instead of git. You would store the metadata (date of creation/modification, file hash, and location) and then query the database whenever you need to access the file. This might help: http://www.makeuseof.com/tag/learn-sql-simple-database-sqlite/ – mattmilten Jun 16 '16 at 06:26
  • seems more like a question for SU. – Oliver Friedrich Jun 16 '16 at 07:24
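To illustrate the database approach suggested in the comments, here is a minimal sketch using Python's built-in sqlite3 module. The index.db filename, the table name, and the folder path are assumptions for illustration only:

```python
import hashlib
import os
import sqlite3


def sha256(path):
    # Stream the file in chunks so multi-gigabyte binaries don't need to fit in RAM.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def index_tree(con, root):
    """Store path, size, mtime, and hash for every file under root."""
    con.execute("CREATE TABLE IF NOT EXISTS files"
                " (path TEXT PRIMARY KEY, size INTEGER, mtime REAL, sha256 TEXT)")
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            con.execute("INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                        (path, st.st_size, st.st_mtime, sha256(path)))
    con.commit()


def changed(con):
    """Yield files whose size, mtime, or hash no longer match the index."""
    for path, size, mtime, digest in con.execute("SELECT * FROM files"):
        st = os.stat(path)
        if (st.st_size, st.st_mtime) != (size, mtime) or sha256(path) != digest:
            yield path


con = sqlite3.connect("index.db")   # hypothetical database file
index_tree(con, "/data/binaries")   # hypothetical folder to watch
print(list(changed(con)))
```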

1 Answer

2

This sounds like a perfect job for Tripwire. Tripwire takes a snapshot of your files with the metadata you mentioned and lets you know if those files are altered.

Oliver Friedrich