I have around 500 zip files that I have to place in a directory for a process to pick up.

The process calculates the MD5 hash of each file before processing it. If the hash matches one it already knows about (stored in a database), it ignores the zip file and continues to the next one.

Each zip file contains two files:

  • An Excel document
  • An XML file with some metadata about the Excel document

As part of testing, I need to process all the zip files several times. Currently, so that the process does not ignore them, I just clear the hash values from the database before I run my tests.

Is there a way I can run the tests without having to clear the recorded hash values? I tried renaming the zip files, but that does not change the hash value.

Is there a quick way to change the files inside the zip files so that the hash value changes every time (using any tool on Unix or Windows)?

As an example, if I extract all the files (each pair in its own folder), is there a way I can make a small change to the XML and re-zip the files (assuming it would be trickier to update the Excel document)?

Thanks

ziggy
  • I don't see how any process you invent that modifies files so it will generate a new hash value **+** the time to actually calculate the hash value can be faster (or easier) than just an `update table set hashed_flag=false where ....` in a database would be. Time to move on to more pressing problems ;-) Good luck. – shellter Aug 20 '16 at 17:51
  • Yes, I see your point. For day-to-day development the developers clear the hashes. For formal end-to-end functional testing, clearing the hash table is cheating :) – ziggy Aug 21 '16 at 11:41
  • Glad to be shown to be wrong and that `-z` is a quick mod to a file. I didn't realize you meant Testing (with a capital T). I salute your vigilance! Good luck to all (I did upvote the answer) – shellter Aug 21 '16 at 12:43

1 Answer

You can change the archive comment of the zip file (the `-z` or `--archive-comment` option of `zip`). The comment is stored at the end of the zip file, and changing it does not alter the files stored inside:

    md5sum test.zip    # e5d3c08c0c45d1cbdd0760af0b29e8e0
    echo "test at $(date)" | zip --archive-comment test.zip
    md5sum test.zip    # b13fd1f3b75561eec68fdfde0c26a466
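
Since the question allows any tool on Unix or Windows, here is a rough Python sketch of the same idea applied to a whole directory, for machines where the `zip` command is not available. It uses only the standard library and swaps the `zip` CLI for `zipfile`, whose `ZipFile.comment` attribute holds the archive comment. The function names and the assumption that all archives sit in one folder are mine, not part of the question.

```python
import hashlib
import uuid
import zipfile
from pathlib import Path


def md5_of(path):
    """Return the hex MD5 digest of a file's bytes."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def stamp_archive_comment(path):
    """Overwrite the archive comment of one zip file with a fresh random tag.

    Opening in append mode and assigning `comment` rewrites the archive's
    trailing directory records on close; the stored members are not touched.
    """
    tag = f"test run {uuid.uuid4().hex}"
    with zipfile.ZipFile(path, mode="a") as zf:
        zf.comment = tag.encode("ascii")


def stamp_all(directory):
    """Stamp every *.zip directly inside `directory` (assumed flat layout)."""
    for path in sorted(Path(directory).glob("*.zip")):
        stamp_archive_comment(path)
```

Calling `stamp_all` on the input folder (the real path is whatever your process watches) before each test pass should give every archive a new MD5, while the Excel and XML members stay byte-for-byte identical. Note that any existing archive comment is replaced.
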
David Duponchel