I am putting together a program that takes care of moving large files from one place to another. These files are usually 1 GB or larger and are incredibly important to us. We are a data acquisition company, so data is literally our product.
What I'd like to do is: calculate an MD5 hash (or use some other validation method) -> copy/move the file to its destination -> compare the MD5 (or other checksum) of the original and the copied file.
Since calculating the MD5 requires reading the whole file, I was wondering whether there is a way to combine it with the actual copy, so that the source file is read from beginning to end only once.
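To make the idea concrete, here is a rough sketch in Python of what I have in mind: read the source in fixed-size chunks, feed each chunk to the hash, and write it to the destination in the same loop, so the whole file is never held in memory. The function names `copy_with_md5` and `file_md5` and the 1 MiB chunk size are my own assumptions for illustration, not taken from any existing tool.

```python
import hashlib

# Assumed chunk size (1 MiB); only one chunk is held in memory at a time.
CHUNK_SIZE = 1024 * 1024


def copy_with_md5(src_path, dst_path, chunk_size=CHUNK_SIZE):
    """Copy src_path to dst_path, hashing the source bytes in the same pass.

    Returns the MD5 hex digest of the bytes that were read from the source.
    """
    digest = hashlib.md5()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            digest.update(chunk)  # hash the chunk...
            dst.write(chunk)      # ...and write the same chunk out
    return digest.hexdigest()


def file_md5(path, chunk_size=CHUNK_SIZE):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With something like this, the source would be read only once, but checking the copy would still mean reading the destination back once with `file_md5` and comparing the two digests.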
Also, the transfers will likely be from one network location to another, so if there is a faster/lighter (than MD5) way to validate both files are identical, please let me know! I'd like to prevent bogging down the network if I can.
P.S. It's important that the whole file not be stored in memory as some of them can get as big as 300 GB.