Using Ruby and Net::HTTP, is it possible to compare a local checksum against a remote file without downloading the entire file, given that I cannot control or add headers server-side?
I'm populating a disk with files using a class I've written around Net::HTTP, and I would like to save bandwidth by comparing the remote file against a SHA256 sum of my local file; I only want to download a remote file when my local copy doesn't match the remote version.
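For context, the local half of that comparison can be sketched as below. This is only an illustration, not my actual class: `local_sha256` is a hypothetical helper name, and it relies on nothing beyond Ruby's standard `digest` library.

```ruby
require 'digest'

# Hypothetical helper: SHA256 hex digest of a local file.
# Digest::SHA256.file reads the file in chunks, so large archives
# are not loaded into memory all at once.
def local_sha256(path)
  Digest::SHA256.file(path).hexdigest
end
```

The open question is what, if anything, the remote side can offer to compare this value against.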
Here are my assumptions:
The filenames may be the same, but the contents may differ.
The 'Last-Modified' date in the HTTP headers is not a good indication of a change: a
cp /dir_a/file1.tar /dir_b/file2.tar
results in identical checksums, but differing 'Last-Modified' times.
HTTP header ETags are not a good indicator: http://example.org/file1.tar and http://example.iana.org//file1.tar may have different ETags for the same file.
HTTP header ETags are not generated in any standard way: while EC2 uses md5sums to generate its ETags, other hosts may not. This makes it difficult to generate the same tag value locally.
Maintaining a hash/dictionary that maps each hostname to its ETag implementation is unwieldy and a bad approach.
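To make the ETag concern concrete, here is a minimal sketch of what a comparison would look like for a host whose ETag happens to be the plain MD5 hex digest of the file. The helper names `normalize_etag` and `etag_matches_local_md5?` are hypothetical, and the MD5 assumption holds only for some servers, which is exactly why this does not generalize.

```ruby
require 'digest'

# Strip the weak-validator prefix (W/) and the surrounding quotes from an ETag.
def normalize_etag(etag)
  etag.to_s.sub(/\AW\//, '').delete('"').downcase
end

# Returns true or false when the ETag looks like an MD5 hex digest,
# and nil when it does not (the tag is opaque and nothing can be concluded).
# ASSUMPTION: the server derives its ETag from the MD5 of the file contents.
def etag_matches_local_md5?(etag, path)
  tag = normalize_etag(etag)
  return nil unless tag.match?(/\A[0-9a-f]{32}\z/)
  tag == Digest::MD5.file(path).hexdigest
end
```

A tag that is merely MD5-shaped but computed some other way would produce a false mismatch, so the result is only trustworthy when the host's ETag scheme is known in advance.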
While I'm relatively certain that the server-side software would have to provide a facility for a file/tag/checksum comparison to accomplish this goal (e.g. a checksum field in the header or a separate look-up file), I would like confirmation of my assumptions before abandoning this pursuit. I've left out my existing code to avoid distraction, as I'm looking for how to approach the implementation.
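As a rough illustration of the approach I have in mind (not my existing code), a HEAD request retrieves only the headers, which can then be checked for anything comparable. The names `head_headers` and `remote_matches_local?` are hypothetical. The sketch assumes the server may send `Content-Length` and, optionally, a `Content-MD5` header (a Base64-encoded MD5 digest); many servers send no checksum header at all, in which case the result is inconclusive.

```ruby
require 'net/http'
require 'uri'
require 'digest'

# Fetch only the response headers with a HEAD request (no body is transferred).
def head_headers(url)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.head(uri.request_uri)
  end
end

# headers: anything that responds to [] with lowercase header names
# (a Net::HTTPResponse works, as does a plain Hash).
# Returns false on a definite mismatch, true on a checksum match,
# and nil when the headers do not carry enough information.
def remote_matches_local?(headers, local_path)
  remote_size = headers['content-length']
  return false if remote_size && remote_size.to_i != File.size(local_path)

  remote_md5 = headers['content-md5']
  return remote_md5.strip == Digest::MD5.file(local_path).base64digest if remote_md5

  nil # same size (or size unknown) and no checksum header: cannot tell
end
```

Usage would look like `remote_matches_local?(head_headers('http://example.org/file1.tar'), '/dir_a/file1.tar')`. A differing size proves the files differ, but an equal size proves nothing, so without a server-supplied checksum the `nil` case still means downloading the file.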