A great answer here explains how, in Ruby, to download a file without loading it into memory:

https://stackoverflow.com/a/29743394/4852737

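require 'open-uri'
# URI.open is required on Ruby 3.0+, where Kernel#open no longer handles URLs.
download = URI.open('http://example.com/image.png')
# Ruby does not expand '~' in paths, so expand it explicitly.
IO.copy_stream(download, File.expand_path('~/image.png'))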

How would I verify that the IO.copy_stream call to download the file was actually successful, meaning that the downloaded file is exactly the file I intended to download and not a half-downloaded, corrupt file? The documentation says that IO.copy_stream returns the number of bytes that it copied, but how can I know the number of bytes to expect when I have not downloaded the file yet?

joshweir
  • What does success mean to you? That the right number of bytes are on the disk? That the file is a valid file of its type? Something else? – Dave Schweisguth Mar 08 '16 at 03:13
  • @DaveSchweisguth Sorry, I thought it would be self-explanatory: that the downloaded file is an exact copy of the file at the URL. I would automatically think of cksum, but I know that isn't possible. So maybe the only option is to verify that the sizes of the files match? – joshweir Mar 08 '16 at 03:15
  • You should be careful here: servers are not required to return 'Content-Length' in the headers. – engineerGuido Dec 07 '16 at 11:57

1 Answer

OpenURI open returns an object which exposes the HTTP response's headers, so you can get the expected byte count from the Content-Length header and compare it to the return value of IO.copy_stream:

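require 'open-uri'
# URI.open is needed on Ruby 3.0+, where Kernel#open no longer handles URLs.
download = URI.open 'http://cdn.sstatic.net/stackoverflow/img/apple-touch-icon.png'
# Header values are strings (or nil if the header is absent), so convert before comparing.
content_length = download.meta['content-length']
bytes_expected = content_length && Integer(content_length)
bytes_copied = IO.copy_stream download, 'image.png'
if bytes_expected && bytes_expected != bytes_copied
  raise "Expected #{bytes_expected} bytes but got #{bytes_copied}"
end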

It would be surprising if open executed without error and this check still failed, but you never know.
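
As a rough sketch, the same check can be wrapped in a small helper that also copes with a missing Content-Length header, which servers are not required to send. The name copy_and_verify is hypothetical (it is not part of open-uri), and the helper assumes its source responds to meta the way the object returned by open-uri does. Header values arrive as strings, so the sketch converts the header to an integer before comparing.

```ruby
# Hypothetical helper, not part of open-uri: copies `source` to `path` and
# checks the byte count against the Content-Length header when one was sent.
# Assumes `source` responds to #meta like the object open-uri returns.
def copy_and_verify(source, path)
  header = source.meta['content-length']
  bytes_copied = IO.copy_stream(source, path)

  # Without a Content-Length header there is nothing to compare against.
  return bytes_copied if header.nil?

  # Header values are strings; convert before comparing with the Integer count.
  bytes_expected = Integer(header, 10)
  unless bytes_expected == bytes_copied
    raise "Expected #{bytes_expected} bytes but got #{bytes_copied}"
  end

  bytes_copied
end
```

It could be called as copy_and_verify(URI.open(url), 'image.png'). A matching byte count only rules out truncation; proving the bytes are identical would need a checksum published by the server to compare against.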

Dave Schweisguth