
Since expanding a large gzip file takes quite a long time (sometimes over half a minute), I'd like to know the expanded size before I start the expansion, for progress reporting. Is there a way of knowing it without actually expanding the file?

Update:

For files whose expanded size is 4 GB (2^32 bytes) or larger, there is no sure way of knowing the size without actually expanding the gzip file. For files whose expanded size is smaller than 4 GB, however, the expanded size is stored in the last 4 bytes of the gzip file (as a little-endian integer) and can be retrieved easily:

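# The last 4 bytes of a gzip file (ISIZE) hold the expanded size modulo 2**32, little-endian.
expanded_size = File.open(file_name, 'rb') do |f|
  f.seek(-4, IO::SEEK_END)     # jump to 4 bytes before the end of the file
  f.read(4).unpack('V').first  # 'V' = 32-bit unsigned integer, little-endian
end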
Svante
bryantsai
  • Duplicate: http://stackoverflow.com/questions/1965567/get-the-filesize-of-very-large-gz-file-on-a-64bit-platform/1965580#1965580 – pavium Jan 07 '10 at 07:13
  • Close to a duplicate, and the **correct answer** is contained here: [get the filesize of very large .gz file on a 64bit platform](http://stackoverflow.com/questions/1965567/get-the-filesize-of-very-large-gz-file-on-a-64bit-platform) and please **see Kevin's answer** - TL;DR: no, there is no way, because the ISIZE is "modulo 32bit". For files > 4G, you need to parse the whole file to learn the uncompressed length. – quetzalcoatl Mar 19 '17 at 11:08

2 Answers

Just want to close this question as I've found the solution. Have updated it above.

bryantsai

I don't think you can get an exact size, as that would require knowing the actual frequencies of various strings in a file, and you can't do that without scanning the file. Can you go into the actual decompression function and have it indicate how far through the input it is (vs how far into the output it is)?
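
The idea of reporting how far through the input the decompressor is can be sketched in Ruby as follows. This is only a minimal sketch under stated assumptions: `gunzip_with_progress` is a hypothetical helper name, not part of any library, and it assumes `Zlib::GzipReader` pulls compressed bytes from the underlying IO as it goes, so that `io.pos` approximates how much of the input has been consumed.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: expands the gzip stream read from +io+ and yields the
# fraction of the compressed input consumed so far after each output chunk.
# +total_compressed+ is the size of the compressed data in bytes.
def gunzip_with_progress(io, total_compressed, chunk_size = 64 * 1024)
  gz = Zlib::GzipReader.new(io)
  out = String.new
  # GzipReader#read(len) returns nil once the end of the stream is reached.
  while (chunk = gz.read(chunk_size))
    out << chunk
    # io.pos is the position in the *compressed* input, which is known up
    # front, unlike the expanded size.
    yield io.pos.to_f / total_compressed if block_given?
  end
  out
end

# Usage with an in-memory gzip stream (Zlib.gzip needs Ruby 2.4 or later).
data = 'hello world ' * 50_000
compressed = Zlib.gzip(data)
expanded = gunzip_with_progress(StringIO.new(compressed), compressed.bytesize) do |fraction|
  printf("%.0f%% of input consumed\n", fraction * 100)
end
```

Because the reader fetches input in buffered blocks, the reported fraction moves in steps and can run slightly ahead of the output actually produced; for a progress bar that is usually good enough, and it works regardless of the expanded size.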

Yuliy