
I understand that GZIP is a combination of LZ77 and Huffman coding and can be configured with a level between 1 and 9, where 1 is the fastest compression method (least compression) and 9 is the slowest (best compression).
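
To make that tradeoff concrete, here is roughly what I mean, sketched with Python's gzip module (the sample data is made up and the timings are obviously machine-dependent):

    import gzip
    import time

    # Made-up, highly repetitive sample data; real assets will behave differently.
    data = b"The quick brown fox jumps over the lazy dog. " * 200_000

    for level in (1, 6, 9):
        start = time.perf_counter()
        compressed = gzip.compress(data, compresslevel=level)
        elapsed = time.perf_counter() - start
        print(f"level {level}: {len(compressed):>9} bytes in {elapsed * 1000:6.1f} ms")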

My question is: does the choice of level only affect the compression process, or is there also an additional cost incurred during decompression depending on the level used to compress?

I ask because many web servers will typically GZIP responses on the fly if the client supports it, e.g. Accept-Encoding: gzip. I appreciate that when doing this on the fly, a level such as 6 might be a good choice for the average case, since it gives a good balance between speed and compression.

However, if I have a bunch of static assets that I can GZIP just once ahead of time - and never need to do this again - would there be any downside to using the highest but slowest compression level? I.e. is there now an additional overhead for the client that would not have been incurred had a lower compression level been used?
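
For context, the one-off precompression step I have in mind is something along these lines (a rough sketch; the dist directory and file-type list are just placeholders for whatever the build produces):

    import gzip
    from pathlib import Path

    # Hypothetical build-output directory containing the static assets.
    ASSET_DIR = Path("dist")

    for asset in ASSET_DIR.rglob("*"):
        # Only precompress text-based assets; images are usually already compressed.
        if not asset.is_file() or asset.suffix not in {".html", ".css", ".js", ".svg"}:
            continue
        # Slowest level, but this cost is paid once at build time, never per request.
        compressed = gzip.compress(asset.read_bytes(), compresslevel=9)
        asset.with_name(asset.name + ".gz").write_bytes(compressed)

The server would then serve these prebuilt .gz files directly (for example via nginx's gzip_static module) instead of compressing on the fly.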

David
  • While the answers below provide data that is likely accurate for real-world usage today, a couple of add-on points here. First, my understanding of zlib (including gzip) is that most of the compression time goes into wasted guesses about what will compress well, which is a cost the decompressor doesn't face, so the slowness is only on the compression side. But the answer may well vary by algorithm and by other details like hardware speed, e.g. how fast the CPU compresses/decompresses, and how much smaller compressed data reduces the time cost when other hardware handles the (compressed) data. – TOOGAM Dec 22 '22 at 07:22

3 Answers


Great question, and an underexposed issue. Your intuition is solid – for some compression algorithms, choosing the maximum compression level can require more work from the decompressor when the data is unpacked.

Luckily, that's not true for gzip – there's no extra overhead for the client/browser when decompressing more heavily compressed gzip files (e.g. compressed at level 9 instead of 6, assuming the standard zlib codebase that most servers use). The best measure for this is the decompression rate, here in units of MB/sec, while also monitoring overhead such as memory and CPU use. Simply going by decompression time is no good, because the file is smaller at higher compression settings, and a stopwatch alone doesn't control for that.
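
This isn't the methodology of the linked benchmarks, just a minimal sketch of that kind of measurement using Python's zlib; the payload is synthetic, so the absolute numbers mean nothing and only the level-to-level comparison does. The rate is normalized to the uncompressed size so the smaller compressed input doesn't skew it:

    import time
    import zlib

    # Synthetic, fairly compressible payload (~45 MB); real web assets will differ.
    data = b"The quick brown fox jumps over the lazy dog. " * 1_000_000

    for level in (1, 6, 9):
        compressed = zlib.compress(data, level)
        start = time.perf_counter()
        restored = zlib.decompress(compressed)
        elapsed = time.perf_counter() - start
        assert restored == data
        # Rate normalized to the uncompressed size, so levels are directly comparable.
        rate_mb_s = len(data) / elapsed / 1_000_000
        print(f"level {level}: {len(compressed):>10} compressed bytes, "
              f"decompressed at {rate_mb_s:,.0f} MB/s")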

  1. gzip decompression quickly gets asymptotic in terms of both time-to-decompress and memory usage once you get past level 6 compressed content. The time-to-decompress flatlines for levels 7, 8, and 9 in the test results linked by Marcus Müller, though that's coarse-grained data given in whole seconds.

  2. You'll also notice in those results that the memory requirements for decompression are flat for all levels of compression at 0.1 MiB. That's almost unbelievable, just a degree of excellence in software that we rarely see. Mark Adler and colleagues deserve massive props for what they achieved. gzip is a very nice format.

The memory use gets at your question about overhead. There really is none. You don't gain much with level 9 in terms of browser decompression speed, but you don't lose anything.

Now, check out these test results for a bit more texture. You'll see that the gzip decompression rate is slightly higher with level 9 compressed content than with lower levels (at level 9, the decompression rate is about 0.9% higher than at level 6, for example). That is interesting and surprising – I wouldn't have expected the rate to increase. It was just one set of test results, though, so it may not hold for other scenarios (and the difference is quite small in any case).

Parting note: precompressing static files is a good idea, but I don't recommend gzip at level 9. You'll get smaller files than gzip-9 by instead using zopfli or libdeflate. Zopfli is a well-established gzip-compatible compressor from Google. libdeflate is newer but quite excellent; in my testing it consistently beats gzip-9, but still trails zopfli. You can also use 7-Zip to create gzip files, and it will consistently beat gzip-9. (In the foregoing, gzip-9 refers to the canonical gzip/zlib implementation that Apache and nginx use.)

LearningFast
  • Do you happen to know if Brotli decompression also isn't negatively impacted by a higher compression level? – jonsploder May 09 '22 at 01:19

No, there is no downside on the decompression side when using the maximum compression level. In fact, there is a slight upside, in that better-compressed data decompresses faster. The reason is simply that there are fewer compressed bits for the decompressor to process.

Mark Adler

Actually, in real-world measurements a higher compression level yields lower decompression times (which might be primarily because there is less data to read from permanent storage and less RAM access to perform).

And since most things a client does with the data are rather expensive compared to gunzipping it, you shouldn't really worry about this at all.

Also be advised that for static assets that are images, Huffman/zlib coding is usually already applied (PNG simply uses zlib!), so you won't gain much by gzipping them. In fact, small images (icons, for example) often fit into a single TCP packet anyway (ignoring the HTTP header, which is sometimes bigger than the image itself), so you don't get any speed gain – you only save money on transfer volume, which matters if you deliver terabytes of small images. Now, may I presume you're not Google itself...
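
A quick way to convince yourself of that (a rough sketch in Python; a zlib-compressed blob stands in for a PNG's already-deflated image data):

    import gzip
    import zlib

    # Stand-in for a PNG: its image data is already deflate-compressed (via zlib).
    raw = b"some repetitive pixel-ish data " * 100_000
    already_compressed = zlib.compress(raw, 9)

    # Running gzip -9 on top of that saves essentially nothing (it can even grow).
    regzipped = gzip.compress(already_compressed, compresslevel=9)

    print(f"zlib-compressed payload: {len(already_compressed)} bytes")
    print(f"after gzip -9 on top:    {len(regzipped)} bytes")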

Also, I'd like to point you to higher-level optimizations, like tools that transform your JavaScript code into a more compact shape (e.g. removing whitespace, renaming private variables from my_mother_really_likes_this_number_of_unicorns to m1); also, libraries like jQuery already come in a "precompressed" (minified) form. The same exists for HTML. It doesn't make things easier to debug, but since you seem to be interested in ultimate space savings...

Marcus Müller
  • Thanks for the response and for the link to the benchmarks. I'm already using minification for all static assets as part of my Grunt build :) And quite right, I'm not Google, and the cost of bandwidth isn't really a concern either. But since I'm gzipping my text-based assets anyway (HTML/CSS/JS), it isn't any more effort to use the -9 flag instead of -6 as part of the build, apart from adding a few seconds to it. My main concern was whether doing so would put any extra burden on the client that has to decompress the content. Based on those results it would seem not - in fact it could be even faster. – David Feb 11 '15 at 11:32
  • You'd actually have to distinguish between "no burden" and "faster", because any OS will switch processes or threads when a gzip-decoding thread makes a file read or network receive call that can't be answered with data immediately. So although the larger files might take longer to decode, that might not be due to CPU time but to increased time the decoder spends waiting for data – which is time the computer running the client software can spend doing something else. But for all practical purposes, this shouldn't be measurably worse for clients (unless you try to port Chrome to a calculator). – Marcus Müller Feb 11 '15 at 12:01
  • The reason decompression is faster at a higher compression level on the same input data is very simply due to the fact that the decompressor has less input data that it has to process. – Mark Adler Feb 11 '15 at 21:51
  • @MarkAdler uff I just realized that you're *that* Adler. I'm humbled by the fact that you read my answer on compression! – Marcus Müller May 24 '18 at 09:22
  • @RonJohn thanks for the heads up! Replaced it with an archive.org snapshot. – Marcus Müller Feb 14 '21 at 16:22