I am currently working on a large project that may compress and decompress thousands of documents a day using zlib in C++. (Our implementation uses zlib 1.2.8.)

Our current implementation supports compressed files both with and without the zlib header, but a boolean "useZlibHeader" has to be set to say which case applies.

Our team was wondering whether there is instead a 100% reliable way to figure out if the header is present or not.

According to RFC 1950 (https://www.ietf.org/rfc/rfc1950.txt): "The FCHECK value must be such that CMF and FLG, when viewed as a 16-bit unsigned integer stored in MSB order (CMF*256 + FLG), is a multiple of 31."
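To make the check concrete, here is a minimal sketch of it in C++. The function name `looksLikeZlibHeader` is hypothetical and not part of zlib. Besides the FCHECK rule quoted above, it also tests the two CMF fields that RFC 1950 constrains (CM must be 8 for deflate, and CINFO must not exceed 7). It only inspects the first two bytes, so a `true` result means "could be a zlib header", not "is one".

```cpp
#include <cstdint>

// Hypothetical helper (not a zlib function): returns true if the two bytes
// could be the CMF and FLG bytes of an RFC 1950 zlib header.
bool looksLikeZlibHeader(std::uint8_t cmf, std::uint8_t flg)
{
    const unsigned cm    = cmf & 0x0Fu;        // compression method, low 4 bits
    const unsigned cinfo = (cmf >> 4) & 0x0Fu; // window size info, high 4 bits

    if (cm != 8u)   return false;              // only deflate (CM = 8) is defined
    if (cinfo > 7u) return false;              // CINFO above 7 is not allowed

    // FCHECK rule: CMF*256 + FLG must be a multiple of 31.
    const unsigned value = static_cast<unsigned>(cmf) * 256u + flg;
    return value % 31u == 0u;
}
```

For example, the common header bytes 0x78 0x9C pass this test, while the gzip magic bytes 0x1F 0x8B do not.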

This is a nice check, but we could still end up with a compressed file that has no header and whose first two bytes happen to make (CMF*256 + FLG) a multiple of 31.
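To get a rough feel for how often that could happen, the sketch below counts how many of the 65536 possible two-byte prefixes would pass the header test (CM = 8, CINFO at most 7, and the multiple-of-31 rule). The function name `countHeaderLikePrefixes` is made up for illustration. This treats the first two bytes as uniformly random, which real headerless deflate output is not, so the result is only an order-of-magnitude estimate.

```cpp
#include <cstdint>

// Hypothetical helper: counts the two-byte prefixes (out of 65536) that
// satisfy the RFC 1950 constraints on CMF and FLG.
unsigned countHeaderLikePrefixes()
{
    unsigned count = 0;
    for (unsigned value = 0; value < 65536u; ++value) {
        const unsigned cmf   = value >> 8;         // first byte
        const unsigned cm    = cmf & 0x0Fu;        // compression method
        const unsigned cinfo = (cmf >> 4) & 0x0Fu; // window size info

        // value is already CMF*256 + FLG, so test it directly against 31.
        if (cm == 8u && cinfo <= 7u && value % 31u == 0u) {
            ++count;
        }
    }
    return count;
}
```

Under that uniform assumption, about 0.1% of arbitrary prefixes look like a valid header, so the check rules out most headerless data but cannot rule out all of it.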

Is there a better way to detect whether the header is present? Is it possible that we wrongly detect a header and the decompression still succeeds without throwing an exception, outputting bad data?

Thank you

James

1 Answer

As a heuristic check, it will be unreliable and open to exploitation. I can conceive of someone generating a document whose headerless compressed output begins with bytes that look like a zlib header, and which would also produce a valid decompression stream if that header were treated as real.

In practice, the constraints on the data being transmitted may mitigate this, but it may still be dangerous.

mksteve