I'm an intern doing research into whether using Brotli compression in a piece of software provides a performance boost over the current release, which uses GZip.
My task is to change anything using GZip to use Brotli compression instead. One function I need to replace checks whether a buffer contains data that was compressed using GZip. It does this by checking the two-byte signature at the beginning of the buffer:
    bool isGzipped() const
    {
        // Gzip file signature (0x1f8b)
        return
            (_bufferEnd >= _bufferStart + 2) &&
            (static_cast<unsigned char>(_bufferStart[0]) == 0x1f) &&
            (static_cast<unsigned char>(_bufferStart[1]) == 0x8b);
    }
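For anyone who wants to try this outside our class, here is a minimal standalone sketch of the same check. The name hasGzipSignature and the pointer-plus-size parameters are my own stand-ins for the _bufferStart and _bufferEnd members, not part of the real code:

```cpp
#include <cstddef>

// Hypothetical free-function version of the member check above.
// Returns true only if the buffer holds at least two bytes and
// those bytes are the gzip signature 0x1f 0x8b.
bool hasGzipSignature(const char* data, std::size_t size)
{
    return size >= 2 &&
           static_cast<unsigned char>(data[0]) == 0x1f &&
           static_cast<unsigned char>(data[1]) == 0x8b;
}
```

The casts to unsigned char matter because plain char may be signed, in which case 0x8b would compare as a negative value.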
I want to create a similar function, bool isBrotliEncoded(). Is there a similar quick check that can be done with Brotli-encoded buffers? I've had a look at the byte values for some of the compressed files that brotli produces, but I can't find a rule that holds for all of them. Some start with 0x5B, some with 0x1B, compression of empty files results in 0x06, and files that have been compressed multiple times start with a range of different values. The end of each file is also inconsistent.
The only way I know of to test if it is in the correct format is to attempt decompression and wait for an error, which defeats the purpose of doing this test.
So my question is: Does anyone know how to check if a buffer has been compressed with Brotli without attempting decompression and waiting for failure?