
How can I tell whether a byte array contains a gzip stream? My application receives files from other applications through HTTP POST with Base64 encoding. Depending on the implementation of the application that delivers the files, the byte array decoded from the Base64 string may be gzipped. How can I recognize the gzipped arrays? I have found a method, but I think it will go wrong when someone uploads a zip file or a deliberately crafted "bad" zip file.

This is what I found and works, but can it be exploited in some way?

C#

public static bool IsGZip(byte[] arr)
{
    return arr.Length >= 2 && arr[0] == 31 && arr[1] == 139;
}

VB.NET

Public Shared Function IsGZip(arr As Byte()) As Boolean
    Return arr.Length >= 2 AndAlso arr(0) = 31 AndAlso arr(1) = 139
End Function

If IsGZip returns true, my application decompresses the byte array.

Mark Adler
Kees
  • This is a questionable approach: explicitly specifying the format is much safer, so consider a redesign. For example, is a `docx` (or other zipped document file format) considered "Zip" or a "single document"? – Alexei Levenkov Oct 14 '13 at 16:15
  • According to the [RFC for Gzip](http://www.gzip.org/zlib/rfc-gzip.html), a gzip stream always starts with the bytes 0x1F 0x8B, which is what that code snippet tests. It can give a false positive if someone uploads an arbitrary file that starts with those bytes but isn't actually gzip content. You would get a more reliable result if you also checked the CRC of the stream. The only foolproof way to tell is to actually try decompressing it. – vcsjones Oct 14 '13 at 16:18
  • The applications that deliver the files are external applications. They do send a flag in the POST data saying that the files are zipped, but we cannot rely completely on what they declare the file type to be. We have to find out for ourselves to ensure security. – Kees Oct 14 '13 at 16:21
  • @vcsjones: What if someone uploads a bad gzip stream? By bad I mean a seemingly never-ending stream that expands 10 bytes into 1 PB or more. – Kees Oct 14 '13 at 16:24
  • @KeesdeWit That's probably a DoS: your application will run out of memory trying to allocate that byte array, but that will probably happen before this function call is ever made. – vcsjones Oct 14 '13 at 16:25
  • @vcsjones: How can I prevent that? – Kees Oct 14 '13 at 16:49

1 Answer


Do what you're doing, check that the third byte is 8, and then try to gunzip it. That final step is the only way to really know. If the gunzip fails, then use the stream as is.
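
A minimal C# sketch of those steps might look like the following. The class name `GzipSniffer`, the method names, and the `MaxOutputBytes` cap are hypothetical and not part of the original post. The cap is an added assumption that addresses the oversized-output concern raised in the comments, and the value shown is arbitrary. Note also that, depending on the .NET runtime version, `GZipStream` may not report an error for a truncated stream, so treat this as a starting point rather than a complete validator.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class GzipSniffer
{
    // Hypothetical cap on decompressed size; choose a limit that suits your uploads.
    private const int MaxOutputBytes = 64 * 1024 * 1024;

    // Magic bytes 0x1F 0x8B (31, 139) followed by compression method 8 (deflate).
    public static bool LooksLikeGZip(byte[] arr)
    {
        return arr != null && arr.Length >= 3
            && arr[0] == 0x1F && arr[1] == 0x8B && arr[2] == 8;
    }

    // Returns the decompressed bytes if arr decompresses cleanly as gzip;
    // otherwise returns arr itself, unchanged.
    // Throws InvalidOperationException if the output would exceed MaxOutputBytes.
    public static byte[] GunzipOrPassThrough(byte[] arr)
    {
        if (!LooksLikeGZip(arr)) return arr;

        try
        {
            using (var input = new MemoryStream(arr))
            using (var gzip = new GZipStream(input, CompressionMode.Decompress))
            using (var output = new MemoryStream())
            {
                var buffer = new byte[81920];
                int read;
                while ((read = gzip.Read(buffer, 0, buffer.Length)) > 0)
                {
                    // Stop early instead of letting a crafted stream exhaust memory.
                    if (output.Length + read > MaxOutputBytes)
                        throw new InvalidOperationException("Decompressed data exceeds the size limit.");
                    output.Write(buffer, 0, read);
                }
                return output.ToArray();
            }
        }
        catch (InvalidDataException)
        {
            // The header matched but the content is not valid gzip: use the data as is.
            return arr;
        }
    }
}
```

Reading in fixed-size chunks keeps the size check ahead of any large allocation. The size-limit failure is deliberately a different exception type from `InvalidDataException`, so an oversized stream is surfaced to the caller rather than silently passed through as raw data.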

Mark Adler