
I'm trying to zip and unzip data in memory (so I cannot use the file system). In the sample below, the unzipped data comes back with padding ('\0' chars) appended after my original data.

What am I doing wrong?

    [Test]
    public void Zip_and_Unzip_from_memory_buffer() {
        byte[] originalData = Encoding.UTF8.GetBytes("My string");

        byte[] zipped;
        using (MemoryStream stream = new MemoryStream()) {
            using (ZipFile zip = new ZipFile()) {
                //zip.CompressionMethod = CompressionMethod.BZip2;
                //zip.CompressionLevel = Ionic.Zlib.CompressionLevel.BestSpeed;
                zip.AddEntry("data", originalData);
                zip.Save(stream);
                zipped = stream.GetBuffer();
            }
        }

        Assert.AreEqual(256, zipped.Length); // Just to show that the zip has 256 bytes which match with the length unzipped below

        byte[] unzippedData;
        using (MemoryStream mem = new MemoryStream(zipped)) {
            using (ZipFile unzip = ZipFile.Read(mem)) {
                //ZipEntry zipEntry = unzip.Entries.FirstOrDefault();
                ZipEntry zipEntry = unzip["data"];
                using (MemoryStream readStream = new MemoryStream()) {
                    zipEntry.Extract(readStream);
                    unzippedData = readStream.GetBuffer();
                }
            }
        }

        Assert.AreEqual(256, unzippedData.Length); // Why does my data have trailing '\0' chars, as if padded up to 256 bytes?
        Assert.AreEqual(originalData.Length, unzippedData.Length); // FAILS: the unzipped data has 256 bytes
        //Assert.AreEqual(originalData, unzippedData); // FAIL at index 9
    }
Luciano
    `MemoryStream` uses a byte array (the buffer) under the hood and grows it (i.e. doubles its size) whenever the data to be written does not fit. `readStream.GetBuffer();` gives you that whole buffer, including the unused part. – bitbonk Dec 19 '12 at 16:02

2 Answers


From MSDN:

"Note that the buffer contains allocated bytes which might be unused. For example, if the string "test" is written into the MemoryStream object, the length of the buffer returned from GetBuffer is 256, not 4, with 252 bytes unused. To obtain only the data in the buffer, use the ToArray method; however, ToArray creates a copy of the data in memory."

So you actually want to change the line `zipped = stream.GetBuffer();`

to the line `zipped = stream.ToArray();`
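To see the difference in isolation, here is a minimal sketch that leaves the zip library out entirely and only compares the two calls on a plain `MemoryStream`. The class name `BufferDemo` and the helper `Lengths` are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class BufferDemo
{
    // Writes payload into a fresh MemoryStream and returns
    // { GetBuffer().Length, ToArray().Length } for comparison.
    public static int[] Lengths(byte[] payload)
    {
        using (var stream = new MemoryStream())
        {
            stream.Write(payload, 0, payload.Length);

            byte[] raw = stream.GetBuffer(); // the whole internal array, unused capacity included
            byte[] exact = stream.ToArray(); // a copy of only the bytes actually written

            return new[] { raw.Length, exact.Length };
        }
    }

    public static void Main()
    {
        byte[] payload = Encoding.UTF8.GetBytes("My string"); // 9 bytes
        int[] lengths = Lengths(payload);

        Console.WriteLine("GetBuffer length: " + lengths[0]); // the capacity, never less than 9
        Console.WriteLine("ToArray length:   " + lengths[1]); // 9
    }
}
```

The first number is the stream's capacity, which is where the trailing '\0' bytes in the question come from; the second always equals `stream.Length`.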

Blachshma
    OK, I got it. To fix it we need to replace every call to `GetBuffer` with `ToArray`. In this sample there are 2 calls that must be replaced. Thanks. – Luciano Dec 19 '12 at 16:40
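For reference, an in-memory round trip with `ToArray` in both places might look like the sketch below. Note the swap: it uses the framework's `System.IO.Compression.ZipArchive` rather than the `ZipFile` class from the question, purely so the example runs without extra packages. The names `ZipRoundTrip`, `Zip` and `Unzip` are made up, and on .NET Framework the project is assumed to reference the `System.IO.Compression` assembly.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZipRoundTrip
{
    // Compresses data into a single-entry zip archive held in memory.
    public static byte[] Zip(byte[] data)
    {
        using (var stream = new MemoryStream())
        {
            // leaveOpen keeps the MemoryStream usable after the archive is disposed;
            // disposing the archive is what finishes writing the zip.
            using (var archive = new ZipArchive(stream, ZipArchiveMode.Create, leaveOpen: true))
            {
                ZipArchiveEntry entry = archive.CreateEntry("data");
                using (Stream entryStream = entry.Open())
                {
                    entryStream.Write(data, 0, data.Length);
                }
            }
            return stream.ToArray(); // exact bytes, no unused capacity
        }
    }

    // Extracts the "data" entry from an in-memory zip archive.
    public static byte[] Unzip(byte[] zipped)
    {
        using (var input = new MemoryStream(zipped))
        using (var archive = new ZipArchive(input, ZipArchiveMode.Read))
        using (Stream entryStream = archive.GetEntry("data").Open())
        using (var output = new MemoryStream())
        {
            entryStream.CopyTo(output);
            return output.ToArray(); // exact bytes again
        }
    }

    public static void Main()
    {
        byte[] original = Encoding.UTF8.GetBytes("My string");
        byte[] restored = Unzip(Zip(original));

        Console.WriteLine(restored.Length == original.Length);
        Console.WriteLine(Encoding.UTF8.GetString(restored));
    }
}
```

With `ToArray` on both the zipped and the unzipped stream, the restored array has the same length as the original, so the length assertion from the question would pass.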

I suspect the padding comes from `MemoryStream.GetBuffer()`. From the documentation:

http://msdn.microsoft.com/en-us/library/system.io.memorystream.getbuffer.aspx

Note that the buffer contains allocated bytes which might be unused. For example, if the string "test" is written into the MemoryStream object, the length of the buffer returned from GetBuffer is 256, not 4, with 252 bytes unused. To obtain only the data in the buffer, use the ToArray method; however, ToArray creates a copy of the data in memory.
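If the copy made by `ToArray` is a concern, one option is to keep using `GetBuffer` but pair it with `stream.Length`, so only the written portion is ever read. A rough sketch, assuming the consumer can accept an offset and a count (here a second `MemoryStream` over the same array); `NoCopyDemo` and `RoundTrip` are made-up names.

```csharp
using System;
using System.IO;
using System.Text;

public static class NoCopyDemo
{
    // Writes text into a MemoryStream, then reads it back through a second
    // stream that wraps only the used part of the first stream's buffer.
    public static string RoundTrip(string text)
    {
        byte[] payload = Encoding.UTF8.GetBytes(text);

        using (var writer = new MemoryStream())
        {
            writer.Write(payload, 0, payload.Length);

            byte[] buffer = writer.GetBuffer(); // no copy, but includes unused capacity
            int used = (int)writer.Length;      // number of bytes actually written

            // Wrap only the first 'used' bytes; the padding is never exposed.
            using (var reader = new MemoryStream(buffer, 0, used, writable: false))
            using (var textReader = new StreamReader(reader, Encoding.UTF8))
            {
                return textReader.ReadToEnd();
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip("My string"));
    }
}
```

This avoids the extra allocation, at the cost of having to carry the length around with the buffer. For small payloads like the one in the question, `ToArray` is simpler.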

Jason Whitted