I'm trying to do the simplest thing here. I want to create a method that takes in a byte (char) array, inflates it using miniz's tinfl_decompress function, and then returns a byte array containing the inflated data.

First things first. The arrays given will never be bigger than 100 kB, and the vast majority will be smaller than 50 kB. Hence, I don't think I need to use any kind of buffer for it. Anyway, this is what I've got:

std::vector<unsigned char> unzip(std::vector<unsigned char> data)
{
    unsigned char *outBuffer = new unsigned char[1024 * 1024];

    tinfl_decompressor inflator;
    tinfl_status status;
    tinfl_init(&inflator);

    size_t inBytes = data.size() - 9;
    size_t outBytes = 1024 * 1024;

    status = tinfl_decompress(&inflator, (const mz_uint8 *)&data[9], &inBytes, outBuffer, (mz_uint8 *)outBuffer, &outBytes, 0);

    return ???
}

I know the output I want begins at the address held in outBuffer, but I don't know how long it is (I do happen to know it will be less than 1 MB), so I cannot pack it into a vector and send it on its way. I had hoped that outBytes would hold the size of the output, but it is set to 1 after the decompression. I know that decompression didn't fail, since the status returned is TINFL_STATUS_DONE (0).

Is this even the right way of doing it? This is a method that will be called a lot in my program, so I want something that is as fast as possible.

How do I get the vector out of it? Should I use a different data type? An array (the [] type)? The decompressed data will be read sequentially only once, after which it will be discarded.

EDIT:

It seems that the file I was trying to decompress was not in the proper format: it was a ZIP archive, but this function takes a zlib stream.
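A quick way to catch this kind of mismatch before calling the decompressor is to look at the first bytes of the input. The following is only a rough sketch: `guess_format` is a hypothetical helper (not part of miniz), and `offset` is an assumed parameter for inputs that carry a prefix, like the 9 bytes skipped in the code above. It relies on the usual ZIP local file header signature and on the zlib header rules from RFC 1950.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a miniz function): guesses the container format
// from the bytes starting at `offset`.
std::string guess_format(const std::vector<std::uint8_t>& data, std::size_t offset = 0)
{
    if (offset > data.size()) {
        return "unknown";
    }
    const std::size_t avail = data.size() - offset;
    const std::uint8_t* p = data.data() + offset;

    // ZIP archives normally begin with a local file header: 'P' 'K' 0x03 0x04.
    if (avail >= 4 && p[0] == 0x50 && p[1] == 0x4B && p[2] == 0x03 && p[3] == 0x04) {
        return "zip";
    }
    // zlib streams begin with CMF and FLG: the low nibble of CMF is 8 (deflate)
    // and the 16-bit value CMF * 256 + FLG is a multiple of 31.
    if (avail >= 2 && (p[0] & 0x0F) == 8 && ((p[0] << 8) | p[1]) % 31 == 0) {
        return "zlib";
    }
    // Could still be a raw deflate stream, which has no header to recognize.
    return "unknown";
}
```

For example, `guess_format(data, 9)` would inspect the same bytes that the call above hands to tinfl_decompress. A "zip" result means the archive structure has to be parsed first, rather than feeding the bytes straight into the inflater.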

Karlovsky120
  • Why don't you use `tinfl_decompress_mem_to_mem()`? You should take your input as `std::vector<unsigned char> const &data`. Once you figure out the size of the expanded data, you can `return std::vector<unsigned char>(outBuffer, outBuffer + length)` – Khouri Giordano May 09 '17 at 21:01
  • That's why I asked. But how do I figure out the size of the decompressed data? – Karlovsky120 May 09 '17 at 21:02
  • You tell it the address of the size of your output buffer and it modifies that value. – Khouri Giordano May 09 '17 at 21:06

1 Answer

Caveat: Totally untested code.

It should go something like this. Exchange

unsigned char *outBuffer = new unsigned char[1024 * 1024];

for

std::vector<unsigned char> outBuffer(1024 * 1024);

to get a vector. Then call tinfl_decompress, using the vector's data method to get at its underlying buffer. It should look something like

status = tinfl_decompress(&inflator, 
                          (const mz_uint8 *)&data[9], 
                          &inBytes, 
                          (mz_uint8 *)outBuffer.data(), 
                          (mz_uint8 *)outBuffer.data(), 
                          &outBytes, 
                          0);

Then resize the vector to the number of bytes actually written (outBytes), for convenience later.

outBuffer.resize(outBytes);

Note that resize does NOT release any memory: the vector's size shrinks to outBytes, but its capacity is still 1 MiB. If this is a problem, an additional call to std::vector::shrink_to_fit is required (bear in mind that it is a non-binding request).

Finally

return outBuffer;
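
Put together, the pattern is: allocate for the worst case, let the callee report how much it wrote, then resize. Here is a minimal sketch of just that pattern, using only the standard library. `fake_decompress` and `unpack` are hypothetical stand-ins, not miniz functions; the stand-in simply copies its input so the in/out size convention can be seen without miniz being involved.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a decompressor (NOT a miniz function).
// On entry *outSize is the capacity of `out`; on return it holds the
// number of bytes actually written.
void fake_decompress(const unsigned char* in, std::size_t inSize,
                     unsigned char* out, std::size_t* outSize)
{
    const std::size_t n = std::min(inSize, *outSize);
    if (n > 0) {
        std::memcpy(out, in, n); // the "decompression" here is just a copy
    }
    *outSize = n;
}

std::vector<unsigned char> unpack(const std::vector<unsigned char>& data)
{
    std::vector<unsigned char> outBuffer(1024 * 1024); // worst-case size
    std::size_t outBytes = outBuffer.size();           // in: capacity, out: bytes written

    fake_decompress(data.data(), data.size(), outBuffer.data(), &outBytes);

    outBuffer.resize(outBytes); // size() now matches the real payload
    outBuffer.shrink_to_fit();  // non-binding request to drop the spare capacity
    return outBuffer;
}
```

Taking the input by const reference avoids copying it on every call. Whether the shrink_to_fit call is worth it depends on how long the result lives; for data that is read once and thrown away it can probably be skipped.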
user4581301
  • I'm not getting valid output. Will this work for data compressed using SharpZipLib? Are those the same algorithms that are used? – Karlovsky120 May 09 '17 at 21:12
  • Couldn't tell you, but if both are following the ZIP specification, both should read each other's files. First confirm that an off-the-shelf utility can read the file. All I did here was apply a generic method for reading an unknown amount of data directly into a vector. – user4581301 May 09 '17 at 21:28
  • Okay, Windows 10 doesn't seem to have any problems reading the file; however, miniz seems to fail when reading it. Any ideas what might be the culprit? – Karlovsky120 May 09 '17 at 22:22
  • It's the line 1464 in miniz.c that messes it up: `if ((counter = (r->m_raw_header[0] | (r->m_raw_header[1] << 8))) != (mz_uint)(0xFFFF ^ (r->m_raw_header[2] | (r->m_raw_header[3] << 8)))) { TINFL_CR_RETURN_FOREVER(39, TINFL_STATUS_FAILED); }` – Karlovsky120 May 09 '17 at 22:22
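
For context on what that quoted line tests: in the DEFLATE format (RFC 1951), an uncompressed "stored" block starts with a 16-bit LEN followed by NLEN, its one's complement, both little-endian. The sketch below restates the same comparison as a standalone function. `stored_header_ok` is a hypothetical name, not something from miniz.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the quoted check: LEN (bytes 0-1) must be
// the one's complement of NLEN (bytes 2-3), both read little-endian.
bool stored_header_ok(const std::uint8_t h[4])
{
    const unsigned len  = h[0] | (h[1] << 8);
    const unsigned nlen = h[2] | (h[3] << 8);
    return len == (0xFFFFu ^ nlen);
}
```

A failure at this point suggests the inflater was handed bytes that are not the start of a deflate stream, which would be consistent with the EDIT in the question about the file being ZIP rather than zlib.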