Compress a file chunk by chunk - miniz

Question

I'm building a program that uses the [miniz][1] library to compress files of up to 3 GB. The computer that runs this program will also run another (heavy) application, so to keep RAM usage low I want the program to load one chunk of a file at a time (maximum chunk size: 0.5 GB), compress that chunk, and then proceed to the next chunk until all files are compressed.
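
A minimal sketch of the bounded-memory reading described above, using only the C++ standard library. The helper name `forEachChunk` and its callback signature are hypothetical and not part of miniz; the callback is where each chunk would be handed to whatever compressor is used.

```cpp
#include <cstddef>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper (not part of miniz): reads `path` in pieces of at most
// `chunkSize` bytes and passes each piece to `onChunk`. Only one chunk is held
// in memory at a time. Returns the total number of bytes read, or -1 if the
// file could not be opened or `chunkSize` is zero.
long long forEachChunk(const std::string& path, std::size_t chunkSize,
                       const std::function<void(const char*, std::size_t)>& onChunk)
{
    std::ifstream in(path, std::ios::binary);
    if (!in || chunkSize == 0)
        return -1;

    std::vector<char> buffer(chunkSize);  // the only large allocation
    long long total = 0;
    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = in.gcount();  // may be short on the last chunk
        if (got <= 0)
            break;
        onChunk(buffer.data(), static_cast<std::size_t>(got));
        total += got;
    }
    return total;
}
```

With a 10-byte file and a chunk size of 4, the callback would be invoked three times with 4, 4 and 2 bytes. Memory use is bounded by `chunkSize` regardless of the file size.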

Right now this does not work as I want. For example, if a file named problem.txt is divided into 10 chunks, I get 10 files named problem.txt in my zip archive. Obviously I want the chunks to be merged into a single entry instead of being split up in the zip.

Is this possible with miniz? The following text appears in the library source (the library consists of a single file), so I guess it is not possible, but I am asking anyway in case anyone has a solution or another approach that keeps the program from eating all the memory.

The ZIP archive API's where designed with simplicity and efficiency in mind, with just enough abstraction to
 get the job done with minimal fuss. There are simple API's to retrieve file information, read files from
 existing archives, create new archives, append new files to existing archives, or clone archive data from
 one archive to another. It supports archives located in memory or the heap, on disk (using stdio.h),
 or you can specify custom file read/write callbacks.

The program crashes with files larger than 0.9 GB.

[1]: https://code.google.com/p/miniz/

Please note that the program currently stores the whole file in the std::vector filesData; each element is one chunk of data. In the final version, only one chunk at a time will be read and held in memory. The problem in this version is that the library creates many files with the same name in the .zip, as described above.
Am I using the library wrongly right now? I open the files myself and store the data in the vector because I could not figure out how to make the library function open the file itself.

  for (i = 0; i < filesData.size(); ++i)
  {

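    // Pass an explicit "%s" format so '%' characters in the data or name are not interpreted as format specifiers.
    sprintf(data.at(i), "%s", filesData.at(i));
    sprintf(archive_filename, "%s", fileNames.at(i));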

    // Add a new file to the archive. Note this is an IN-PLACE operation, so if it fails your archive is probably hosed (its central directory may not be complete) but it should be recoverable using zip -F or -FF. So use caution with this guy.
    // A more robust way to add a file to an archive would be to read it into memory, perform the operation, then write a new archive out to a temp file and then delete/rename the files.
    // Or, write a new archive to disk to a temp file, then delete/rename the files. For this test this API is fine.
    status = mz_zip_add_mem_to_archive_file_in_place(s_Test_archive_filename, archive_filename, data.at(i), strlen(data.at(i)) + 1, s_pComment, (uint16)strlen(s_pComment), MZ_BEST_COMPRESSION);
    if (!status)
    {
      printf("mz_zip_add_mem_to_archive_file_in_place failed! 2\n");
      return EXIT_FAILURE;
    }
  }
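
One thing that may be worth checking in the loop above: the size passed to miniz comes from `strlen`, which stops at the first zero byte. If the chunks can contain binary data, that would silently shorten them. The sketch below illustrates the difference, assuming the chunk is kept in a `std::vector<char>` (an assumption, since the element type of `filesData` is not shown); `bytesBeforeFirstZero` is a hypothetical helper that mimics what `strlen` would report.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: the length strlen() would report for this buffer,
// i.e. the number of bytes before the first zero byte (or the whole buffer
// if it contains none).
std::size_t bytesBeforeFirstZero(const std::vector<char>& chunk)
{
    return static_cast<std::size_t>(
        std::find(chunk.begin(), chunk.end(), '\0') - chunk.begin());
}
```

For a chunk holding the five bytes `'P', 'K', 0, 3, 4`, this returns 2 while `chunk.size()` is 5. Tracking the byte count explicitly (for example `chunk.size()`) and passing that instead of `strlen(...) + 1` would avoid both the truncation and the extra terminator byte.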

I made a small test program (which fails).

#include "miniz.c"

#if defined(__GNUC__)
  // Ensure we get the 64-bit variants of the CRT's file I/O calls
  #ifndef _FILE_OFFSET_BITS
    #define _FILE_OFFSET_BITS 64
  #endif
  #ifndef _LARGEFILE64_SOURCE
    #define _LARGEFILE64_SOURCE 1
  #endif
#endif

typedef unsigned char uint8;
typedef unsigned short uint16;
typedef unsigned int uint;


int main(int argc, char *argv[])
{

    mz_zip_archive zip_archive;
    const char *s_Test_archive_filename = "__mz_example2_test__.zip";
    const char *s_pComment = "This is a comment";

    remove(s_Test_archive_filename);

    printf("%s\n", argv[1]);

    bool status = mz_zip_writer_add_file( (&zip_archive), s_Test_archive_filename, argv[1], s_pComment, (uint16)strlen(s_pComment), MZ_BEST_COMPRESSION  );

    if (!status)
    {
        printf("mz_zip_writer_add_file() failed!\n");
        return EXIT_FAILURE;
    }
    else
    {
            printf("success\n");
    }
    return 0;
}

I'm pretty sure miniz only reads a few kilobytes of the file it currently compresses into the RAM. Have you actually done benchmarks to verify that it consumes that much RAM? — fuz, Jul 06 '15 at 10:30
Also, please pick one of [C] and [C++]. These two languages are distinct. — fuz, Jul 06 '15 at 10:31
It seems like you are using miniz incorrectly. Let me have a look at the miniz API. — fuz, Jul 06 '15 at 13:27
