I need to decompress a *.bin file into a temp file, work on it in my program, and then compress it again. I already have a function that unpacks the *.bin file, so I can get the temp file my program needs. The function:

public void Unzlib_bin(string initial_location, string target_location)
{
    byte[] data, new_data;
    Inflater inflater;

    inflater = new Inflater(false);

    data = File.ReadAllBytes(initial_location);
    inflater.SetInput(data, 16, BitConverter.ToInt32(data, 8));
    new_data = new byte[BitConverter.ToInt32(data, 12)];
    inflater.Inflate(new_data);

    File.WriteAllBytes(target_location, new_data);
}

The question is how to pack the temp file back into its original state. I'm trying something like the following, but the result is wrong:

public void Zlib_bin(byte[] data, int length, string target_location)
{
    byte[] new_data;
    Deflater deflater;
    List<byte> compress_data;

    compress_data = new List<byte>();
    new_data = new byte[length];
    deflater = new Deflater();

    compress_data.AddRange(new byte[] { ... }); //8 bytes from header, it does not matter
    deflater.SetLevel(Deflater.BEST_COMPRESSION);
    deflater.SetInput(data, 0, data.Length);
    deflater.Finish();
    deflater.Deflate(new_data);
    compress_data.AddRange(BitConverter.GetBytes(new_data.Length));
    compress_data.AddRange(BitConverter.GetBytes(data.Length));
    compress_data.AddRange(new_data);

    File.WriteAllBytes(target_location, compress_data.ToArray());
}

Any ideas?

Vlad i Slav
  • What do you mean by "the result is wrong"? – MechMK1 May 23 '18 at 13:53
  • Binary data doesn't compress well. I would use Convert.ToBase64String(byte[]) – jdweng May 23 '18 at 13:54
  • @jdweng, I meant a file with a *.bin extension. Sorry for the mistake. – Vlad i Slav May 23 '18 at 14:28
  • @DavidStockinger, the resulting file doesn't equal the initial file. I load the initial file in my program, decompress and compress it, and get different bytes. – Vlad i Slav May 23 '18 at 14:41
  • Doesn't make a difference. I was assuming you were reading the file into a byte array. Once you have a byte array you can use ToBase64String and then, to reverse the process, FromBase64String. The methods have built-in compression, with the advantage that you can transmit results without binary data affecting modems, and the output is compatible with HTML. – jdweng May 23 '18 at 23:37
  • @jdweng, you don't understand my question. When decompressing with Inflate I get exactly the data my program needs. There is no point in converting the byte array into a string, because then the data will be incorrect. My question is how to use Deflate and get the same data as the original. – Vlad i Slav May 24 '18 at 02:27
  • If the compression tool is designed for graphic files you will never be able to get back to original results since the compression will reduce resolution as well as file size. Using Base64 strings will compress and get original results. – jdweng May 24 '18 at 04:57
  • @jdweng base 64 does not compress. In fact it does the opposite. base 64 output is one third larger than the input. Graphic compression methods don't change resolution. Not that this is relevant because nobody other than you is talking about graphics. – David Heffernan May 24 '18 at 06:35
  • Wrong again. When it packs it will reduce arrays of 0's and 1's. – jdweng May 24 '18 at 08:58
  • @jdweng Don't let your pride get in the way of undeniable facts. From Wikipedia: "Each base64 digit represents exactly 6 bits of data. Three 8-bit bytes (i.e., a total of 24 bits) can therefore be represented by four 6-bit base64 digits." – David Heffernan May 24 '18 at 14:08
  • Read the correct article at Wiki : https://en.wikipedia.org/wiki/Uuencoding – jdweng May 24 '18 at 14:15
  • @jdweng Wow, you are really smoking today! Even by your incredible standards. It was you that mentioned base64. Now you want to pretend that you were talking about uuencode?! Anyway, uuencode is no more compression than base64 is. It also takes 6 bits at a time and encodes them as a printable ASCII character. So again, uuencode will produce output one third larger than the input. – David Heffernan May 24 '18 at 14:19
  • Read the article "uuencode/decode became popular for sending binary (and especially compressed) files by e-mail and posting to Usenet newsgroups, etc." Compression is due to following : These bytes are subtracted from the line's so that the decoder does not append unwanted characters to the file. – jdweng May 24 '18 at 14:28
  • @jdweng That's talking about compressing a file, and then uuencoding it. From the wikipedia article that you referred to: "Uuencoding takes 3 pre-formatted bytes and turns them into 4 and also adds begin/end tags, filename, and delimiters. This adds at least 33% data overhead compared to the source alone, though this can be at least somewhat compensated for by compressing the file before uuencoding it." Instead of deciding that you are right, in spite of the evidence, why not think and be self-critical? – David Heffernan May 24 '18 at 14:29
  • Anyway, all this is rather pointless because base64 (not to mention uuencode) have nothing at all to do with the question that was asked. – David Heffernan May 24 '18 at 14:30

1 Answer

If by "the result is wrong" you mean that you are not able to compress back to exactly the same bytes, that is entirely to be expected. There is no guarantee that decompressing followed by compressing gives you the same thing, unless you are using exactly the same compression code, version of that code, and settings for that code.

Many different compressed data streams can represent the same uncompressed data, and a compressor is free to produce any of them, usually as a tradeoff between execution time, memory use, and compression ratio. That is why compressors have "levels" and other adjustments.
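
As a rough illustration of the point about levels, the sketch below compresses the same bytes at two different levels and compares the output. It assumes the question uses SharpZipLib (its `Inflater` and `Deflater` classes suggest so, but that is a guess). `LevelDemo`, `Compress`, and the sample input are made-up names for this example only.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using ICSharpCode.SharpZipLib.Zip.Compression; // assumption: SharpZipLib NuGet package

public static class LevelDemo
{
    // Hypothetical helper: deflate `data` at the given level into a zlib stream.
    public static byte[] Compress(byte[] data, int level)
    {
        var deflater = new Deflater(level);
        deflater.SetInput(data);
        deflater.Finish();

        using (var output = new MemoryStream())
        {
            var buffer = new byte[4096];
            while (!deflater.IsFinished)
            {
                // Deflate returns how many bytes it wrote into the buffer.
                int written = deflater.Deflate(buffer);
                output.Write(buffer, 0, written);
            }
            return output.ToArray();
        }
    }

    public static void Main()
    {
        // Arbitrary sample input, repeated so there is something to compress.
        byte[] input = Encoding.ASCII.GetBytes(
            string.Concat(Enumerable.Repeat("some sample text to compress; ", 200)));

        byte[] fast = Compress(input, Deflater.BEST_SPEED);
        byte[] best = Compress(input, Deflater.BEST_COMPRESSION);

        Console.WriteLine("BEST_SPEED:       " + fast.Length + " bytes");
        Console.WriteLine("BEST_COMPRESSION: " + best.Length + " bytes");
        Console.WriteLine("Identical bytes:  " + fast.SequenceEqual(best));
    }
}
```

Both outputs are valid zlib streams for the same input, so either one inflates back to the original bytes even when the compressed bytes differ.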

The only guarantee a lossless compressor gives is that when you compress and then decompress, you get exactly what you started with.

If decompressing what you got gives the same uncompressed data, then all is well.
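
One way to check this is to inflate what you just deflated and compare it with the input. The sketch below does that, again assuming SharpZipLib. It builds the container the way `Unzlib_bin` in the question reads it: 8 header bytes, compressed size at offset 8, uncompressed size at offset 12, zlib stream from offset 16. It stores the byte count returned by `Deflate`, whereas the question's `Zlib_bin` stores the length of the whole output buffer, which may account for part of the mismatch. `BinPacker`, `Pack`, `Unpack`, and the all-zero header are placeholders, not anything from the original program.

```csharp
using System;
using System.IO;
using System.Linq;
using ICSharpCode.SharpZipLib.Zip.Compression; // assumption: SharpZipLib NuGet package

public static class BinPacker
{
    // Layout assumed from Unzlib_bin: [0..7] header, [8..11] compressed size,
    // [12..15] uncompressed size, [16..] zlib stream.
    public static byte[] Pack(byte[] header8, byte[] data)
    {
        if (header8.Length != 8) throw new ArgumentException("header must be 8 bytes");

        var deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.SetInput(data);
        deflater.Finish();

        using (var compressed = new MemoryStream())
        using (var file = new MemoryStream())
        {
            var buffer = new byte[4096];
            while (!deflater.IsFinished)
            {
                // Keep only the bytes Deflate actually produced.
                int written = deflater.Deflate(buffer);
                compressed.Write(buffer, 0, written);
            }

            file.Write(header8, 0, 8);
            file.Write(BitConverter.GetBytes((int)compressed.Length), 0, 4);
            file.Write(BitConverter.GetBytes(data.Length), 0, 4);
            compressed.WriteTo(file);
            return file.ToArray();
        }
    }

    // Mirrors Unzlib_bin, but works on byte arrays instead of files.
    public static byte[] Unpack(byte[] file)
    {
        var inflater = new Inflater(false);
        inflater.SetInput(file, 16, BitConverter.ToInt32(file, 8));

        var result = new byte[BitConverter.ToInt32(file, 12)];
        int offset = 0;
        while (offset < result.Length && !inflater.IsFinished)
        {
            int read = inflater.Inflate(result, offset, result.Length - offset);
            if (read == 0) break; // no progress: stop rather than loop forever
            offset += read;
        }
        return result;
    }

    public static void Main()
    {
        // Arbitrary test data; the 8 zero bytes stand in for the real header.
        byte[] original = Enumerable.Range(0, 10000).Select(i => (byte)(i % 7)).ToArray();
        byte[] packed = Pack(new byte[8], original);
        Console.WriteLine("Round trip OK: " + original.SequenceEqual(Unpack(packed)));
    }
}
```

If the round trip reports success, the packed file is usable even though its compressed bytes need not match the original *.bin.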

Mark Adler