
I am making a text file compression tool that first compresses the file using run-length encoding and then compresses it further with Huffman encoding.

After compressing the file, why do we store it as a .bin file?

Let's say the file to be compressed contains the following data:

"aaaaaaaaaabbbbbbbaaaaaaaaaabbbcccccccccccccddddddddddddddeeeeeeeeeeeeee"

I have read that our computer stores each character as 8 bits, but that after applying Huffman encoding we can store it in fewer bits.

So after applying Huffman, let's say we find that we can represent the character "a" as "000". If we store that code in a .txt file, each "0" is itself written as an 8-bit character, so we use 8*3 = 24 bits instead of the original 8 bits.

I think that's why we don't store it in .txt format. So my first question is: does a .bin file store the bits as they are, so that a 0 takes 1 bit and a 1 also takes 1 bit, instead of 8 bits each?
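
To make the first question concrete, here is a minimal Python sketch of the two ways the bits could be written. It assumes the "000" code for "a" from above; `pack_bits` is a hypothetical helper name, not part of any library, and padding the last byte with zeros is just one possible convention.

```python
def pack_bits(bit_string: str) -> bytes:
    """Pack a string of '0'/'1' characters into real bytes, 8 bits per byte."""
    # Pad with zeros on the right so the length is a multiple of 8 (assumed convention).
    padding = (8 - len(bit_string) % 8) % 8
    padded = bit_string + "0" * padding
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


code_for_a = "000"           # assumed Huffman code for "a"
bits = code_for_a * 10       # ten encoded "a" characters: 30 bits

as_text = bits.encode("ascii")   # every '0' or '1' becomes one 8-bit character
as_binary = pack_bits(bits)      # eight code bits share one byte

print(len(as_text))    # 30 bytes
print(len(as_binary))  # 4 bytes (30 bits rounded up to 32)
```

In this sketch the saving comes from how the bytes are built, not from the file extension. A decoder would also need to know how many padding bits were added to the last byte, which this sketch does not record.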

My other question is: if I combine run-length encoding with Huffman encoding, will it compress my file even more?
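
For reference, this is roughly what I mean by the run-length step, as a minimal sketch on the sample data above. The function name `run_length_encode` and the list-of-pairs output are my own assumptions for illustration, not a fixed format.

```python
from itertools import groupby


def run_length_encode(text: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into a (character, count) pair."""
    return [(ch, len(list(group))) for ch, group in groupby(text)]


sample = "aaaaaaaaaabbbbbbbaaaaaaaaaabbbcccccccccccccddddddddddddddeeeeeeeeeeeeee"
runs = run_length_encode(sample)

# The sample collapses into 7 runs: a, b, a, b, c, d, e.
print(len(runs))
print(runs)
```

The Huffman step would then be applied to the symbols produced here (characters and counts) rather than to the original characters.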

  • 4
    `.bin` is not a format, it's just a convention used to indicate that the file is binary and shouldn't be opened with a text editor. – Mark Ransom Jun 24 '21 at 16:10
  • 2
    `if I apply run-length encoding with Huffman encoding, will it compress my file even more` applying run-length encoding _before_ you apply Huffman may very well have good results for certain inputs, think black and white image files. However, applying run-length encoding on the Huffman output will prove unrewarding. – 500 - Internal Server Error Jun 24 '21 at 16:29
