
I am working on a Huffman coding application in Java and I'm almost done. I have one problem though: I need to save a String such as "101011101010" to a file. My current code saves it as characters, so every 0 or 1 takes up a whole byte. I'm pretty sure it's possible to save every 0/1 as a single bit.

I already tried some things with BitSet and Integer.valueOf but I can't get them to work. This is my current code:

FileOutputStream fos = new FileOutputStream("encoded.bin");
fos.write(encoded.getBytes());
fos.close();

Where 'encoded' is a String such as "0101011101". If I try to save it as an integer, the leading 0 is dropped.

Thanks in advance!

EDIT: Huffman coding is a compression method, so the output file should be as small as possible.

Luud van Keulen
  • Why do you want to convert the string to an integer? Can't you save it as a string, leading 0s included? What exactly is your problem? – Patrick Sep 24 '16 at 21:44
  • Well, it's a compression method, so one 'a' or 'b' gets translated to something like 0110 (which is 4 bits, not 1 byte). The problem is that I'm saving each 1 and 0 as 1 byte, so there is no compression (it's even worse now). – Luud van Keulen Sep 24 '16 at 22:01

3 Answers

I think I found my answer. I put the 1's and 0's in a BitSet using the following code:

BitSet bitSet = new BitSet(encoded.length());
int bitcounter = 0;
for (char c : encoded.toCharArray()) {
    // Only '1' characters set a bit; '0' leaves the bit cleared.
    if (c == '1') {
        bitSet.set(bitcounter);
    }
    bitcounter++;
}

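After that I save it to the file using bitSet.toByteArray(). To read it back, I load the bytes from the file and convert them to a BitSet using BitSet.valueOf(bytes). Then I loop through the BitSet like this (note that BitSet.length() stops at the highest set bit, so trailing zeros of the original string are not restored unless the number of bits is stored separately):

StringBuilder binaryString = new StringBuilder();
// length() is the index of the highest set bit plus one, so the bound is < and not <=
for (int i = 0; i < set.length(); i++) {
    binaryString.append(set.get(i) ? '1' : '0');
}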
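As a rough sketch of the full round trip, the BitSet approach could be combined with a stored bit count so that trailing zeros survive. The class name `BitSetCodec`, the method names, and the 4-byte length header are assumptions made for illustration, not part of the original answer; the header costs four extra bytes per file.

```java
import java.nio.ByteBuffer;
import java.util.BitSet;

final class BitSetCodec {
    /** Packs a string of '0'/'1' characters into a 4-byte bit count followed by the bits. */
    static byte[] encode(String bits) {
        BitSet bitSet = new BitSet(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                bitSet.set(i);
            }
        }
        byte[] packed = bitSet.toByteArray();
        ByteBuffer buf = ByteBuffer.allocate(4 + packed.length);
        buf.putInt(bits.length()); // assumed header: BitSet itself forgets trailing zeros
        buf.put(packed);
        return buf.array();
    }

    /** Reverses encode(): reads the bit count, then rebuilds the '0'/'1' string. */
    static String decode(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int bitCount = buf.getInt();
        BitSet bitSet = BitSet.valueOf(buf); // uses the bytes remaining after the header
        StringBuilder sb = new StringBuilder(bitCount);
        for (int i = 0; i < bitCount; i++) {
            sb.append(bitSet.get(i) ? '1' : '0');
        }
        return sb.toString();
    }
}
```

The byte array from `encode` can be passed to `fos.write(...)` as in the question, and the bytes read back from the file can be handed to `decode`.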

Thanks to everyone who helped me.

Luud van Keulen

Files store whole bytes, so bits can only be written in multiples of eight. You can solve this problem by chopping the string into eight-bit chunks, converting each chunk to a byte using (byte) Integer.parseInt(eightCharString, 2), and adding the bytes to a byte array. (Byte.parseByte(eightCharString, 2) throws a NumberFormatException for chunks whose first bit is 1, such as "10000000", because the value exceeds 127.)

  • Compute the length of the byte array by dividing the length of your bit string by eight, rounding up
  • Allocate an array of bytes of that length
  • Run a loop that takes substrings from the string at positions that are multiples of eight, padding the last chunk with trailing zeros if it is shorter than eight characters
  • Parse each chunk, and put the result into the corresponding byte
  • Call fos.write() on the byte array
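The steps above could be sketched roughly as follows. The class name `BitStringPacker` and the method `pack` are hypothetical names chosen for this illustration. The sketch uses `Integer.parseInt` with a cast so that chunks starting with 1 fit into a byte, and it left-aligns a short final chunk so that the padding zeros end up at the end.

```java
final class BitStringPacker {
    /** Packs a string of '0'/'1' characters into bytes, eight characters per byte. */
    static byte[] pack(String bits) {
        int byteCount = (bits.length() + 7) / 8; // divide by eight, rounding up
        byte[] out = new byte[byteCount];
        for (int i = 0; i < byteCount; i++) {
            int start = i * 8;
            int end = Math.min(start + 8, bits.length());
            int value = Integer.parseInt(bits.substring(start, end), 2);
            // A final chunk shorter than eight bits is shifted left,
            // which is the same as padding it with trailing zeros.
            value <<= 8 - (end - start);
            out[i] = (byte) value;
        }
        return out;
    }
}
```

With the question's stream, the result would be written with `fos.write(BitStringPacker.pack(encoded))`. The reader still needs to know how many bits were real data, for example by storing the bit count alongside the bytes.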
Sergey Kalinichenko

Try this.

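String encoded = "0101011101";
// Pad with trailing zeros up to the next multiple of 8 (nothing is added if already aligned).
int pad = (8 - encoded.length() % 8) % 8;
String s = encoded + "00000000".substring(0, pad);
try (FileOutputStream fos = new FileOutputStream("encoded.bin")) {
    for (int i = 0, len = s.length(); i < len; i += 8)
        fos.write(Integer.parseInt(s.substring(i, i + 8), 2)); // write(int) stores the low 8 bits
}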
  • If I'm not mistaken, this gets the decimal value of the binary string, right? It works when I save it, but when I read it back the leading zeros are removed. For instance, `Integer.parseInt("00001111", 2)` returns 15, but `Integer.toBinaryString(15)` returns 1111. – Luud van Keulen Sep 24 '16 at 21:49
  • You can restore the leading zeros by padding: `String s = "00000000" + Integer.toBinaryString(15);` and then `String decoded = s.substring(s.length() - 8);` –  Sep 24 '16 at 21:59
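For reading the file back, the padding idea from the comment above might be applied to every byte, roughly as sketched here. The class name `BitStringUnpacker`, the method `unpack`, and the `bitCount` parameter are assumptions for illustration; the thread does not say how the original bit count is stored, so the caller has to supply it. Masking with `0xFF` matters because Java bytes are signed, and without it `Integer.toBinaryString` would return 32 characters for negative values.

```java
final class BitStringUnpacker {
    /** Expands bytes into a '0'/'1' string, keeping leading zeros, truncated to bitCount bits. */
    static String unpack(byte[] data, int bitCount) {
        if (bitCount < 0 || bitCount > data.length * 8) {
            throw new IllegalArgumentException("bitCount out of range: " + bitCount);
        }
        StringBuilder sb = new StringBuilder(data.length * 8);
        for (byte b : data) {
            String bin = Integer.toBinaryString(b & 0xFF); // mask avoids sign extension
            for (int i = bin.length(); i < 8; i++) {
                sb.append('0'); // restore the leading zeros toBinaryString drops
            }
            sb.append(bin);
        }
        sb.setLength(bitCount); // discard the padding bits after the real data
        return sb.toString();
    }
}
```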