
I'm writing a Java application that compresses text with a Huffman tree.
The result I get is an array of ones and zeroes. Right now it's a byte[].

However, I want to pack eight of these ones and zeroes into a single byte. Otherwise the code "01" would take up two bytes, twice as much storage as the character "e" itself.

Right now I am using:

byte[] output = loadedTree.encode(text);

try(ByteOutputStream boas = new ByteOutputStream()) {
    boas.write(output);
    boas.writeTo(new FileOutputStream(basePath + name));
} catch (IOException e) {
    e.printStackTrace();
}

In the output array, each element is a (byte) 1 or a (byte) 0.

How can I write this data to a file when the number of bits is not a multiple of 8, e.g. 7 or 9?

Maxim
Tvde1
  • A byte is the smallest unit of information that can be read from or written to a file, so you'll have to pad to 8 bits – Maxim Mar 06 '18 at 18:03
  • How would I go about getting those 8 byte values, each of which really holds only one bit, to go into one byte? – Tvde1 Mar 06 '18 at 18:11

1 Answer


The simplest way to pack bits into a byte array is via BitSet:

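// Requires: import java.util.BitSet;
// Packs an array of 0/1 values into bytes, eight bits per byte.
// Bit i lands in byte i / 8 at bit position i % 8 (least significant bit first).
private static byte[] bitsToBytes(byte[] bits) {
  BitSet bitSet = new BitSet();
  for (int i = 0; i < bits.length; i++) {
    bitSet.set(i, bits[i] == 1);
  }
  // Trailing zero bytes are not included in the result.
  return bitSet.toByteArray();
}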

Note the use of bits[i] == 1 instead of just bits[i]. Passing the byte directly would select a different overload, set(int fromIndex, int toIndex), which takes two indices rather than an index and a boolean value. Here's how you can check the result:

// Requires: import java.util.Arrays;
public static void main(String[] args) {
  System.out.println(Arrays.toString(bitsToBytes(new byte[]{0})));                      // []
  System.out.println(Arrays.toString(bitsToBytes(new byte[]{1})));                      // [1]
  System.out.println(Arrays.toString(bitsToBytes(new byte[]{1, 1})));                   // [3]
  System.out.println(Arrays.toString(bitsToBytes(new byte[]{1, 0, 1})));                // [5]
  System.out.println(Arrays.toString(bitsToBytes(new byte[]{1, 1, 1, 1, 1, 1, 1})));    // [127]
  System.out.println(Arrays.toString(bitsToBytes(new byte[]{1, 1, 1, 1, 1, 1, 1, 0}))); // [127]
  System.out.println(Arrays.toString(bitsToBytes(new byte[]{1, 1, 1, 1, 1, 1, 1, 1}))); // [-1]
  System.out.println(Arrays.toString(bitsToBytes(new byte[]{0, 0, 1, 1, 0, 0, 0, 0}))); // [12]
  System.out.println(Arrays.toString(bitsToBytes(new byte[]{1, 1, 1, 1, 1, 1, 1, 1,
                                                            0, 0, 1, 1, 0, 0, 0, 0}))); // [-1, 12]
}
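The overload pitfall mentioned above can be seen directly in a small sketch. It also shows that toByteArray() drops trailing zero bits, so the packed array alone does not reveal how many bits went in. The class and method names here are hypothetical and exist only for illustration:

```java
import java.util.BitSet;

class BitSetOverloadDemo {
    // Passing a byte as the second argument resolves to
    // set(int fromIndex, int toIndex). The range [1, 1) is empty,
    // so no bit is set at all.
    static int cardinalityWithByteArg() {
        BitSet bitSet = new BitSet();
        byte one = 1;
        bitSet.set(1, one);
        return bitSet.cardinality();
    }

    // Comparing with 1 yields a boolean, so set(int bitIndex, boolean value)
    // is chosen and bit 1 is set as intended.
    static int cardinalityWithBooleanArg() {
        BitSet bitSet = new BitSet();
        byte one = 1;
        bitSet.set(1, one == 1);
        return bitSet.cardinality();
    }

    // Sixteen zero bits pack into an empty array: trailing zeros are dropped.
    static int packedLengthOfSixteenZeros() {
        BitSet bitSet = new BitSet();
        for (int i = 0; i < 16; i++) {
            bitSet.set(i, false);
        }
        return bitSet.toByteArray().length;
    }

    public static void main(String[] args) {
        System.out.println(cardinalityWithByteArg());      // 0
        System.out.println(cardinalityWithBooleanArg());   // 1
        System.out.println(packedLengthOfSixteenZeros());  // 0
    }
}
```

With a larger index the byte variant fails outright: set(2, (byte) 1) asks for the range [2, 1), which throws an IndexOutOfBoundsException.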

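The reverse conversion (as requested in the comments) in general needs the number of bits. The packed bytes alone cannot tell padding in the last byte apart from real zero bits, and toByteArray() omits trailing zero bytes entirely. Here's the code: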

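// Requires: import java.util.BitSet;
// Unpacks the first nbits bits of the packed bytes into an array of 0/1 values.
private static byte[] bytesToBits(int nbits, byte[] bytes) {
  BitSet bitSet = BitSet.valueOf(bytes);
  byte[] result = new byte[nbits];
  for (int i = 0; i < result.length; i++) {
    // get(i) returns false for any index beyond the stored bytes.
    result[i] = (byte) (bitSet.get(i) ? 1 : 0);
  }
  return result;
}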
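Since the bit count is needed to decode, one option is to store it in the file next to the packed bytes. The following is a minimal sketch under that assumption: the file layout (a 4-byte bit count followed by the packed bytes) and the names PackedBitsFile, writeBits and readBits are hypothetical, not part of the original code. It uses InputStream.readAllBytes(), which requires Java 9 or later.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.BitSet;

class PackedBitsFile {
    // Writes the bit count as a 4-byte int, then the packed bits.
    // The layout is an assumption made for this sketch.
    static void writeBits(Path file, byte[] bits) throws IOException {
        BitSet bitSet = new BitSet(bits.length);
        for (int i = 0; i < bits.length; i++) {
            bitSet.set(i, bits[i] == 1);
        }
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            out.writeInt(bits.length);
            out.write(bitSet.toByteArray());
        }
    }

    // Reads the bit count, then unpacks that many bits from the rest of the file.
    static byte[] readBits(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            int nbits = in.readInt();
            BitSet bitSet = BitSet.valueOf(in.readAllBytes());
            byte[] result = new byte[nbits];
            for (int i = 0; i < nbits; i++) {
                result[i] = (byte) (bitSet.get(i) ? 1 : 0);
            }
            return result;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("huffman", ".bin");
        byte[] bits = {1, 0, 1, 1, 0, 0, 0, 0, 0}; // 9 bits, not a multiple of 8
        writeBits(file, bits);
        System.out.println(Arrays.toString(readBits(file)));
        Files.delete(file);
    }
}
```

Both streams are opened in try-with-resources blocks, so the file is closed even if writing or reading fails.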
Maxim