8

I am looking for an efficient way to convert a BitSet to a binary string. Its typical length would be thousands of bits.

For example, given this:

BitSet bits = new BitSet(8);
bits.set(1);
bits.set(3);

And this is the desired result:

String result = toBinaryString(bits);
// expected: result = "01010000"

I have some ideas in general (streams, etc.), but there might be some obvious standard method I am just missing.

Kedar Mhaswade
voho

3 Answers

7

So this is the most efficient way I have tried so far:

private static class FixedSizeBitSet extends BitSet {
    private final int nbits;

    public FixedSizeBitSet(final int nbits) {
        super(nbits);
        this.nbits = nbits;
    }

    @Override
    public String toString() {
        final StringBuilder buffer = new StringBuilder(nbits);
        IntStream.range(0, nbits).mapToObj(i -> get(i) ? '1' : '0').forEach(buffer::append);
        return buffer.toString();
    }
}

Or another way, using more streams:

@Override
public String toString() {
    return IntStream
            .range(0, nbits)
            .mapToObj(i -> get(i) ? '1' : '0')
            .collect(
                    () -> new StringBuilder(nbits),
                    (buffer, characterToAdd) -> buffer.append(characterToAdd),
                    StringBuilder::append
            )
            .toString();
}
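
For comparison, a plain loop is another option. The sketch below is not benchmarked here; it only avoids boxing a Character per bit (which mapToObj does) by filling a char array and visiting the set bits with BitSet.nextSetBit. The class name BitSetStrings and the helper toBinaryString are hypothetical names chosen for illustration, and the intended length is passed in explicitly, as with the nbits field above.

```java
import java.util.Arrays;
import java.util.BitSet;

class BitSetStrings {

    // Hypothetical helper (not part of the JDK): renders bits 0..nbits-1
    // in index order, so bit 0 is the leftmost character.
    static String toBinaryString(BitSet bits, int nbits) {
        char[] chars = new char[nbits];
        Arrays.fill(chars, '0');
        // nextSetBit returns -1 once there are no further set bits
        for (int i = bits.nextSetBit(0); i >= 0 && i < nbits; i = bits.nextSetBit(i + 1)) {
            chars[i] = '1';
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(8);
        bits.set(1);
        bits.set(3);
        System.out.println(toBinaryString(bits, 8)); // prints 01010000
    }
}
```

Set bits at or beyond nbits are ignored, so the result always has exactly nbits characters.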
voho
0
String bitsStr = String.format("%8s", Integer.toBinaryString(bits.toByteArray()[0] & 0xFF)).replace(' ', '0');
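
As a brief explanation of how this one-liner works: it takes the first byte of bits.toByteArray(), masks it to an unsigned value, converts it with Integer.toBinaryString, and left-pads the result with zeros to 8 characters. Some caveats, shown in the sketch below (the class and method names are hypothetical wrappers for illustration): it only covers the first 8 bits, it prints the most significant bit first (so bit 0 is the rightmost character, the mirror image of the "01010000" asked for in the question), and an empty BitSet makes toByteArray() return an empty array, so indexing [0] throws ArrayIndexOutOfBoundsException.

```java
import java.util.BitSet;

class OneLinerDemo {

    // The one-liner from this answer, wrapped in a method for illustration.
    static String firstByteToBinaryString(BitSet bits) {
        return String.format("%8s", Integer.toBinaryString(bits.toByteArray()[0] & 0xFF))
                .replace(' ', '0');
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(8);
        bits.set(1);
        bits.set(3);
        // Bits 1 and 3 form the byte value 0b00001010 (decimal 10)
        System.out.println(firstByteToBinaryString(bits)); // prints 00001010
    }
}
```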
Brian Tompsett - 汤莱恩
    Please add code to a code block or tag it between backticks ` to format it correctly. Also, explain how the code works to answer the question – tshimkus Feb 24 '19 at 14:38
0

Update:

It turns out that the real culprit here is BitSet.valueOf, which does not work the way you might expect. I'm leaving my original post below because it could be valuable to someone.

The important things to know are that BitSet.valueOf(byte[]) reads its input as little-endian (bit 0 of the set is the least significant bit of the first byte), and that BitSet.toByteArray() drops trailing zero bytes, so the array you get back can be shorter than the one you passed in.

import java.util.Arrays;
import java.util.BitSet;

public class MoreBitSetExperiments {

  public static void main(String[] args) {

    // create 10000000 00000001 00000000 as bit set using set() method
    int bits = 24;
    BitSet bitSet1 = new BitSet(bits);
    bitSet1.set(0);
    bitSet1.set(15);

    // create 10000000 00000001 00000000 as bit set using valueOf() method
    BitSet bitSet2 = BitSet.valueOf(new byte[] {(byte)0b10000000, (byte)0b00000001, (byte)0b00000000});

    System.out.println("bitSet1.toByteArray(): " + Arrays.toString(bitSet1.toByteArray()));
    System.out.println("bitSet2.toByteArray(): " + Arrays.toString(bitSet2.toByteArray()));

    String binary1 = "";
    String binary2 = "";
    for (int i = 0; i < bits; i++) {
      binary1 = binary1 + (bitSet1.get(i) ? 1 : 0);
      binary2 = binary2 + (bitSet2.get(i) ? 1 : 0);
    }
    System.out.println("bitSet1 binary: " + binary1);
    System.out.println("bitSet2 binary: " + binary2);
  }
}

Output:

bitSet1.toByteArray(): [1, -128]
bitSet2.toByteArray(): [-128, 1]
bitSet1 binary: 100000000000000100000000
bitSet2 binary: 000000011000000000000000

To get back to the original question: as long as you create your bit set by calling BitSet.set(), you can get a binary string with the bits in the order you expect. If you create it with BitSet.valueOf(), the bits within each byte will come out reversed, because valueOf() maps the least significant bit of each byte to the lowest index. Also, since a BitSet does not record how many zero bytes belong at its end, any code that converts it to binary needs to be told the intended output length in bytes.
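
If the bytes are written most-significant-bit first and you want the BitSet indices to follow that same left-to-right order, one option is to build the set yourself instead of calling BitSet.valueOf(). The following is only a sketch under that assumption; the class MsbFirstBitSets and the helper fromBytesMsbFirst are hypothetical names, not JDK API.

```java
import java.util.BitSet;

class MsbFirstBitSets {

    // Hypothetical helper: index 0 of the result is the most significant
    // bit of bytes[0], index 7 its least significant bit, and so on.
    static BitSet fromBytesMsbFirst(byte[] bytes) {
        BitSet result = new BitSet(bytes.length * Byte.SIZE);
        for (int i = 0; i < bytes.length; i++) {
            for (int j = 0; j < Byte.SIZE; j++) {
                // test bits from the high end (0x80) down to the low end (0x01)
                if ((bytes[i] & (0x80 >> j)) != 0) {
                    result.set(i * Byte.SIZE + j);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 10000000 00000001 00000000, the same pattern as in the experiment above
        BitSet bitSet = fromBytesMsbFirst(new byte[] {(byte) 0b10000000, (byte) 0b00000001, 0});
        System.out.println(bitSet.get(0));        // prints true
        System.out.println(bitSet.get(15));       // prints true
        System.out.println(bitSet.cardinality()); // prints 2
    }
}
```

With this construction the result equals the bit set built with set(0) and set(15), so reading it in index order gives 100000000000000100000000. The intended length still has to be tracked separately.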

Original Post

I've spent the last several days debugging some issues related to bitsets so I thought I'd share.

This code demonstrates three surprising behaviors of BitSet. They arise because its internal storage layout is not what you might assume at first glance:

import java.util.Arrays;
import java.util.BitSet;

public class BitSetExperiments {

  // this class seeks to demonstrate some of the pitfalls of using the BitSet object

  public static void main(String[] args) {
    byte[] bytes = new byte[] {-128, 1, 0}; // 10000000 00000001 00000000 as binary
    BitSet bitSet = BitSet.valueOf(bytes);

    // unexpected behavior demonstrated here
    System.out.println("problem 1: BitSet.toByteArray() produces unexpected results (trailing 0 bytes are not captured):");
    System.out.println("       expected byte array : [-128, 1, 0]");
    System.out.println("actual bitset.toByteArray(): " + Arrays.toString(bitSet.toByteArray()));
    System.out.println();

    System.out.println("problem 2: BitSet stores the bits of a byte in the wrong order:");
    System.out.println("expected binary: 100000000000000100000000");
    System.out.print("actual binary:   ");
    for (int i = 0; i < bytes.length * 8; i++) {
      System.out.print(bitSet.get(i) ? 1 : 0);
    }
    System.out.println("\n");

    System.out.println("problem 3: bitSet cannot accurately tell you the length of bytes or bits stored");
    System.out.println("expected bit/byte length is: 24 bits or 3 bytes");
    System.out.println("bitSet.size(): " + bitSet.size());
    System.out.println("bitSet.length(): " + bitSet.length());
    System.out.println("bitSet.cardinality(): " + bitSet.cardinality());
    System.out.println();
    
  }
}

Running this code shows:

problem 1: BitSet.toByteArray() produces unexpected results (trailing 0 bytes are not captured):
       expected byte array : [-128, 1, 0]
actual bitset.toByteArray(): [-128, 1]

problem 2: BitSet stores the bits of a byte in the wrong order:
expected binary: 100000000000000100000000
actual binary:   000000011000000000000000

problem 3: bitSet cannot accurately tell you the length of bytes or bits stored
expected bit/byte length is: 24 bits or 3 bytes
bitSet.size(): 64
bitSet.length(): 9
bitSet.cardinality(): 2

OK, so knowing these three unexpected behaviors, we can code around them. The important thing to realize is that you don't know how many trailing zero bytes belong to your bit set, and the bit set won't tell you. This means that when you want to convert the bit set to binary, you need to be explicit about how many bytes you expect.

Here's what I came up with:

  public static String convertBitSetToBinaryString(BitSet bitSet, int lengthInBytes) {
    if(bitSet == null) {
      return null;
    }
    if(lengthInBytes < 0) {
      return "";
    }
    StringBuilder binary = new StringBuilder();
    // compute each byte, and then reverse them before appending to the running binary
    for(int i = 0; i < lengthInBytes; i++) {
      StringBuilder backwardsByte = new StringBuilder();
      for(int j = 0; j < Byte.SIZE; j++) {
        backwardsByte.append(bitSet.get((i*Byte.SIZE)+j) ? "1" : "0");
      }
      binary.append(StringUtils.reverse(backwardsByte.toString()));
    }
    return binary.toString();
  }

Note: this uses StringUtils.reverse() from Apache Commons Lang.

And here is the test to make sure it actually works:

void testConvertBitSetToBinaryString() {
  assertNull(convertBitSetToBinaryString(null, 0));
  assertNull(convertBitSetToBinaryString(null, 1));
  assertEquals("00000000", convertBitSetToBinaryString(BitSet.valueOf(new byte[] {}), 1));
  assertEquals("00000000", convertBitSetToBinaryString(BitSet.valueOf(new byte[] {0}), 1));
  assertEquals("10000000", convertBitSetToBinaryString(BitSet.valueOf(new byte[] {-128}), 1));
  assertEquals("00000001", convertBitSetToBinaryString(BitSet.valueOf(new byte[] {1}), 1));
  assertEquals("0000000100000000", convertBitSetToBinaryString(BitSet.valueOf(new byte[] {1}), 2));
  assertEquals("0000000100000010", convertBitSetToBinaryString(BitSet.valueOf(new byte[] {1, 2}), 2));
  assertEquals("1000000010000000", convertBitSetToBinaryString(BitSet.valueOf(new byte[] {-128, -128}), 2));
  assertEquals(
      "00000001 01001101 10000000 00000000".replace(" ", ""),
      convertBitSetToBinaryString(BitSet.valueOf(new byte[] {1, 77, -128, 0}), 4)
  );
}
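
If pulling in Apache Commons just for the reversal is not desirable, the same bit order can be produced by walking each byte from its highest index down to its lowest. This is a minimal sketch using only the JDK; the class BitSetBinary and the method toBinaryStringMsbFirst are hypothetical names, and the null and negative-length handling simply mirrors convertBitSetToBinaryString above.

```java
import java.util.BitSet;

class BitSetBinary {

    // Sketch of the same conversion without StringUtils: each byte is
    // emitted most significant bit first, so no reversal step is needed.
    static String toBinaryStringMsbFirst(BitSet bitSet, int lengthInBytes) {
        if (bitSet == null) {
            return null;
        }
        if (lengthInBytes < 0) {
            return "";
        }
        StringBuilder binary = new StringBuilder(lengthInBytes * Byte.SIZE);
        for (int i = 0; i < lengthInBytes; i++) {
            // walk this byte from its highest bit index down to its lowest
            for (int j = Byte.SIZE - 1; j >= 0; j--) {
                binary.append(bitSet.get(i * Byte.SIZE + j) ? '1' : '0');
            }
        }
        return binary.toString();
    }

    public static void main(String[] args) {
        BitSet bitSet = BitSet.valueOf(new byte[] {1, 77, -128, 0});
        System.out.println(toBinaryStringMsbFirst(bitSet, 4));
        // prints 00000001010011011000000000000000
    }
}
```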
JVM