
Consider the following code (Java 8):

@Test
public void testBigIntegerVsBitSet() throws Throwable
{
    String bitString529 = "00000010 00010001";        // <- value = 529 (LittleEndian)
    byte[] arr529 = new byte[] { 0x02, 0x11 };        // <- the same as byte array (LittleEndian)
    BigInteger bigIntByString = new BigInteger( bitString529.replace( " ", ""), 2); // throws if there is a blank!
    BigInteger bigIntByArr = new BigInteger( arr529);
    BitSet bitsetByArr = BitSet.valueOf( arr529);  // interprets bit-order as LittleEndian, but byte-order as BigEndian !!!

    System.out.println( "bitString529     : " + bitString529);              // bitString529     : 00000010 00010001
    System.out.println( "arr529.toString  : " + Arrays.toString( arr529));  // arr529.toString  : [2, 17]
    System.out.println( "bigIntByString   : " + bigIntByString);            // bigIntByString   : 529
    System.out.println( "bigIntByArr      : " + bigIntByArr);               // bigIntByArr      : 529
    System.out.println( "bitsetByArr      : " + bitsetByArr.toString() );   // bitsetByArr      : {1, 8, 12}
    System.out.println( "expecting        : {0, 4, 9}");                    // expecting        : {0, 4, 9}

    String bigIntByStringStr = toBitString( bigIntByString::testBit);
    String bigIntByArrStr = toBitString( bigIntByArr::testBit);
    String bitsetByArrStr = toBitString( bitsetByArr::get);

    System.out.println( "bigIntByStringStr: " + bigIntByStringStr);         // bigIntByStringStr: 1000100001000000
    System.out.println( "bigIntByArrStr   : " + bigIntByArrStr);            // bigIntByArrStr   : 1000100001000000
    System.out.println( "bitsetByArrStr   : " + bitsetByArrStr );           // bitsetByArrStr   : 0100000010001000
}

private String toBitString( Function<Integer, Boolean> aBitTester)
{
    StringBuilder sb =  new StringBuilder();
    for ( int i = 0; i < 16; i++ )
    {
        sb.append( aBitTester.apply( i) ? "1" : "0");
    }
    return sb.toString();
}

This shows that BitSet parses byte arrays as BIG_ENDIAN, whereas it interprets the bit order (within a single byte) as LITTLE_ENDIAN. BigInteger, in contrast, interprets both as LITTLE_ENDIAN, even when loaded from a bit string.

In particular, iterating over the bit indices of the two classes (BigInteger::testBit vs. BitSet::get) delivers different results.

Is there a reason for this inconsistency?

Heri

1 Answer


Endianness mostly refers only to the ordering of bytes, not to the ordering of the individual bits within a byte. The latter is irrelevant in most applications because, for example, individual bits cannot be addressed in memory. Bit endianness therefore matters only in special cases, such as serial data buses; otherwise bytes are typically treated as the numbers they represent, without any endianness (cf. Wikipedia and the answers to "Can endianness refer to bits order in a byte?").
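To illustrate that endianness is about the order of whole bytes, here is a minimal sketch using `java.nio.ByteBuffer`. The class name `ByteOrderDemo` and the helper `encodeShort` are made up for this illustration; only the byte order changes between the two calls, while each byte keeps its numeric value.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class ByteOrderDemo {
    // Hypothetical helper: encodes a 16-bit value into two bytes using the given byte order.
    public static byte[] encodeShort(short value, ByteOrder order) {
        return ByteBuffer.allocate(2).order(order).putShort(value).array();
    }

    public static void main(String[] args) {
        short value = 529; // 0x0211
        // Most significant byte first.
        System.out.println(Arrays.toString(encodeShort(value, ByteOrder.BIG_ENDIAN)));    // prints [2, 17]
        // Least significant byte first; the bytes themselves are unchanged.
        System.out.println(Arrays.toString(encodeShort(value, ByteOrder.LITTLE_ENDIAN))); // prints [17, 2]
    }
}
```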

Because of this, BitSet treats each byte as having its least-significant bit first, so that when you give it the byte 0x01 you get the expected result of the lowest bit being set, regardless of which endianness it uses for the byte order. This is why your outputs for BigInteger and BitSet differ only in the order of the bytes.
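The per-bit rule can be checked directly. The following sketch compares `BitSet.get(n)` with the expression given in the Javadoc of `BitSet.valueOf(byte[])`; the class name `BitSetBitOrder` and the helper `bitFromBytes` are hypothetical names chosen for this example.

```java
import java.util.BitSet;

public class BitSetBitOrder {
    // Hypothetical helper: reads bit n straight from the byte array.
    // Byte n/8 is selected, and within it bit n%8 counted from the least significant bit.
    public static boolean bitFromBytes(byte[] bytes, int n) {
        return (bytes[n / 8] & (1 << (n % 8))) != 0;
    }

    public static void main(String[] args) {
        byte[] bytes = { 0x02, 0x11 };
        BitSet bits = BitSet.valueOf(bytes);
        for (int n = 0; n < bytes.length * 8; n++) {
            if (bits.get(n) != bitFromBytes(bytes, n)) {
                throw new AssertionError("mismatch at bit " + n);
            }
        }
        System.out.println(BitSet.valueOf(new byte[] { 0x01 })); // prints {0}
        System.out.println(bits);                                // prints {1, 8, 12}
    }
}
```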

Note that for the order of bytes, BitSet uses little-endian order and BigInteger uses big-endian order (the opposite of what you claim).
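If both classes should agree on bit indices, one possible workaround is to reverse the byte array before handing it to `BitSet.valueOf`. This is only a sketch: the class `AlignByteOrder` and its helpers `reversed` and `bitSetFromBigEndian` are made-up names, and the example assumes a non-negative value, since `BigInteger(byte[])` interprets the array as a signed two's-complement number.

```java
import java.math.BigInteger;
import java.util.BitSet;

public class AlignByteOrder {
    // Hypothetical helper: returns a reversed copy of the given array.
    public static byte[] reversed(byte[] in) {
        byte[] out = new byte[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = in[in.length - 1 - i];
        }
        return out;
    }

    // Hypothetical helper: builds a BitSet from a big-endian array, so that its
    // bit indices line up with BigInteger.testBit for the same array.
    public static BitSet bitSetFromBigEndian(byte[] bigEndian) {
        return BitSet.valueOf(reversed(bigEndian));
    }

    public static void main(String[] args) {
        byte[] arr529 = { 0x02, 0x11 };
        BigInteger big = new BigInteger(arr529);
        BitSet bits = bitSetFromBigEndian(arr529);
        System.out.println(big);  // prints 529
        System.out.println(bits); // prints {0, 4, 9}
    }
}
```

With the reversed array, `bits.get(i)` and `big.testBit(i)` return the same value for every index within the 16 bits of the example.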

As to why BitSet uses a different endianness than BigInteger, we can only speculate. Note that the respective BitSet methods are much newer (introduced only in Java 1.7), so perhaps the perceived importance of little- vs. big-endianness changed after BigInteger was introduced.

Philipp Wendler
  • Sorry, I am always confused by the term BIG_ENDIAN. My statement above was based on the idea that the most significant byte comes at the end. Obviously the common convention is the opposite, and you are right. – Heri Apr 03 '17 at 09:37
  • But my question was: Is there a good reason why BitSet operates the other way round? – Heri Apr 03 '17 at 09:38
  • And yes, the bit ordering does matter in some cases, e.g. encoding an RSA modulus in a base64 string (splitting the bitstream into 6-bit words) or QR code encoding (splitting the bitstream into 11-bit words). – Heri Apr 03 '17 at 09:49
  • @Heri Of course there are applications where bit order is important (I didn't claim otherwise), but they are relatively rare. I added a paragraph about why BitSet could use a different endianness than BigInteger. – Philipp Wendler Apr 03 '17 at 13:49