How to fill high-end bits in a Java byte with '1' without knowing the last 1 in advance? (FAST FIX Negative Integer decoder)

Question

I am writing a FIX/FAST decoder for negative numbers as described below:

My question is:

How do I fill the high-end bits of a Java byte with 1s as described above? I am probably unaware of some bit manipulation magic I need for this conversion.

So I need to go from 01000110 00111010 01011101 to 11110001 10011101 01011101.

I know how to shift by 7 to drop the 8th bit. What I don't know is how to fill the high-end bits with 1s.

Mike Strobel · Accepted Answer · 2014-09-05T18:31:23.723

It seems like the question you're asking doesn't really match up with the problem you're trying to solve. You're not trying to fill in the high bits with 1; you're trying to decode a stop-bit-encoded integer from a buffer, which involves discarding the sign bits while combining the payload bits. And, of course, you want to stop after you find a byte with a 1 in the stop bit position. The method below should decode the value correctly:

private static final byte SIGN_BIT = (byte)0x40;
private static final byte STOP_BIT = (byte)0x80;
private static final byte PAYLOAD_MASK = 0x7F;

public static int decodeInt(final ByteBuffer buffer) {
    int value = 0;
    int currentByte = buffer.get();

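    // A set sign bit in the first byte means a negative value: start from
    // all 1s so the shifts below leave the high-end bits filled with 1s.
    if ((currentByte & SIGN_BIT) > 0)
        value = -1;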

    value = (value << 7) | (currentByte & PAYLOAD_MASK);
    if ((currentByte & STOP_BIT) != 0)
        return value;

    currentByte = buffer.get();
    value = (value << 7) | (currentByte & PAYLOAD_MASK);
    if ((currentByte & STOP_BIT) != 0)
        return value;

    currentByte = buffer.get();
    value = (value << 7) | (currentByte & PAYLOAD_MASK);
    if ((currentByte & STOP_BIT) != 0)
        return value;

    currentByte = buffer.get();
    value = (value << 7) | (currentByte & PAYLOAD_MASK);
    if ((currentByte & STOP_BIT) != 0)
        return value;

    currentByte = buffer.get();
    value = (value << 7) | (currentByte & PAYLOAD_MASK);
    return value;
}

A loop would be cleaner, but I unrolled it manually since messaging protocols tend to be hot code paths, and there's a fixed maximum byte length (5 bytes). For simplicity's sake, I read the bytes from a ByteBuffer, so you may need to adjust the logic based on how you're reading the encoded data.
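For comparison, a looping equivalent might look like the sketch below. It is not the production code mentioned above and makes no performance claim. The class name `StopBitLoopSketch` and the `MAX_BYTES` constant are made up for illustration, and the constants are declared as `int` rather than `byte` purely for readability.

```java
import java.nio.ByteBuffer;

final class StopBitLoopSketch {
    private static final int SIGN_BIT = 0x40;
    private static final int STOP_BIT = 0x80;
    private static final int PAYLOAD_MASK = 0x7F;
    private static final int MAX_BYTES = 5; // same 5-byte cap as the unrolled version

    static int decodeInt(final ByteBuffer buffer) {
        int currentByte = buffer.get();
        // Seed with all 1s when the sign bit of the first byte is set, so the
        // high-end bits stay filled with 1s as payload bits are shifted in.
        int value = (currentByte & SIGN_BIT) != 0 ? -1 : 0;
        for (int bytesRead = 1; ; bytesRead++) {
            value = (value << 7) | (currentByte & PAYLOAD_MASK);
            // Stop on the stop bit, or after the fifth byte like the unrolled code.
            if ((currentByte & STOP_BIT) != 0 || bytesRead == MAX_BYTES) {
                return value;
            }
            currentByte = buffer.get();
        }
    }

    public static void main(String[] args) {
        // Bytes from the question, with the stop bit set on the last byte (0x5D | 0x80).
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] { 0x46, 0x3A, (byte) 0xDD });
        System.out.println(Integer.toHexString(decodeInt(buffer))); // fff19d5d
    }
}
```

The low three bytes of the result, F1 9D 5D, match the target pattern in the question.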

That's great, but just curious: why do you think a loop would be worse? You are repeating the `currentByte & PAYLOAD_MASK` every time when it only needs to be done at the very last byte. A loop would allow you to detect that... — chrisapotek, Sep 05 '14 at 18:54
This code comes from a production system and underwent extensive performance testing. Several variations were tested, and this one consistently yielded the best performance. Branching instructions tend to be expensive, while bit manipulation is cheap, so better to avoid conditionals. — Mike Strobel, Sep 05 '14 at 19:01
I do believe you! But that does not make much sense because you are checking the condition `(currentByte & STOP_BIT) != 0` multiple times like a loop would do. So you are saying that the loop itself is more expensive? (again just trying to make sense of what you claim) — chrisapotek, Sep 05 '14 at 19:08
I'm saying that the looping versions consistently performed worse *for us*. That doesn't mean much. Actual performance depends on many factors, including hardware architecture and inlining (which can depend on your most common call stacks). Feel free to use a looping version if that's what you prefer. It was easier to post this version since I had the source handy. — Mike Strobel, Sep 05 '14 at 19:16

score 1 · Answer 2 · answered Sep 05 '14 at 16:01

Filling the high bits might go like this:

int fillHighBits(int b) {             // 0001abcd
    int n = Integer.highestOneBit(b); // 00010000
    n = ~n;                           // 11101111
    ++n;                              // 11110000
    return (n | b) & 0xFF;            // 1111abcd
}

As an expression:

(~Integer.highestOneBit(b) + 1) | b

Though the examples you gave make me doubt this is what you want.
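As a rough check against the question's data, here is a minimal sketch that applies the same idea to a full `int` after the three 7-bit payloads have been concatenated. It assumes the value is negative, so the highest 1 in the combined payload is the sign bit. The class and method names are invented for the example, and `-x` is used as shorthand for `~x + 1`.

```java
final class FillHighBitsSketch {
    // Int-wide variant of the expression above: sets every bit above the highest 1.
    static int fillAboveHighestOne(int b) {
        return -Integer.highestOneBit(b) | b; // -x equals ~x + 1 in two's complement
    }

    public static void main(String[] args) {
        // 21 payload bits from the question: 1000110 0111010 1011101
        int payload = (0x46 << 14) | (0x3A << 7) | 0x5D;
        System.out.println(Integer.toHexString(fillAboveHighestOne(payload))); // fff19d5d
    }
}
```

This only fills correctly when the top payload bit is 1. A positive value would need to be left alone, which is why checking the sign bit explicitly is the safer route.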

score 1 · Answer 3 · answered Sep 05 '14 at 17:04

This can be done with a simple accumulator into which you shift 7 bits at a time. You need to keep track of how many bits the accumulator holds.

Sign extension can then be performed by a logical shift left followed by an arithmetic shift right (by the same distance), which copies the topmost bit into all unused positions.

byte[] input = new byte[] { 0x46, 0x3A, (byte) 0xDD };
int accumulator = 0;
int bitCount = 0;
for (byte b : input) {
    accumulator = (accumulator << 7) | (b & 0x7F);
    bitCount += 7;
}
// now sign extend the bits in accumulator
accumulator <<= (32 - bitCount);
accumulator >>= (32 - bitCount);
System.out.println(Integer.toHexString(accumulator));

The whole trick is that the `>>` operator, when shifting by N, replicates the top bit N times.
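Wrapped into a method, the same idea could look like the sketch below. The stop-bit check and the guard for five-byte input are additions not present in the snippet above, and the names `SignExtendSketch` and `decode` are made up. Without the guard, 35 accumulated bits would give a negative shift distance, which Java reduces modulo 32 and which would corrupt the result.

```java
final class SignExtendSketch {
    static int decode(byte[] input) {
        int accumulator = 0;
        int bitCount = 0;
        for (byte b : input) {
            accumulator = (accumulator << 7) | (b & 0x7F);
            bitCount += 7;
            if ((b & 0x80) != 0) {
                break; // stop bit set: this was the last byte of the field
            }
        }
        int unused = Integer.SIZE - bitCount;
        if (unused <= 0) {
            return accumulator; // five bytes already fill the int, nothing to extend
        }
        // Move the sign bit to bit 31, then let >> copy it back down.
        return (accumulator << unused) >> unused;
    }

    public static void main(String[] args) {
        byte[] input = { 0x46, 0x3A, (byte) 0xDD };
        System.out.println(Integer.toHexString(decode(input))); // fff19d5d
    }
}
```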

@chrisapotek Uhm 32 happens to be the number of bits an int contains. You could write Integer.SIZE to make it clear beyond doubt that the number of bits in the int type is meant. Writing just 32 is a (bad, but overwhelmingly widespread) habit from the past. — Durandal, Sep 05 '14 at 17:51
I am sorry. (Shamed!) An integer bigger than 32 bits will be hard to find :) — chrisapotek, Sep 05 '14 at 17:54

score 0 · Answer 4 · answered Sep 05 '14 at 15:07

Do a bitwise OR (`|`) with a mask whose high-end bits are set to 1 and whose remaining bits are 0.

For example:

   1010101010101010
OR 1111111100000000
-------------------
   1111111110101010
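In Java, the same operation on these two bit patterns could be sketched as follows. The class and method names are placeholders, and the mask is hard-coded, so this only helps once you already know where the high-end bits start.

```java
final class FixedMaskSketch {
    // ORs the value with a caller-supplied mask; bits set in the mask become 1.
    static int fillWithMask(int value, int mask) {
        return value | mask;
    }

    public static void main(String[] args) {
        int value = 0b1010101010101010;
        int mask  = 0b1111111100000000;
        System.out.println(Integer.toBinaryString(fillWithMask(value, mask))); // 1111111110101010
    }
}
```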

jmj

I don't know what the last bit is, that's my problem, no? How do I find out what to mask? The last 1 can be anywhere! – chrisapotek Sep 05 '14 at 15:18
well then iterate through bit sequence `Integer.toBinaryString(num)` and generate mask – jmj Sep 05 '14 at 15:48

score 0 · Answer 5 · answered Sep 05 '14 at 15:09

Something like this:

int x = ...; x = x | 0xF000;

ControlAltDel

