57

I have a legacy application that takes an integer, converts it to a binary string, reverses that string, and then gets the positions of bits (ones) as a list of integers. For example:

6 -> "110" -> "011" -> (2,3) 
7 -> "111" -> "111" -> (1,2,3)
8 -> "1000" -> "0001" -> (4)

What is a succinct and clear way to accomplish this in modern Java without the String operations? The conversion to and from String seems wasteful to me, and I know there's no simple way to flip a String (no String.reverse()) anyway.

WJS
  • 36,363
  • 4
  • 24
  • 39
workerjoe
  • 2,421
  • 1
  • 26
  • 49
  • 9
    I seriously ask myself: what for? All solutions will somehow check if there is a one and then store the index in some way. Maybe the String conversion is not needed, but by today's standards, I doubt that it matters much regarding performance. – maio290 May 11 '20 at 18:36
  • 6
    Note: `new StringBuilder(s).reverse().toString()` can be used to reverse string s. – David Conrad May 12 '20 at 13:37
  • 4
    @maio290 Using a string *at all* here is conceptually wonky. Performance has little to do with this (though it’s a consequence, and if this is a frequent operation, performance may actually become relevant). – Konrad Rudolph May 13 '20 at 09:00
  • @KonradRudolph I agree that it might not be a good concept. But even if the method is called a dozen times, the performance impact will still be negligible, I guess. OP didn't state anything about any issues other than coding style. And thus I think a refactoring is a bit overkill. But that's opinion based. – maio290 May 13 '20 at 11:54
  • 5
    What is the usecase for that? I can't think of an actual use for that. – inetphantom May 13 '20 at 11:58
  • 2
    @inetphantom As I said, legacy system. This is actually a very idiosyncratic "foreign key" in a database. The integer encodes a set of ID numbers to look up in a database table. – workerjoe May 13 '20 at 12:20
  • @KonradRudolph I disagree completely. An `int` conceptually is not a container that can be indexed or ordered, a `string` (or char array) is. The only reason to do bit twiddling here is if there's a measured performance benefit that you're willing to trade away readability and formal consistency to achieve. – mintchkin Jun 01 '20 at 22:28
  • @mintchkin A string, despite its name, is a *text* storage type. Use it for one thing, and one thing only: text. Due to their purpose (= text!), strings have all these associated complications like encodings etc. Also, you’re simply wrong about integers not being a “container” of bits: thinking of numbers as bit vectors is a widely established pattern. By all means encapsulate this in a dedicated type (similar to `EnumSet`), if you feel more comfortable. Just realise that `String` is the *wrong type*. – Konrad Rudolph Jun 02 '20 at 08:51
  • @KonradRudolph "thinking of numbers as bit vectors" is exactly as reasonable as "thinking of strings as bit vectors", the point is that the interface required here is a bit vector. If you want to argue that `String` is the wrong type, fine, but an `int` is the wrong type by the same measure, and in fact supports far fewer of the required "vector" operations than a string does. Taking performance out of the equation and speaking simply in terms of types which implement the required interface, you're much closer with a string than you are with an int. – mintchkin Jun 02 '20 at 14:25
  • @mintchkin But integers *have* a bit vector interface. What do you think bit operations are? And nothing prevents you from creating a more user-friendly interface. Using strings requires *conceptually* many more hoops because before you can address the bits (as characters in some encoding) you first need to *convert* the integer into a string representation in binary base using a nontrivial algorithm. You just brush aside these complexities to argue that, *once most of the conceptual work is already done*, something is easier. You can’t just skip steps 1–99. – Konrad Rudolph Jun 02 '20 at 14:32

13 Answers

58

Just check the bits in turn:

List<Integer> bits(int num) {
  List<Integer> setBits = new ArrayList<>();
  for (int i = 1; num != 0; ++i, num >>>= 1) {
    if ((num & 1) != 0) setBits.add(i);
  }
  return setBits;
}

Online Demo

6 [2, 3]
7 [1, 2, 3]
8 [4]
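
The unsigned shift `>>>` matters for negative inputs: it fills the vacated high bit with zero, so `num` eventually reaches 0, whereas a signed shift `>>` would keep copying the sign bit and the loop would never end. A self-contained sketch of the same loop, assuming position 1 is the least significant bit (the class name `SetBitsDemo` is made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

class SetBitsDemo {
    // Same loop as above: position 1 is the least significant bit.
    static List<Integer> bits(int num) {
        List<Integer> setBits = new ArrayList<>();
        for (int i = 1; num != 0; ++i, num >>>= 1) {
            if ((num & 1) != 0) setBits.add(i);
        }
        return setBits;
    }

    public static void main(String[] args) {
        System.out.println(bits(6));                 // [2, 3]
        System.out.println(bits(Integer.MIN_VALUE)); // [32]
        System.out.println(bits(-1).size());         // 32
    }
}
```

Negative values therefore report position 32 for the sign bit, like any other set bit.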
Andy Turner
  • 137,514
  • 11
  • 162
  • 243
  • Looks good. I never use these binary/bitwise operations so I can never remember them. – workerjoe May 11 '20 at 18:49
  • 5
    Obviously :-) Doing this with strings is just awful. – TonyK May 13 '20 at 01:00
  • Lots of great answers to this question! I thought the iteration in this one, with the shift-right operator, was quite clever. Also it was one of the earliest answers. – workerjoe May 13 '20 at 15:51
  • 2
    @workerjoe: You picked right; this one should JIT-compile most efficiently (except for [@Matthieu M.'s answer](https://stackoverflow.com/a/61754871/224132) that loops over only the set bits, especially on inputs with only a few bits set sparsely across the whole number.) This is pretty cleanly written / easy to understand using simple bitwise operations. Right-shifting gives you a cheap early-out and is often more efficient than testing `num & (1< – Peter Cordes May 16 '20 at 00:14
31

You can just test the bits without turning the integer into a string:

List<Integer> onePositions(int input) {
  List<Integer> onePositions = new ArrayList<>();
  for (int bit = 0; bit < 32; bit++) {
    if ((input & (1 << bit)) != 0) { // parentheses needed: != binds tighter than &
      onePositions.add(bit + 1); // One-based, for better or worse.
    }
  }
  return onePositions;
}

Bits are usually counted from right to left, the rightmost bit being bit 0. The operation 1 << bit gives you an int in which only the bit at index bit is set to 1 (and the rest are 0). Then use & (bitwise AND) to check whether this bit is set in the input, and if so, record the position in the output list.
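
To make the masking step concrete, here is a small sketch that prints the mask and the AND result for each of the low four bits of 6. The class and method names are illustrative only. Note that the comparison needs parentheses around the `&` expression, because `!=` has higher precedence than `&` in Java.

```java
class MaskDemo {
    // True if the zero-based bit `bit` is set in `input`.
    static boolean isSet(int input, int bit) {
        return (input & (1 << bit)) != 0;
    }

    public static void main(String[] args) {
        int input = 6; // binary 110
        for (int bit = 0; bit < 4; bit++) {
            int mask = 1 << bit; // only bit number `bit` is set
            System.out.println("bit " + bit
                    + ": mask=" + Integer.toBinaryString(mask)
                    + " input&mask=" + Integer.toBinaryString(input & mask)
                    + " set=" + isSet(input, bit));
        }
    }
}
```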

Thomas
  • 174,939
  • 50
  • 355
  • 478
  • 1
    I doubt the performance is substantially impacted by always checking all 32 bits; but you could do `maxBit = Integer.highestOneBit(input)`, and then use `bit <= maxBit` as the loop guard (and start from `bit = Integer.lowestOneBit(input)`, for that matter). – Andy Turner May 12 '20 at 12:06
  • There need to be parentheses around `input & (1 << bit)`, because `!=` has higher precedence than `&`. (which is a bug in Java in my opinion, since this has no uses). I think this answer is the best, because it is the most readable. You could pass `Integer.bitCount(input)` to the constructor of ArrayList as done in Matthieu's answer to improve performance a bit. – user42723 May 14 '20 at 11:15
  • It's generally better to right-shift the number being tested than to left-shift a `1`. First of all, it avoids any variable-count shifts which can have advantages (e.g. on modern Intel CPUs that's 3 uops instead of 1 if the JIT compiler doesn't optimize it to a `bt` instruction.) Second it gives you an early-out for free when `input >>>= 1` becomes 0. Having a right shift as part of a loop-carried dependency chain is fine; modern CPUs have single-cycle latency shifts. – Peter Cordes May 14 '20 at 18:49
  • @PeterCordes Thanks, that's deep and useful info, and the accepted answer by Andy Turner does this perfectly. My answer is probably more readable and more educational at a high level, so I'm going to leave it as is. – Thomas May 15 '20 at 08:21
26

May I propose a pure bit-wise solution?

static List<Integer> onesPositions(int input)
{
    List<Integer> result = new ArrayList<Integer>(Integer.bitCount(input));

    while (input != 0)
    {
        int one = Integer.lowestOneBit(input);
        input = input - one;
        result.add(Integer.numberOfTrailingZeros(one));
    }

    return result;
}

This solution is algorithmically optimal:

  1. Single memory allocation, using Integer.bitCount to appropriately size the ArrayList in advance.
  2. Minimum number of loop iterations, one per set bit¹.

The inner loop is rather simple:

  • Integer.lowestOneBit returns an int with only the lowest bit of the input set.
  • input - one "unsets" this bit from the input, for the next iteration.
  • Integer.numberOfTrailingZeros counts the number of trailing zeros, in binary, effectively giving us the zero-based index of the lowest 1 bit.

¹ It is notable that this may not be the fastest approach once compiled; an explicit 0..n loop based on the bitCount would be easier for the JIT to unroll.
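
As a sketch of the same idea, the variant below clears the lowest set bit with `input &= input - 1` (a common idiom) and adds 1 to each index, so the result matches the one-based positions used in the question. The class name is hypothetical and this is only one possible arrangement, not a measured improvement.

```java
import java.util.ArrayList;
import java.util.List;

class LowestBitDemo {
    static List<Integer> onesPositions(int input) {
        // Pre-size the list: exactly one entry per set bit.
        List<Integer> result = new ArrayList<>(Integer.bitCount(input));
        while (input != 0) {
            // Zero-based index of the lowest set bit, shifted to one-based.
            result.add(Integer.numberOfTrailingZeros(input) + 1);
            // Clear the lowest set bit.
            input &= input - 1;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(onesPositions(6)); // [2, 3]
        System.out.println(onesPositions(8)); // [4]
    }
}
```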

Matthieu M.
  • 287,565
  • 48
  • 449
  • 722
  • You don't need to isolate the lowest set bit before CTZ. Instead you can `add CTZ(input)` / `blsr input`, i.e. clear the lowest set bit using `input &= input - 1;`. Even if a JITter doesn't use x86 BMI2 [`blsr`](https://www.felixcloutier.com/x86/blsr), it's easy to implement with a couple instructions. Or 3 if JIT still fails to use LEA peephole optimizations, and the final AND leaves FLAGS set according to `input`, saving a cmp/test for the loop branch. Unless HotSpot JIT screws that up, too. Hopefully HotSpot knows that `tzcnt` or `bsf` have an output dependency... – Peter Cordes May 14 '20 at 18:52
  • Correction, `blsr` is BMI1, not BMI2. – Peter Cordes May 14 '20 at 19:07
  • 1
    @PeterCordes `Integer.lowestOneBit(…)` has been implemented as `return i & -i;`. Whether the JIT understands the idiom or not, calling this method will never be worse than doing `input&=input-1;` manually. – Holger Jul 07 '20 at 10:30
  • @Holger: That's true on its own, but my suggestion *also* saves the `input = input - one;`. This shortens the latency of the loop-carried dependency chain. (to 1 cycle instead of 2 with good JIT using BMI1 `blsr` instead of `blsi` / `sub`, or just by 1 if the JIT isn't that smart) . – Peter Cordes Jul 07 '20 at 10:41
  • @Holger: Also, doing `i & -i;` could easily be worse than `i & (i-1)` on x86-64, depending on the JIT compiler. `i-1` can be computed into a separate register with `lea eax, [rdi-1]` for example, but computing `-i` without destroying the original `i` generally requires `mov`+`neg`, or xor-zeroing a register to `sub` from. (x86-64 doesn't have any addressing modes that subtract a register from anything). So without BMI1 peepholes (and it being available at all), `i & (i-1)` could be cheaper to compute for a compiler that can use LEA. (IDK if current JVMs do or not.) – Peter Cordes Jul 07 '20 at 10:46
22

Since you wrote "modern Java", this is how it can be done with streams (Java 8 or later):

final int num = 7;

// range's upper bound is exclusive, so 32 is needed to cover bits 0..31.
List<Integer> digits = IntStream.range(0, 32).filter(i -> (num & (1 << i)) != 0)
        .map(i -> i + 1).boxed().collect(Collectors.toList());

The map is only needed since you start counting at 1 and not at 0.

Then

System.out.println(digits);

prints

[1, 2, 3]
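
One possible way to package this as a reusable method is to iterate over the one-based positions directly with `rangeClosed`, which makes the extra `map` step unnecessary. This is a sketch only; the class and method names are invented for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class StreamBitsDemo {
    static List<Integer> positions(int num) {
        // Positions run from 1 to 32; position p corresponds to bit p - 1.
        return IntStream.rangeClosed(1, Integer.SIZE)
                .filter(p -> (num & (1 << (p - 1))) != 0)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(positions(7)); // [1, 2, 3]
    }
}
```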
Laurel
  • 5,965
  • 14
  • 31
  • 57
Christian Fries
  • 16,175
  • 10
  • 56
  • 67
16

I would definitely prefer Andy's answer myself, even if it seems cryptic at first. But since no one here has an answer with streams yet (even if they are totally out of place here):

public List<Integer> getList(int x) {
    String str = Integer.toBinaryString(x);
    final String reversed = new StringBuilder(str).reverse().toString();
    return IntStream.range(1, str.length()+1)
            .filter(i -> reversed.charAt(i-1)=='1')
            .boxed()
            .collect(Collectors.toList());
}
Eritrean
  • 15,851
  • 3
  • 22
  • 28
12

A silly answer, just for variety:

BitSet bs = BitSet.valueOf(new long[] {0xFFFFFFFFL & input});
List<Integer> setBits = new ArrayList<>();
for (int next = -1; (next = bs.nextSetBit(next + 1)) != -1;) {
  setBits.add(next + 1);
}

(Thanks to pero_hero for pointing out the masking was necessary on WJS's answer)
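
The mask matters because widening a negative `int` to `long` sign-extends it, which would set bits 32 to 63 as well. A small sketch, with made-up names, that contrasts the two by counting the set bits in the resulting `BitSet`:

```java
import java.util.BitSet;

class MaskingDemo {
    // Without the mask, a negative int is sign-extended to 64 bits.
    static int cardinalityWithoutMask(int input) {
        return BitSet.valueOf(new long[] {input}).cardinality();
    }

    // With the mask, only the low 32 bits of the int survive.
    static int cardinalityWithMask(int input) {
        return BitSet.valueOf(new long[] {0xFFFFFFFFL & input}).cardinality();
    }

    public static void main(String[] args) {
        System.out.println(cardinalityWithoutMask(-1)); // 64
        System.out.println(cardinalityWithMask(-1));    // 32
    }
}
```

For non-negative inputs the two versions agree, so the difference only shows up when the sign bit is set.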

Andy Turner
  • 137,514
  • 11
  • 162
  • 243
11

Given the original integer, this returns a list of the bit positions.

static List<Integer> bitPositions(int v) {
     return BitSet.valueOf(new long[]{v&0xFF_FF_FF_FFL})
                .stream()
                .mapToObj(b->b+1)
                .collect(Collectors.toList());
}

Or if you want to do bit shifting.

static List<Integer> bitPositions(int v ) {
    List<Integer> bits  = new ArrayList<>();
    int pos = 1;
    while (v != 0) {
        if ((v & 1) == 1) {
            bits.add(pos);
        }
        pos++;
        v >>>= 1;
    }
    return bits;

}

WJS
  • 36,363
  • 4
  • 24
  • 39
9

You don't need to reverse the actual binary string. You can just calculate the index.

String str = Integer.toBinaryString(num);
int len = str.length();
List<Integer> list = new ArrayList<>();
// Walk from the rightmost character so positions come out one-based and ascending.
for (int i = len - 1; i >= 0; i--) {
  if (str.charAt(i) == '1') list.add(len - i);
}
user
  • 7,435
  • 3
  • 14
  • 44
7

Just for fun:

Pattern one = Pattern.compile("1");
List<Integer> collect = one.matcher(
             new StringBuilder(Integer.toBinaryString(value)).reverse())
            .results()
            .map(m -> m.start() + 1)
            .collect(Collectors.toList());
System.out.println(collect);
pero_hero
  • 2,881
  • 3
  • 10
  • 24
7

A stream version of @Matthieu M.'s answer:

 List<Integer> list = IntStream.iterate(value, (v) -> v != 0, (v) -> v & (v - 1))
                .mapToObj(val -> Integer.numberOfTrailingZeros(val) + 1)
                .collect(toList());
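
For reference, a self-contained sketch of the same pipeline with its imports spelled out. The three-argument `IntStream.iterate` used here requires Java 9 or later; the class and method names are assumptions for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class IterateDemo {
    static List<Integer> positions(int value) {
        // Each step clears the lowest set bit; the stream stops when none remain.
        return IntStream.iterate(value, v -> v != 0, v -> v & (v - 1))
                // The trailing-zero count is the zero-based index of the lowest set bit.
                .mapToObj(v -> Integer.numberOfTrailingZeros(v) + 1)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(positions(6)); // [2, 3]
    }
}
```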
pero_hero
  • 2,881
  • 3
  • 10
  • 24
  • 1
    A simpler idiom for clearing the lowest set bit is `v & (v-1)`. An easy place to look up the bithacks for isolating or resetting the lowest set bit is docs for the x86 assembly BMI instructions [`blsr`](https://www.felixcloutier.com/x86/blsr#operation) and [`blsi`](https://www.felixcloutier.com/x86/blsi#operation). I frequently double-check those instead of memorizing the actual bithack formula. – Peter Cordes May 14 '20 at 19:06
  • @PeterCordes Thanks for the hint. I was trying to come up with a simpler form, but I have not been able to deliver. Now that I know, it is rather simple. – pero_hero May 14 '20 at 20:22
6

You can use this solution:

    static List<Integer> convert(int input) {
        List<Integer> list = new ArrayList<>();
        int counter = 1;
        long num = input & 0xFFFFFFFFL; // treat the int as unsigned so bit 32 is kept for negative inputs
        while (num > 0) {
            if (num % 2 != 0) {
                list.add(counter);
            }
            ++counter;
            num /= 2;
        }
        return list;
    }

It outputs:

[2, 3]
[1, 2, 3]
[4]
0xh3xa
  • 4,801
  • 2
  • 14
  • 28
  • 1
    There's a couple of bugs in here. Try -1 as an input. – Andy Turner May 11 '20 at 18:56
  • What do you get when you try 1? I got 1. – 0xh3xa May 11 '20 at 19:01
  • Use `input & 1` if you want to test a bit. `-1 % 2` is `-1` in Java, C, and C++. – Peter Cordes May 14 '20 at 19:00
  • @PeterCordes Good observation. I have addressed that, thanks! – 0xh3xa May 14 '20 at 19:35
  • This still looks wrong. Now you're toggling the high bit for negative inputs but not adding `32` to the list. There's no reason to special-case that bit in the first place, just use logical right shift `>>>` to do unsigned division by 2. (i.e. use [@Andy's answer](https://stackoverflow.com/questions/61736649/in-java-how-to-get-positions-of-ones-in-reversed-binary-form-of-an-integer/61736789#61736789)) – Peter Cordes May 14 '20 at 19:44
6

Or, if you want:

String strValue = Integer.toBinaryString(value);
List<Integer> collect2 = strValue.codePoints()
           .collect(ArrayList<Integer>::new,
                   (l, v) -> l.add(v == '1' ? strValue.length() - l.size() : -1), 
                   (l1, l2) -> l1.addAll(l2)).stream()
           .filter(e -> e >= 0)
           .sorted()
           .collect(toList());
pero_hero
  • 2,881
  • 3
  • 10
  • 24
5

Just use the indexOf method of the String class:

public class TestPosition {
    public static void main(String[] args) {
        String word = "110"; // your string
        String guess = "1"; // since we are looking for 1
        int totalLength = word.length();
        int index = word.indexOf(guess);
        while (index >= 0) {
            System.out.println(totalLength - index);
            index = word.indexOf(guess, index + 1);
        }
    }
}
Ajay Kr Choudhary
  • 1,304
  • 1
  • 14
  • 23
  • The initial input is an `int`, not a String, that's why it's so inefficient to convert it to a base-2 String in the first place. – Peter Cordes May 14 '20 at 18:59