Which bit-manipulation method is more efficient in C?

Question

Based on the answers I've got, I think this question is kind of meaningless. Thanks for all your kind replies!

I want to get a binary number with its rightmost j bits set to 1 and the others set to 0. Basically, there are two methods. I want to know which of them is more efficient, or whether there is a more efficient way than these two.

1. ~(~0 << j)
2. (1 << j) - 1

If you can't test it reasonably then it probably doesn't matter anyway. (You only optimize if it makes a difference, and if it doesn't make a difference, then what are you optimizing in the first place?) — user541686, Jan 07 '11 at 03:09
Read my answer. The two are not the same, and there are very good reasons to prefer #2 over #1. — R.. GitHub STOP HELPING ICE, Jan 07 '11 at 04:34

score 3 · Accepted Answer · answered Jan 07 '11 at 03:04

Not sure if it's the answer you're looking for, but I'll bet it won't make more than a nanosecond of difference. :)

Or, to put it another way: Don't micro-optimize it unless that one-liner is the bottleneck in your code.

If you need other forms of bit manipulation where a hand-written version might actually be slow, try looking at the compiler intrinsic functions, like _BitScanForward. Those might make your bit operations faster when used correctly (but not in a situation like this).
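As a rough illustration of what using such an intrinsic could look like, here is a minimal sketch. The helper name lowest_set_bit is made up for this example; _BitScanForward is the MSVC intrinsic mentioned above, and __builtin_ctzl is used as the GCC/Clang counterpart. The plain loop is only a fallback for other compilers.

```c
#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Hypothetical helper (not from the answer): returns the index of the
 * lowest set bit of x, or -1 if x is 0. */
static int lowest_set_bit(unsigned long x)
{
#if defined(_MSC_VER)
    unsigned long index;
    /* _BitScanForward returns 0 when no bit is set in x. */
    return _BitScanForward(&index, x) ? (int)index : -1;
#elif defined(__GNUC__)
    /* __builtin_ctzl is undefined for 0, so that case is checked first. */
    return x ? __builtin_ctzl(x) : -1;
#else
    /* Portable fallback: shift right until the low bit is 1. */
    int n;
    if (x == 0)
        return -1;
    for (n = 0; (x & 1ul) == 0; n++)
        x >>= 1;
    return n;
#endif
}
```

For example, lowest_set_bit(8ul) gives 3. Whether this beats a hand-written loop in practice depends on the compiler and target, so it is worth measuring before relying on it.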

user541686

Thank you! I was not sure whether these two methods had big differences in performance, because the latter one uses the "-" operation, which isn't a bit manipulation. So I asked this question here. – Lion Jan 07 '11 at 03:10
Haha you're welcome. :) Honestly, there are bigger things to worry about when you're coding; if you've just started programming, it's easy to fall into traps like this, particularly if you started with a lower-level language. Don't. If you're **really** worried, then get a book like [this](http://www.amazon.com/Computer-Organization-Design-Fourth-ebook/dp/B004HHOC7E) (written by our professor :] ) and start reading up on things like pipelining, locality of memory, etc. Then you'll learn how the CPU works, and you'll learn when things like this actually make a difference, and when they don't. – user541686 Jan 07 '11 at 03:12

score 2 · Answer 2 · answered Jan 07 '11 at 03:05

You are micro-optimising. Your compiler knows the best translation for those operations. Just write the one that looks the cleanest to the human eye, and move on.

Lightness Races in Orbit

score 1 · Answer 3 · answered Jan 07 '11 at 03:07

In addition to the comments already posted:

Besides benchmarking, examine the assembly that's emitted. The optimiser might have produced the same code for each.
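One way to try this is to put each expression in its own function and ask the compiler for assembly output. The sketch below assumes gcc and a file called masks.c; the file and function names are made up for illustration, and the constants are written as unsigned (0u, 1u) rather than the plain ints in the question.

```c
/* masks.c: hypothetical file name for this sketch. */

/* Method 1 from the question, with an unsigned constant. */
unsigned mask_not_shift(unsigned j)
{
    return ~(~0u << j);
}

/* Method 2 from the question, with an unsigned constant. */
unsigned mask_shift_sub(unsigned j)
{
    return (1u << j) - 1u;
}

/* To see the generated assembly with gcc:
 *     gcc -O2 -S masks.c -o masks.s
 * then compare the bodies of the two functions in masks.s. */
```

The functions are deliberately not static, so that the compiler has to emit a body for each one even though nothing in the file calls them.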

Mitch Wheat

Then I need to learn some simple assembly language instructions to figure it out. – Lion Jan 07 '11 at 03:13

score 1 · Answer 4 · answered Jan 07 '11 at 03:11

This is sort of a lazy answer, but have you tried writing a trivial program like the following? Sure it is micro-optimizing, but it might be fun and interesting to see if there is any difference.

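#include <stdio.h>
#include <time.h>   // the C header; <ctime> is C++

int main(void)
{
  int i;
  time_t start = time(NULL);   // time() requires an argument; NULL is fine here
  for (i = 0; i < 1000000; i++)
  {
    // your operation here
  }
  time_t stop = time(NULL);
  double elapsed = difftime(stop, start);   // elapsed wall-clock seconds
  printf("elapsed: %.0f s\n", elapsed);
  return 0;
}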

mtjhax

`time` normally has only second resolution, so you'd need to nest 2 for loops with 1000000 iterations each to have any hope of measuring the performance. – R.. GitHub STOP HELPING ICE Jan 07 '11 at 04:33
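Given the resolution problem with `time`, a variant based on the standard `clock()` function may be easier to work with. The following is only a sketch: the helper names (time_mask, report_timings) are made up, the masks use unsigned constants, and the call through a function pointer adds overhead that may well dominate the thing being measured.

```c
#include <stdio.h>
#include <time.h>

/* Method 1 from the question, with an unsigned constant. */
static unsigned mask_not_shift(unsigned j) { return ~(~0u << j); }

/* Method 2 from the question, with an unsigned constant. */
static unsigned mask_shift_sub(unsigned j) { return (1u << j) - 1u; }

/* Hypothetical helper: calls f `iterations` times and returns the
 * CPU time used, in seconds, as reported by clock(). */
static double time_mask(unsigned (*f)(unsigned), unsigned long iterations)
{
    volatile unsigned sink = 0;  /* volatile store keeps the loop from being removed */
    unsigned long i;
    clock_t start, stop;

    start = clock();
    for (i = 0; i < iterations; i++)
        sink = f((unsigned)(i & 15u));  /* j stays in 0..15, valid for any unsigned */
    stop = clock();

    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}

/* Prints both timings; intended to be called from main(). */
static void report_timings(unsigned long iterations)
{
    printf("~(~0u << j):   %f s\n", time_mask(mask_not_shift, iterations));
    printf("(1u << j) - 1: %f s\n", time_mask(mask_shift_sub, iterations));
}
```

Calling report_timings(100000000UL) from main() would print one line per method. If the two numbers are indistinguishable from run to run, that agrees with what the other answers predict.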

score 1 · Answer 5 · answered Jan 07 '11 at 03:16

If you really want the fastest, use a lookup table:

const unsigned numbers[] = {
        0x00000000,
        0x00000001, 0x00000003, 0x00000007, 0x0000000f,
        0x0000001f, 0x0000003f, 0x0000007f, 0x000000ff,
        0x000001ff, 0x000003ff, 0x000007ff, 0x00000fff,
        0x00001fff, 0x00003fff, 0x00007fff, 0x0000ffff,
        0x0001ffff, 0x0003ffff, 0x0007ffff, 0x000fffff,
        0x001fffff, 0x003fffff, 0x007fffff, 0x00ffffff,
        0x01ffffff, 0x03ffffff, 0x07ffffff, 0x0fffffff,
        0x1fffffff, 0x3fffffff, 0x7fffffff, 0xffffffff};

unsigned h(unsigned j) {
        return numbers[j];
}

Extending this to 64 bits is left as an exercise for the reader. And as others have said, none of this matters.

I have a hard time believing your lookup table will be faster. In most practical cases it will probably be slower due to clobbering the L1 cache with a table full of useless trivially-computable values. — R.. GitHub STOP HELPING ICE, Jan 07 '11 at 04:32

score 1 · Answer 6 · answered Jan 07 '11 at 04:31

Unless you change 0 to 0U, the expression ~(~0 << j) has implementation-specific behavior based on bit patterns. On the other hand, the expression (1 << j) - 1 is purely arithmetic and has no bit arithmetic in it, so its value is well-defined across all implementations. Therefore, I would always use the latter.
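A minimal sketch of the second form with unsigned constants might look like this. The helper name low_bits_mask is made up, and the width calculation assumes unsigned has no padding bits. The extra check is there because shifting by the full width of the type or more is undefined in C, so j equal to the width needs its own case.

```c
#include <limits.h>

/* Hypothetical helper (not from the answer): returns a value whose low
 * j bits are 1 and whose remaining bits are 0. */
static unsigned low_bits_mask(unsigned j)
{
    /* Assumption: unsigned has no padding bits, so this is its width. */
    const unsigned width = (unsigned)(sizeof(unsigned) * CHAR_BIT);

    if (j >= width)
        return ~0u;          /* all bits set; a shift by >= width would be undefined */
    return (1u << j) - 1u;   /* unsigned arithmetic, so no signed overflow */
}
```

For example, low_bits_mask(3) yields 7, and any j at or beyond the width of unsigned yields UINT_MAX.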

score 0 · Answer 7 · answered Jan 07 '11 at 03:07

The true answer is probably dependent on processor architecture. But for all intents and purposes it's almost certainly irrelevant. Also consider that your compiler may output the same assembly either way. If you truly need to know, benchmark it, although the difference will almost certainly be too small to measure (which is your answer: it doesn't matter).

score 0 · Answer 8 · answered Jan 07 '11 at 04:16

No timing experiments necessary, just check the generated machine code. You'll find that gcc compiles them both to identical machine instructions.

Billy O'Connor
