0

Based on the answers I've got, I think this problem is kind of meaningless. Thanks for all your kind replies!

I want to get a binary number with its rightmost j bits set to 1 and the others set to 0. Basically, there are two methods. I want to know which of them is more efficient, or whether there is a more efficient way than these two.

1. ~(~0 << j)
2. (1 << j) - 1
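
For reference, here is a minimal sketch of the two expressions wrapped in functions. The names `mask_not_shift`, `mask_shift_minus` and `check_masks_agree` are made up for illustration, and the literals are written as unsigned (`0u`, `1u`), which is an assumption on top of the expressions above. Both are only meaningful while j is smaller than the bit width of `unsigned`.

```cpp
#include <cassert>
#include <limits>

// Method 1: shift an all-ones word left by j, then invert it.
inline unsigned mask_not_shift(unsigned j) { return ~(~0u << j); }

// Method 2: set bit j, then subtract 1 to fill the bits below it.
inline unsigned mask_shift_minus(unsigned j) { return (1u << j) - 1u; }

// Checks that both methods agree for every shift count below the width.
inline void check_masks_agree() {
    const unsigned width = std::numeric_limits<unsigned>::digits;
    for (unsigned j = 0; j < width; ++j) {
        assert(mask_not_shift(j) == mask_shift_minus(j));
    }
}
```

Calling `check_masks_agree()` once is enough to confirm that the two forms are interchangeable as far as the result goes, so the choice between them comes down to speed and readability.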
Lion

8 Answers

3

Not sure if it's the answer you're looking for, but I'll bet it won't make more than a nanosecond of difference. :)

Or, to put it another way: Don't micro-optimize it unless that one-liner is the bottleneck in your code.

If you need other kinds of bit manipulation that really can be slow when written by hand, try looking at the compiler intrinsic functions, like _BitScanForward. Those might actually make your bit operations faster when used correctly (but not in a situation like this).
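
As a rough illustration of what using such an intrinsic looks like, here is a sketch that finds the index of the lowest set bit. The helper name `lowest_set_bit` is hypothetical. It assumes MSVC for `_BitScanForward` and falls back to the GCC/Clang builtin `__builtin_ctz` elsewhere; other compilers are not covered.

```cpp
#include <cassert>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Hypothetical helper: index of the lowest set bit of x. x must be nonzero.
inline unsigned lowest_set_bit(unsigned x) {
    assert(x != 0);
#if defined(_MSC_VER)
    unsigned long index;
    _BitScanForward(&index, x);  // MSVC intrinsic, writes the bit index
    return static_cast<unsigned>(index);
#else
    return static_cast<unsigned>(__builtin_ctz(x));  // GCC/Clang builtin
#endif
}
```

For example, `lowest_set_bit(8u)` is 3. A hand-written loop for the same job takes several iterations, which is the kind of case where an intrinsic can help.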

user541686
  • Thank you! I was not sure whether these two methods had big differences in performance, because the latter one uses the "-" operation, which isn't a bit manipulation. So I asked this question here. – Lion Jan 07 '11 at 03:10
  • Haha you're welcome. :) Honestly, there are bigger things to worry about when you're coding; if you've just started programming, it's easy to fall into traps like this, particularly if you started with a lower-level language. Don't. If you're **really** worried, then get a book like [this](http://www.amazon.com/Computer-Organization-Design-Fourth-ebook/dp/B004HHOC7E) (written by our professor :] ) and start reading up on things like pipelining, locality of memory, etc. Then you'll learn how the CPU works, and you'll learn when things like this actually make a difference, and when they don't. – user541686 Jan 07 '11 at 03:12
2

You are micro-optimising. Your compiler knows the best translation for those operations. Just write the one that looks the cleanest to the human eye, and move on.

Lightness Races in Orbit
1

In addition to the comments already posted:

As well as benchmarking, examine the assembly that's emitted. The optimiser might have produced the same code for each.

Mitch Wheat
1

This is sort of a lazy answer, but have you tried writing a trivial program like the following? Sure it is micro-optimizing, but it might be fun and interesting to see if there is any difference.

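#include <cstdio>
#include <ctime>

int main()
{
  // volatile, so the compiler cannot optimize the loop body away
  volatile unsigned sink = 0;
  // clock() measures CPU time; time() only has one-second resolution
  std::clock_t start = std::clock();
  for (unsigned i = 0; i < 1000000; i++)
  {
    // your operation here; the shift count is kept below 32 (assumes 32-bit unsigned)
    sink = (1u << (i & 31u)) - 1u;
  }
  std::clock_t stop = std::clock();
  double elapsed = static_cast<double>(stop - start) / CLOCKS_PER_SEC;
  std::printf("%f seconds\n", elapsed);
  return 0;
}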
mtjhax
1

If you really want the fastest, use a lookup table:

const unsigned numbers[] = {
        0x00000000,
        0x00000001, 0x00000003, 0x00000007, 0x0000000f,
        0x0000001f, 0x0000003f, 0x0000007f, 0x000000ff,
        0x000001ff, 0x000003ff, 0x000007ff, 0x00000fff,
        0x00001fff, 0x00003fff, 0x00007fff, 0x0000ffff,
        0x0001ffff, 0x0003ffff, 0x0007ffff, 0x000fffff,
        0x001fffff, 0x003fffff, 0x007fffff, 0x00ffffff,
        0x01ffffff, 0x03ffffff, 0x07ffffff, 0x0fffffff,
        0x1fffffff, 0x3fffffff, 0x7fffffff, 0xffffffff};

unsigned h(unsigned j) {
        return numbers[j];
}

Extending this to 64 bits is left as an exercise for the reader. And as others have said, none of this matters.

Josh Lee
  • I have a hard time believing your lookup table will be faster. In most practical cases it will probably be slower due to clobbering the L1 cache with a table full of useless trivially-computable values. – R.. GitHub STOP HELPING ICE Jan 07 '11 at 04:32
1

Unless you change 0 to 0U, the expression ~(~0 << j) has implementation-specific behavior based on bit patterns. On the other hand, the expression (1 << j) - 1 is purely arithmetic and has no bit arithmetic in it, so its value is well-defined across all implementations (as long as 1 << j itself fits in an int). Therefore, I would always use the latter.
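
One related edge case: shifting by the full width of the type is undefined, so neither expression can produce an all-ones mask by passing j equal to the bit width. Below is a sketch of an unsigned variant that handles that case explicitly. The name `low_bits_mask` is hypothetical and the use of `unsigned` is an assumption, not something from the question.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: low j bits set, defined for every j in [0, width].
inline unsigned low_bits_mask(unsigned j) {
    const unsigned width = std::numeric_limits<unsigned>::digits;
    assert(j <= width);
    // A shift by the full width is undefined, so that case is handled separately.
    return j >= width ? ~0u : (1u << j) - 1u;
}
```

With unsigned operands the subtraction is ordinary modular arithmetic, so there is no signed overflow to worry about for any j below the width.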

R.. GitHub STOP HELPING ICE
0

The true answer is probably dependent on processor architecture. But for all intents and purposes it's almost certainly irrelevant. Also consider that your compiler may output the same assembly either way. If you truly need to know, benchmark it, although the difference will almost certainly be too small to measure (which is your answer: it doesn't matter).

Zack Bloom
0

No timing experiments necessary, just check the generated machine code. You'll find that gcc compiles them both to identical machine instructions.
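
If you want to check this yourself, a small file along these lines should be enough. The file name `masks.cpp` and the function names `mask_a` and `mask_b` are made up for illustration, and the unsigned literals are an assumption; the exact output depends on your compiler version and flags.

```cpp
// masks.cpp (hypothetical file name)
// Compile to assembly with, for example:  g++ -O2 -S masks.cpp -o masks.s
// extern "C" only disables name mangling so the two labels are easy to find.
extern "C" unsigned mask_a(unsigned j) { return ~(~0u << j); }
extern "C" unsigned mask_b(unsigned j) { return (1u << j) - 1u; }
```

Then open `masks.s` and compare the instructions under the `mask_a` and `mask_b` labels.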