
In my program I have a function that takes a byte that is always a power of two (never zero) and returns the position of its single 1 bit as an integer, counting from the most significant bit (position 0).

e.g. f(0010 0000) -> 2, f(0000 0001) -> 7

This C program has to run this function millions of times so I need it to be very fast. I've written two implementations for this function, and I don't really know which is faster.

#include <math.h>

static const double ln2 = 0.6931471805599453; // natural log (base e) of two

int f(unsigned char bit) {
    // 7 minus log base 2 of the input, computed as ln(bit) / ln(2)
    return (int)(7 - round(log(bit) / ln2));
}

int f(unsigned char bit) {
    if (bit == 0x00) return 0; // should never happen, input is never zero
    else if (bit == 0x01) return 7;
    else if (bit == 0x02) return 6;
    else if (bit == 0x04) return 5;
    else if (bit == 0x08) return 4;
    else if (bit == 0x10) return 3;
    else if (bit == 0x20) return 2;
    else if (bit == 0x40) return 1;
    else if (bit == 0x80) return 0;
    return 0; // not reached for valid input; avoids falling off the end
}

I have no computer science education; I'm just a hobbyist, so a lot of these problems are hard for me to figure out on my own. I figure that the log() function is slow, but I also know that if statements take cycles and that branch mispredictions can slow things down. I honestly don't know if that is correct though; I'm just guessing.

Could anyone provide insight for me on which is faster and why? Or if you have an alternative, even better way I'm open to suggestions! Thanks!

Ashwin Gupta
  • Your lookup-table version will be considerably faster if you use a `switch` statement as that will be compiled into a static lookup table. – Dai Aug 30 '17 at 03:28
  • It's *possible* that an implementation of `log2(char)` will use a lookup-table (perhaps even in hardware) but that is not guaranteed by the C standard library specification. But division is always expensive, even for integers. – Dai Aug 30 '17 at 03:29
  • @Dai thanks, that is very useful info! I'm about 1/4 into a run with the if-statements so I'll change those to switch once this run finishes and track time. Unfortunately runs take like 2 hours :( LOL – Ashwin Gupta Aug 30 '17 at 03:32
  • Well, you could always measure it which is the sanest way to answer this kind of question but almost anything is going to be faster than log2 and turning a bit fiddling problem into a floating point one. – pvg Aug 30 '17 at 03:35
  • @pvg yes I'm measuring them now, but since the program takes 2 hours to run those take a while. I was looking for a more "scientific" explanation that explains the underlying theory or reason why like what Dai explained about switch/lookup table. Since I teach myself I do a lot of the practical "engineering" tests but miss out on the actual "computer science". – Ashwin Gupta Aug 30 '17 at 03:37
  • The conversion to/from float is probably actually the most expensive part. – o11c Aug 30 '17 at 03:38
  • Use this site as your reference for everything: http://graphics.stanford.edu/~seander/bithacks.html – o11c Aug 30 '17 at 03:38
  • @o11c really?! Fascinating, does the conversion require movement in memory of some sort? I figured it was just changing the way C represents the number. I should probably read up on typecasting. Thanks for the link. – Ashwin Gupta Aug 30 '17 at 03:39
  • You can measure just the thing you're asking about, that's not going to take two hours. The problem itself - finding the single bit that's set in a value is well googleable, btw, lots of decent solutions. Just avoid log2 for this. – pvg Aug 30 '17 at 03:39
  • @pvg fair point. I'll write a sort of unit test type thing for it also. Thanks for the suggestion. – Ashwin Gupta Aug 30 '17 at 03:40
  • Either it's important to you. Then you write two versions and measure their speed. Or it's not important. Then it's not important. – gnasher729 Aug 30 '17 at 03:54
  • @AshwinGupta In short, float registers and integer registers are completely separate units in the CPU, and it is always expensive to move things between units. – o11c Aug 30 '17 at 03:54
  • 2 Hours algorithm and you bother with the speed of bit position finding? Did you actually measure whether this has any significant performance impact? If you did measure it, you should mention it in the question... otherwise people tend to assume you didn't measure. – grek40 Aug 30 '17 at 06:28
  • @grek40 This part of the program is totally independent of the other part in the other question. This function is run AFTER the primes are computed. I've been testing both halves individually. – Ashwin Gupta Aug 30 '17 at 06:30
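
The suggestions in the comments (a lookup table, a `switch`, staying in integer arithmetic, and timing the function on its own) can be sketched roughly as follows. This is only an illustration under stated assumptions, not a verified answer to which version is fastest: the names `bit_pos_table`, `f_table`, `f_switch`, `f_shift`, `check_agreement` and `seconds_for` are made up for this sketch and do not appear in the question, and whether a compiler actually turns the `switch` into a table depends on the compiler and its optimization settings.

```c
#include <assert.h>
#include <time.h>

/* All names here are hypothetical; none come from the question. */

/* Table indexed directly by the byte value. Only the eight
   power-of-two slots are meaningful; every other slot stays 0. */
static const unsigned char bit_pos_table[256] = {
    [0x01] = 7, [0x02] = 6, [0x04] = 5, [0x08] = 4,
    [0x10] = 3, [0x20] = 2, [0x40] = 1, [0x80] = 0
};

/* Explicit lookup table: one array read, no branches in the source. */
int f_table(unsigned char bit) {
    return bit_pos_table[bit];
}

/* switch version of the if/else chain from the question. */
int f_switch(unsigned char bit) {
    switch (bit) {
    case 0x01: return 7;
    case 0x02: return 6;
    case 0x04: return 5;
    case 0x08: return 4;
    case 0x10: return 3;
    case 0x20: return 2;
    case 0x40: return 1;
    case 0x80: return 0;
    default:   return 0; /* invalid input (zero or not a power of two) */
    }
}

/* Integer-only version: shift left until the top bit is set and
   count the shifts. Returns 0 for a zero input. */
int f_shift(unsigned char bit) {
    int pos = 0;
    while (bit != 0 && (bit & 0x80) == 0) {
        bit <<= 1;
        pos++;
    }
    return pos;
}

/* Sanity check: all three versions agree on the eight valid inputs. */
void check_agreement(void) {
    for (int shift = 0; shift < 8; shift++) {
        unsigned char bit = (unsigned char)(1u << shift);
        assert(f_table(bit) == 7 - shift);
        assert(f_switch(bit) == 7 - shift);
        assert(f_shift(bit) == 7 - shift);
    }
}

/* Rough CPU-time measurement of one implementation in isolation.
   The volatile sink keeps the calls from being optimized away. */
double seconds_for(int (*impl)(unsigned char), unsigned long calls) {
    volatile int sink = 0;
    clock_t start = clock();
    for (unsigned long i = 0; i < calls; i++) {
        sink += impl((unsigned char)(1u << (i & 7)));
    }
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `check_agreement()` once and then something like `seconds_for(f_table, 100000000UL)` for each candidate gives a comparison in seconds rather than hours. Because the call goes through a function pointer, the numbers include call overhead and are only a rough guide; results will also vary with compiler flags and hardware.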
