7

I have a quick question, suppose I have the following code and it's repeated in a simliar way 10 times for example.

if blah then
    number = number + 2^n
end if

Would it be faster to evaluate:

number = number + blah*2^n?

Which also brings the question, can you multiply a boolean value times a integer (Although I am not sure the type that is returned from 2^n, is it an integer or unsigned..etc)? (I'm working in Ada, but let's try to generalize this maybe?)

EDIT: Sorry I should clarify I am looking at 2 to the power of n, and I put c in there cause I was interested for my own learning in the future if I ever run into this problem in c and I think there are more c programmers out there on these boards then Ada (I'm assuming and you know what that means), however my current problem is in the Ada language, but the question should be fairly language independent (I hope).

technosaurus
  • 7,676
  • 1
  • 30
  • 52
onaclov2000
  • 5,741
  • 9
  • 40
  • 54
  • 4
    The caret ^ operator means XOR in C, so just keep that in mind. – chrisaycock Oct 26 '10 at 13:39
  • 2
    Be careful. Since C does not have a built in boolean type, there is no guarantee that blah is equal to either 1 or 0. Some functions which return true or false might choose to return something other than 1 in place of true. – Brian Oct 26 '10 at 13:53
  • 1
    @Brian Thanks I didn't realize that boolean doesn't always mean 0 and 1, that could have ended up being a hard lesson to learn. – onaclov2000 Oct 26 '10 at 14:00
  • There are not a few Ada programmers hanging around StackOverflow, and we've pretty much all got RSS feeds (or something comparable) set up to watch for the 'Ada' tag, so don't worry about an Ada programmer not noticing an Ada question :-) – Marc C Oct 26 '10 at 14:25
  • @Marc C Thanks I guess I made an assumption because I don't see nearly as many ada questions (however that might be my failure cause I don't look around a lot for it), so I just assumed that the more questions of one type there were, brought more of those type of programmers around, my bad, thank's for enlightening me :) – onaclov2000 Oct 26 '10 at 14:28
  • 1
    @Marc C - That's pretty slick. I'm just checking by hand. He's right that this is really a language-independent question though. The only wrinkle Ada adds is that its compilers have more information to allow for an even better job optimizing. So what is true for C (don't micro-optimize like this) is even more true for Ada. – T.E.D. Oct 26 '10 at 14:45
  • @Brian: C has a boolean type, `_Bool`, aliased to `bool` in `stdbool.h`. Also, while any nonzero value counts as true for conditionals, the result of a boolean expression (comparison operators, logical and/or/not operators, ...) is always 0 or 1. – R.. GitHub STOP HELPING ICE Oct 26 '10 at 15:23
  • By the way, I hope you realize `blah*2^n` is `blah< – R.. GitHub STOP HELPING ICE Oct 26 '10 at 15:26
  • @R..: You're right, C99 did introduce an explicit boolean type. I'd still be careful, though. I actually have had this issue bite me before, particularly when interacting between code across different programming languages. – Brian Oct 26 '10 at 16:42

10 Answers10

10

There is no general answer to such a question, this depends a lot on your compiler and CPU. Modern CPU have conditional move instructions, so everything is possible.

The only ways to know here are to inspect the assembler that is produced (usually -S as compiler option) and to measure.

Jens Gustedt
  • 76,821
  • 6
  • 102
  • 177
8

if we are talking about C and blah is not within your control, then just do this:

if(blah) number += (1<<n);

There is really not a boolean in C and does not need to be, false is zero and true is not zero, so you cannot assume that not zero is 1 which is what you would need for your solution, nor can you assume that any particular bit in blah is set, for example:

number += (blah&1)<<n;

Would not necessarily work either because 0x2 or 0x4 or anything non-zero with bit zero clear is considered a true. Typically you will find 0xFFF...FFFF (minus one, or all ones) used as true, but you cannot rely on typical.

Now, if you are in complete control over the value in blah, and keep it strictly to a 0 for false and 1 for true then you could do what you were asking about:

number += blah<<n;

And avoid the potential for a branch, extra cache line fill, etc.

Back to the generic case though, taking this generic solution:

unsigned int fun ( int blah, unsigned int n, unsigned int number )
{
    if(blah) number += (1<<n);
    return(number);
}

And compiling for the two most popular/used platforms:

    testl   %edi, %edi
    movl    %edx, %eax
    je  .L2
    movl    $1, %edx
    movl    %esi, %ecx
    sall    %cl, %edx
    addl    %edx, %eax
.L2:

The above uses a conditional branch.

The one below uses conditional execution, no branch, no pipeline flush, is deterministic.

  cmp   r0,#0
  movne r3,#1
  addne r2,r2,r3,asl r1
  mov   r0,r2
  bx    lr

Could have saved the mov r0,r2 instruction by re-arranging the arguments in the function call, but that is academic, you wouldnt burn a function call on this normally.

EDIT:

As suggested:

unsigned int fun ( int blah, unsigned int n, unsigned int number )
{
    number += ((blah!=0)&1)<<n;
    return(number);
}
  subs  r0, r0, #0
  movne r0, #1
  add   r0, r2, r0, asl r1
  bx    lr

Certainly cheaper, and the code looks good, but I wouldnt make assumptions that the result of blah!=0, which is zero or whatever the compiler has defined as true always has the lsbit set. It doesnt have to have that bit set for the compiler to generate working code. Perhaps the standards dictate the specific value for true. by re-arranging the function parameters the if(blah) number +=... will also result in three single clock instructions and not have assumptions.

EDIT2:

Looking at what I understand to be the C99 standard:

The == (equal to) and != (not equal to) operators are analogous to the relational operators except for their lower precedence. Each of the operators yields 1 if the specified relation is true and 0 if it is false.

Which explains why the above edit works and why you get the movne r0,#1 and not some other random number.

The poster was asking the question with regards to C but also noted that ADA was the current language, from a language independent perspective you should not assume "features" like the C feature above and use an if(blah) number = number + (1<<n). But this was asked with a C tag so the generically (processor independent) fastest result for C is, I think, number += (blah!=0)<<n; So Steven Wright's comment had it right and he should get credit for this.

The posters assumption is also basically correct, if you can get blah into a 0 or 1 form then using it in the math is faster in the sense that there is no branch. Getting it into that form without it being more expensive than a branch is the trick.

old_timer
  • 69,149
  • 8
  • 89
  • 168
  • 2
    What about `number += ((blah!=0)&1)< – Simon Wright Oct 27 '10 at 14:56
  • the result of blah!=0 is either 0 for false or true which is not deterministic. – old_timer Oct 27 '10 at 15:22
  • Reading an answer to a similar SO question the standard may dictate that the intermediate comparison does return a 1 or 0, if that is true and the compiler meets that standard everywhere then just do number += (blah!=0)< – old_timer Oct 28 '10 at 06:53
  • thanks, remember to give simon a +1 for his contribution. and pay it forward (help someone else on stackoverflow out) – old_timer Nov 01 '10 at 21:49
5

In Ada...

The original formulation:

if Blah then
  Number := Number + (2 ** N);
end if;

The alternative general formulation, assuming Blah is of type Boolean and Number and N are of suitable types:

Number := Number + (Boolean'pos(Blah) * (2 ** N));

(For N and Number of user-defined integer or floating point types, suitable definitions and type conversions may be required, the key point here is the Boolean'pos() construct, which Ada guarantees will give you a 0 or 1 for the predefined Boolean type.)

As for whether this is faster or not, I concur with @Cthutu:

I would keep it with the conditional. You shouldn't worry about low-level optimisation details at this point. Write the code that describes your algorithm best and trust your compiler.

Marc C
  • 8,664
  • 1
  • 24
  • 29
  • Nice on the pos part, I didn't think of that. If blah was a predictable value I could understand the compiler piece that both yourself and cthutu say, but since this is a discrete value coming in from a piece of hardware I'm not sure how the compiler can optimize this further, would you (or Cthutu) mind expanding? – onaclov2000 Oct 26 '10 at 14:26
  • Does Ada really guarantee 0 and 1 for the Boolean type? The only comment on this in the LRM is that False < True. However, I expect this to be the case with most (all?) compilers. And of course, the paranoid can define their own representation for a derived boolean type which guarantees 0 and 1 as the values :) – Schedler Oct 26 '10 at 14:39
  • Yes, for predefined Boolean this is guaranteed. It's because of the definition of the 'Pos attribute, which returns the *position number* of the enumeration, i.e. Boolean'Pos(False) is 0, Boolean'Pos(True) is 1. Even if the underlying representations were 0, and something-other-than-0, the 'Pos property would still hold (to get the actual representation you'd have to use an Unchecked_Conversion instantiation to get at it.) – Marc C Oct 26 '10 at 15:01
  • 1
    If the compiler generates a boolean value, it'll definitely have the appropriate 'Pos behaviour. However, if you generate a "boolean" value using some form of unchecked conversion (eg, importing from C) it may be invalid and most bets are off. For example, with the GCC Ada compiler, 42 can appear to be false in some contexts (because its LSB is clear), true in others, or result in Constraint_Error. As ever, if you're importing from a foreign context, validate at the interface. – Simon Wright Oct 26 '10 at 15:51
  • & Simon: Thanks for the input. Now re-reading the LRM, this is clear. I confused 'Pos with the internal representation. Useful new information. – Schedler Oct 28 '10 at 14:47
4

I would keep it with the conditional. You shouldn't worry about low-level optimisation details at this point. Write the code that describes your algorithm best and trust your compiler. On some CPUs the multiplication is slower (e.g. ARM processors that have conditionals on each instruction). You could also use the ?: expression which optimises better under some compilers. For example:

number += (blah ? 2^n : 0);

If for some reason this little calculation is the bottleneck of your application after profiling then worry about low-level optimisation.

Cthutu
  • 8,713
  • 7
  • 33
  • 49
  • When doing code reviews for embedded systems, I usually look at written code from the perspective of with little tweaks can it be a little quicker, I'm not planning any kind of massive re-write, or weeks of time tweaking the little things, but just hopefully obvious little things, but perhaps the compiler takes care of this. Although I don't think it can optimize it out, as the data in the boolean is a discrete in this case so it's not known until runtime. – onaclov2000 Oct 26 '10 at 13:57
  • I would really recommend sticking with a boolean check if your logic is triggered when a condition is true, rather than using an integer/float and multiply it. It will be more obvious to you when you get back to your code in 6 months time. – Cthutu Oct 26 '10 at 14:00
  • Be very weary of tweaking for optimisation. You might be making your code harder to read, and worse still make it slower. Intuition is not always your best friend when it comes to optimisation. – Cthutu Oct 26 '10 at 14:01
  • based on the comment associated with the function that runs this code, I would be surprise to fail to read the code easily, but I definitely do see your point. I suppose that a quick way to see whether this is quicker or slower (even with the compiler setup) would be to run a similar "function" a bunch of times taking a ton of time measurements, and that should tell me on our particular system whether it's faster or slower. – onaclov2000 Oct 26 '10 at 14:31
  • I was trying to explain that you should not worry about optimisation at that level and I was describing a general approach, rather anything specific to the example code. Profile your code often and use that as a guide to where you should be focusing your optimisation efforts, should your app need it. – Cthutu Oct 26 '10 at 17:49
4

In C, regarding blah*2^n: Do you have any reason to believe that blah takes the values 0 and 1? The language only promises that 0 <-> FALSE and (everything else) <-> TRUE. C allows you to multiply a "boolean" temporary with another number, but the result is not defined except insofar as result=0 <=> the bool was false or the number was zero.

In Ada, regarding blah*2^n: The language does not define a multiplication operator on type Boolean. Thus blah cannot be a bool and be multiplied.

Eric Towers
  • 4,175
  • 1
  • 15
  • 17
  • I spoke with a co-worker and he mentioned that you can't multiply booleans with numbers. It made sense but I wasn't sure if that was an ada only restriction, or if most languages don't allow. – onaclov2000 Oct 26 '10 at 13:55
  • Eric: This answer is misleading. The result of a logical/comparison operator in C is **always** 0 or 1. This is well-defined by the standard. You can use other nonzero values with conditionals, but that's quite different from your implication that "true" takes on random nonzero values. – R.. GitHub STOP HELPING ICE Oct 26 '10 at 15:25
  • @R..: Hmm... Now you have me trying to remember in which environment I had been surprised to find true (visibly) implemented as -1. I think I recall the "pun" that both !true==false and ~true==false. – Eric Towers Oct 27 '10 at 03:23
1

If your language allows multiplication between a boolean and a number, then yes, that is faster than a conditional. Conditionals require branching, which can invalidate the CPU's pipeline. Also if the branch is big enough, it can even cause a cache miss in the instructions, though that's unlikely in your small example.

chrisaycock
  • 36,470
  • 14
  • 88
  • 125
  • Interesting, I was thinking about the whole branch thing. I forgot about pipelining (Shame on me, we have been going over it quite a bit in school, I should know better) – onaclov2000 Oct 26 '10 at 13:45
1

Generaly, and particularly when working with Ada, you should not worry about micro-optimization issues like this. Write your code so that it is clear to a reader, and only worry about performance when you have a problem with performance, and have it tracked down to that portion of the code.

Different CPUs have different needs, and they can be insanely complex. For example, in this case which is faster depends a lot on your CPU's pipeline setup, what's in cache at the time, and how its branch prediction unit works. Part of your compiler's job is to be an expert in those things, and it will do a better job than all but the very best assembly programmers. Certianly better than you (or me).

So you just worry about writing good code, and let the compiler worry about making efficient machine code out of it.

T.E.D.
  • 44,016
  • 10
  • 73
  • 134
1

For the problem stated, there is indeed simple expressions in C that may produce efficient code.

The nth power of 2 can be computed with the << operator as 1 << n, provided n is less than the number of value bits in an int.

If blah is a boolean, namely an int with a value of 0 or 1, your code fragment can be written:

number += blah << n;

If blah is any scalar type that can be tested for its truth value as if (blah), the expression is slightly more elaborate:

number += !!blah << n;

which is equivalent to number += (blah != 0) << n;

The test is still present but, for modern architectures, the generated code will not have any jumps, as can be verified online using Godbolt's compiler explorer.

chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • Glad you decided to give this answer. I almost gave the same answer myself a while back, but it was an old question. Somehow it keeps going active though, so there should probably be an optimal answer. – technosaurus Aug 02 '17 at 04:56
0

In either case, you can't avoid a branch (internally), so don't try!

In

number = number + blah*2^n

the full expression will always have to be evaluated, unless the compiler is smart enough to stop when blah is 0. If it is, you'll get a branch if blah is 0. If it's not, you always get an expensive multiply. In case blah is false, you'll also get the unnecessary add and assignment.

In the "if then" statement, the statement will only do the add and assignment when blah is true.

In short, the answer to your question in this case is "yes".

kes
  • 5,983
  • 8
  • 41
  • 69
  • 1
    Why is everyone missing the fact that this multiply is not expensive but virtually free? (Hint: it's a power of 2.) – R.. GitHub STOP HELPING ICE Oct 26 '10 at 15:27
  • Just because it's a power of two does not make it faster than not doing it at all. – kes Oct 26 '10 at 18:17
  • you can avoid the branch it depends on the architecture. you cannot avoid some sort of conditional execution, that is true, unless blah is known to be not just a generic boolean but is specifically a 1 or 0. – old_timer Oct 27 '10 at 13:36
0

This code shows they perform similarly, but multiplication is usually slightly faster.

@Test
public void manual_time_trial()
{
    Date beforeIfElse = new Date();
    if_else_test();
    Date afterIfElse = new Date();
    long ifElseDifference = afterIfElse.getTime() - beforeIfElse.getTime();
    System.out.println("If-Else Diff: " + ifElseDifference);

    Date beforeMultiplication = new Date();
    multiplication_test();
    Date afterMultiplication = new Date();
    long multiplicationDifference = afterMultiplication.getTime() - beforeMultiplication.getTime();
    System.out.println("Mult Diff   : " + multiplicationDifference);

}

private static long loopFor = 100000000000L;
private static short x = 200;
private static short y = 195;
private static int z;

private static void if_else_test()
{
    short diff = (short) (y - x);
    for(long i = 0; i < loopFor; i++)
    {
        if (diff < 0)
        {
            z = -diff;
        }
        else
        {
            z = diff;
        }
    }
}

private static void multiplication_test()
{
    for(long i = 0; i < loopFor; i++)
    {
        short diff = (short) (y - x);
        z = diff * diff;
    }
}
William
  • 491
  • 5
  • 9