Why is the short-circuit logical operator supposed to be faster?

Question

This question is not about optimizing code; it's a technical question about the performance difference between short-circuit logical operators and normal (non-short-circuit) logical operators, which may come down to how they are performed at the hardware level.

Basically, a logical AND or OR takes one cycle, whereas short-circuit evaluation uses branching and can take a varying number of cycles. I know that branch predictors can make this evaluation efficient, but I don't see how it can be faster than 1 cycle.

Yes, if the right operand is something expensive, then not evaluating it is beneficial. But for simple conditions like X & (Y | Z), assuming these are atomic variables, the non-short-circuit logical operators would likely be faster. Am I right?
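To make the comparison concrete, here is a minimal C sketch of the two forms. The names `eager`, `lazy`, and `check_all_inputs` are made up for illustration, and plain `bool` parameters stand in for the simple variables X, Y, and Z. It only shows that both forms compute the same truth value when the operands have no side effects; it says nothing about which one a given compiler or CPU runs faster.

```c
#include <assert.h>
#include <stdbool.h>

/* Eager form: x, y and z are all read, and both operators always execute. */
static bool eager(bool x, bool y, bool z) { return x & (y | z); }

/* Short-circuit form: y is consulted only if x is true,
   and z only if x is true and y is false. */
static bool lazy(bool x, bool y, bool z) { return x && (y || z); }

/* Walks all 8 input combinations, asserts that both forms agree,
   and returns the number of combinations checked. */
static int check_all_inputs(void) {
    int checked = 0;
    for (int bits = 0; bits < 8; ++bits) {
        bool x = bits & 1, y = bits & 2, z = bits & 4;
        assert(eager(x, y, z) == lazy(x, y, z));
        ++checked;
    }
    return checked;
}
```

Because the results are identical for side-effect-free operands, a compiler is free to pick either implementation for either spelling.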

I assumed that short-circuit logical operators use branching (no official source, just my own reasoning), because how else would you make those jumps while executing instructions in order?

This is not a new thought of course, there are several related answers that actually *are* related in the sidebar, and [compilers know about it too](https://godbolt.org/g/zU34MZ) - the short-circuiting `&&` there is actually implemented without short-circuiting. — harold, Dec 26 '17 at 20:18
@harold Good to know. So my assumptions were right. Good that compilers can take care of it. — M.kazem Akhgary, Dec 26 '17 at 20:27
AND and OR operations take more than one cycle; the pipeline attempts to average everything to one cycle. Likewise, branches and branch prediction, prefetch buffers, etc. attempt to get the required data close to the pipe to avoid noticeable stalls. — old_timer, Dec 27 '17 at 00:18
You tagged this assembly language; what instruction set are you after? — old_timer, Dec 27 '17 at 00:18

score 0 · Answer 1 · edited Dec 13 '20 at 04:07

This is very late but since this hasn't been answered yet (...), I'm going to have a go at it.

You already pointed out branch prediction, and that point is correct. There are also other hardware-related issues on modern CPUs, mostly to do with instruction-level parallelism and dependencies between operations.

A short-circuit operator requires A to be evaluated first and THEN B, and B must not be evaluated if A is false. This leads us back to branches, and to CPU pipeline flushes when a speculatively executed branch turns out to be mispredicted. This gets more costly the more conditions need to be checked in succession. Non-short-circuit operations, on the other hand, can get cheaper, since CPUs can evaluate "many" independent instructions in the same clock cycle, thanks to multiple physical ALUs/FPUs/AGUs etc. being present.

And to drive this point home, let's look at the simplest case in assembly:

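a && b:

    cmp    a, 0            ; test the left operand first
    jne    LABEL_A         ; a is true: go on to test b
    ---more code---
    LABEL_A:
    cmp    b, 0            ; right operand, reached only when a is true
    jne    RETURN_LABEL    ; both are true: take the "true" path
    ---more code---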

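as opposed to... (assuming setcc-family instructions like setb were used beforehand to normalize both operands to 0 or 1)

a & b:

    and    a, a, b         ; combine both operands in one ALU operation, no branch
    cmp    a, 0
    jne    RETURN_LABEL    ; the only branch: taken when the combined result is true
    ---more code---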
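For reference, here is a rough C-level sketch of what the two listings above compute. All three function names are hypothetical. The `!= 0` comparisons play the role of the normalization step assumed above; the third function shows why that step matters, since a raw bitwise AND of two nonzero integers can still be zero.

```c
#include <stdbool.h>

/* Short-circuit form: b is examined only when a is nonzero. */
static bool both_short_circuit(int a, int b) { return a && b; }

/* Eager form: both operands are first normalized to 0 or 1,
   then combined with a single bitwise AND. */
static bool both_eager(int a, int b) { return (a != 0) & (b != 0); }

/* Eager form WITHOUT normalization: not equivalent to a && b,
   because e.g. 1 & 2 == 0 even though both inputs are nonzero. */
static bool both_bitwise_raw(int a, int b) { return (a & b) != 0; }
```

With `a = 1` and `b = 2`, the first two functions return true while the unnormalized one returns false, which is why the normalization assumption is needed before swapping `&&` for `&` on arbitrary integers.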

The difference should be evident from the assembly itself: the short-circuit version needs two compare-and-branch pairs, while the non-short-circuit version needs a single `and` plus one branch. But yes, you are right that you should definitely use short-circuiting to avoid an expensive calculation of B when A is false. Even then, the CPU might speculatively execute the test for B anyway. So, put very simply: when the operands are cheap, short-circuiting operators can only make things worse.
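The case where short-circuiting does pay off, skipping an expensive right operand, can be sketched in C as follows. This is an illustration under assumptions: `expensive_check` and the call counter are hypothetical stand-ins for whatever costly computation B represents.

```c
#include <stdbool.h>

/* Counts how often the (pretend) expensive operand was evaluated. */
static int expensive_calls = 0;

/* Hypothetical stand-in for a costly right-hand operand B. */
static bool expensive_check(void) {
    ++expensive_calls;
    return true;
}

/* && skips expensive_check() entirely when a is false. */
static bool with_short_circuit(bool a) { return a && expensive_check(); }

/* & always evaluates both operands, so expensive_check() always runs. */
static bool without_short_circuit(bool a) { return a & expensive_check(); }
```

Calling `with_short_circuit(false)` leaves the counter untouched, while `without_short_circuit(false)` increments it even though the result is false either way. Because the call has a visible side effect, the compiler cannot drop it in the `&` version.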
