0

An algorithm I am working on must frequently check whether some arbitrary integer value 'x' is less-than, greater-than, or equal to another arbitrary integer value 'y'. The language I am implementing it in is C.

A naive way of doing it would be to use if-then-else branching, but that would not perform well because the outcome depends on the data, so the processor's branch predictor would mispredict often. I am trying to implement this comparison using only arithmetic / logical evaluations and bitwise operations but, honestly, my brain is stuck right now.

I will call the function f(x, y). The function will return 1, if x < y; 2, if x == y; or 3, if x > y.

One idea I had was to evaluate:

x = 3 * (x > y)

which yields 3 when x > y, and 0 otherwise. There might be an operation that then produces either 1 or 2 when this result is 0, using some bitwise operators and either condition x == y or x < y, but I have not found any combination of operations that achieves what I need.

Finally, I am looking for any function f(x, y) which will give me my results with the least number of operations possible, be it with or without bithacks; it just needs to be fast. So if you have any other ideas I may not have considered, pointing me to another solution is also greatly appreciated.

univise
  • 10
    Your compiler will likely have to emit comparison instructions and branches anyway to get the correct result of the comparison. Look at the assembler code before trying to prematurely optimize. If speed is that much of an issue, you might use a completely different approach, parallelizing the evaluation, etc. But that massively depends on your architecture, compiler and own capabilities. – too honest for this site Aug 23 '15 at 18:57
  • 2
    What is the domain for your _x_ and _y_? Are they integers? Can they be negative? Can they have different signs? – higuaro Aug 23 '15 at 19:01
  • 3
    Well, there are conditional moves on most major processors, so... unless the result is immediately needed, it's a more or less free operation. Of course if you have a dependency chain, then both the compare and the move will kick in (but still, that's like 5-6 cycles...). – Damon Aug 23 '15 at 19:01
  • You could just return `x-y` and test the sign of the function value, although a) that depends on the subtraction not being out of range, and b) you just push the comparison and branching further back. – Weather Vane Aug 23 '15 at 19:03
  • Have you profiled enough to determine that this function/algorithm is a bottleneck for your application performance? – higuaro Aug 23 '15 at 19:04
  • 2
    Do you really think that calling a function to compare two integers can be more efficient than the optimisations that the finest brains can work into CPUs and compilers? – Weather Vane Aug 23 '15 at 19:08
  • Are you into a code golf competition? – higuaro Aug 23 '15 at 19:09
  • @Damon: I would very well call ARM Cortex-M, AVR, PIC, and MSP430 "major processors". And ARMv7-A still needs comparison operators, and the conditional operations still need extra cycles. And IIRC, ARMv8 also does not have a full set of conditional execution anymore. – too honest for this site Aug 23 '15 at 19:18
  • There are a lot of similar questions on SO. E.g https://stackoverflow.com/questions/1741010/c-program-to-compare-integers-without-using-logical-operators/1741104#1741104 Did you check for these? – Jens Gustedt Aug 23 '15 at 19:53
  • 1
    The guys writing your compiler have already studied every bithacks trick there is, and then some. If there is a tricky way of doing things, the compiler knows that. But if you obfuscate the code enough, you might end up with something the compiler *doesn't* understand. – Bo Persson Aug 23 '15 at 20:09
  • @univise The specification is a bit vague. It is likely that the accepted answer to [this question](http://stackoverflow.com/questions/14579920/fast-sign-of-integer-in-c) contains the solution. Simply compute `a-b` first, pass the result to the signum function, then add 2 to the result of the signum function to map to the desired output values. Obviously you would want to inline all code. Note: this may not be faster than writing the code in natural fashion and letting the compiler take care of branch removal. – njuffa Aug 23 '15 at 22:23

3 Answers

4

The following expression will do what you want.

1 + (x >= y) + (x > y)

On x86-64 this compiles to fairly efficient code using SETcc instead of branches:

compare(int, int):
    xorl    %edx, %edx
    cmpl    %esi, %edi
    setg    %al
    setge   %dl
    movzbl  %al, %eax
    leal    1(%rdx,%rax), %eax
    ret

On ARM:

compare(int, int):
    cmp r0, r1
    ite lt
    movlt   r0, #1
    movge   r0, #2
    it  gt
    addgt   r0, r0, #1
    bx  lr
Peter Cordes
Steve Emmerson
2

Simply subtract the 2 variables x and y.

You'll get:

  1. if x<y result is res<0
  2. if x>y result is res>0
  3. if x==y result is res==0.

Implement it as a macro:

#define Chk(x, y) ((x)-(y))

Another advantage is that you can simply use the ! operator to check for equality or inequality:

if (!Chk(x, y))
{
    // x == y
}
else
{
    // x != y
}

P.S. This is the same kind of result that many standard functions, such as strcmp(), return.

P.P.S. Please consider that the processor's cmp machine instruction, at least on all CPU types I know, performs a subtraction of the two operands and sets the flags to reflect the result. Even simply comparing two values in C produces code that has a cmp instruction and some branch such as jz, jl, etc.

Storing just the difference of the values, a single value, lets you keep all the information you may need, even for later evaluation.

Frankie_C
  • Unlike simply using comparison operators in the natural way, this subtraction can overflow leading to incorrect results. Checking for overflow is going to slow it down a lot. Fortunately, that's not necessary because you can just use comparison operators, and most likely get optimum speed, too. – rici Aug 24 '15 at 03:00
  • 1
    Thank you for your efforts but this solution uses branching – univise Aug 24 '15 at 15:37
1

One option is:

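#include <stdio.h>   /* printf */

/* Returns 1 if x < y, 2 if x == y, 3 if x > y. */
int f(int x,int y)
{
    /* Each shift yields -1 for a negative difference, 0 otherwise. */
    return ((x-y)>>31)-((y-x)>>31) + 2;
}


int main(void)
{
    int x,y;
    /* Print f(x,y) for every pair in a small range as a check. */
    for(x=-3;x<=3;x++)
    for(y=-3;y<=3;y++)
        printf("x=%d y=%d f(x,y)=%d\n",x,y,f(x,y));
    return 0;
}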

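This relies on the int type being a 32-bit quantity. It also assumes that right-shifting a negative value is an arithmetic shift (which is implementation-defined in C) and that x-y and y-x do not overflow.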

You may also want to look into SIMD instructions (e.g. SSE on x86 or Neon on Arm) as these may help you accelerate your code.

Peter de Rivaz