We are trying to speed up some code under Clang and Visual C++ (GCC and ICC are OK). We thought we could use constexpr to tell Clang a value is a compile-time constant, but it's causing a compile error:

$ clang++ -g2 -O3 -std=c++11 test.cxx -o test.exe
test.cxx:11:46: error: function parameter cannot be constexpr
unsigned int RightRotate(unsigned int value, constexpr unsigned int rotate)
                                             ^
1 error generated.

Here is the reduced case:

$ cat test.cxx
#include <iostream>

unsigned int RightRotate(unsigned int value, constexpr unsigned int rotate);

int main(int argc, char* argv[])
{
  std::cout << "Rotated: " << RightRotate(argc, 2) << std::endl;
  return 0;
}

unsigned int RightRotate(unsigned int value, constexpr unsigned int rotate)
{
  // x = value; y = rotate
  __asm__ ("rorl %1, %0" : "+mq" (value) : "I" ((unsigned char)rotate));
  return value;
}

GCC and ICC will do the right thing. They recognize that a value like 2 in the expression RightRotate(argc, 2) cannot change under the laws of the physical universe as we know them, and they will treat 2 as a compile-time constant and propagate it into the assembly code.

If we remove the constexpr, then Clang and VC++ assemble the function into a rotate REGISTER, which is up to 3x slower than a rotate IMMEDIATE.

How do we tell Clang that the function parameter rotate is a compile-time constant, and that it should be assembled into a rotate IMMEDIATE rather than a rotate REGISTER?

jww
  • If the value is a compile time constant, then presumably the compiler needs to generate a version of the function for each value of that constant. That sounds to me like a template. – Tim Sep 02 '16 at 03:58
  • @Tim - I tried to parameterize and specialize, but it results in *`function template partial specialization is not allowed`*. What alternate reality or universe are Clang and the C++ committee operating in? – jww Sep 04 '16 at 07:27
  • @Tim - Here's the follow-up question: [Parameterization and “function template partial specialization is not allowed”](http://stackoverflow.com/q/39314690) – jww Sep 04 '16 at 07:48

1 Answer

You could use non-type template arguments for this:

template <unsigned int rotate> unsigned int RightRotate(unsigned int value) {
     ...
}

You'd then invoke it as

RightRotate<137>(argument); // rotate is 137 here
templatetypedef
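
A minimal sketch of the template approach, filled in for illustration. The body here is an assumption, not the answerer's code: it uses the portable shift-and-OR idiom on a 32-bit value instead of the question's inline assembly, and the `constexpr` qualifier and `static_assert` checks are additions. Because `rotate` is a template parameter, it is a constant expression inside the function, so it should also be usable as the `"I"` operand of the question's `rorl` statement.

```cpp
#include <cstdint>

// Sketch only: the rotate count is a non-type template parameter, so the
// compiler sees it as a compile-time constant inside the function body.
template <unsigned int rotate>
constexpr std::uint32_t RightRotate(std::uint32_t value)
{
    static_assert(rotate < 32, "rotate count must be in the range [0, 31]");
    // Masking the left-shift count avoids an undefined 32-bit shift
    // when rotate == 0.
    return (value >> rotate) | (value << ((32u - rotate) & 31u));
}

// Compile-time check: rotating right by 4 moves the low nibble to the top.
static_assert(RightRotate<4>(0x12345678u) == 0x81234567u,
              "unexpected rotate result");
```

The call from the question would then read `RightRotate<2>(argc)` rather than `RightRotate(argc, 2)`. Each distinct rotate count instantiates its own copy of the function, which is the trade-off Tim's comment points at.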
  • Thanks @templatetypedef. We've got some other constraints that are getting in the way of a full-blown template at the moment. I've been thinking seriously about adding a new set of functions so that we can use templates to achieve the good code generation... – jww Sep 02 '16 at 04:04
  • Here's the follow-up question: [Parameterization and “function template partial specialization is not allowed”](http://stackoverflow.com/q/39314690) – jww Sep 04 '16 at 07:48
  • If RightRotate were a constructor instead, or a function with deduced template arguments, you can use `template <unsigned int N> using UIConst = std::integral_constant<unsigned int, N>;` in calls like `RightRotate(UIConst<90>{})`. (If you want to deduce the type from a constructor of that style, e.g. `auto rr = RightRotate(UIConst<90>{});` where `rr` is deduced to be an instance of struct `RightRotate<90>`, you will unfortunately need a helper function in the style of `make_tuple`.) Hope this helps someone! – John P Jul 29 '19 at 09:59
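
The function form of the last comment's suggestion can be sketched as follows. This assumes the `UIConst` alias is meant to wrap `std::integral_constant<unsigned int, N>`; the two-argument signature, the `constexpr` qualifier, and the portable rotate body are hypothetical choices made for illustration. The rotate count travels in the type of an empty tag argument, so the compiler deduces `N` and the call site needs no explicit angle brackets.

```cpp
#include <cstdint>
#include <type_traits>

// Assumed form of the alias from the comment: a tag type that carries an
// unsigned compile-time constant in its type.
template <unsigned int N>
using UIConst = std::integral_constant<unsigned int, N>;

// N is deduced from the type of the (unnamed, empty) tag argument.
template <unsigned int N>
constexpr std::uint32_t RightRotate(std::uint32_t value, UIConst<N>)
{
    static_assert(N < 32, "rotate count must be in the range [0, 31]");
    // Same portable rotate idiom; the mask keeps the shift defined for N == 0.
    return (value >> N) | (value << ((32u - N) & 31u));
}

// Compile-time check: rotating right by 8 moves the low byte to the top.
static_assert(RightRotate(0x12345678u, UIConst<8>{}) == 0x78123456u,
              "unexpected rotate result");
```

A call then looks like `RightRotate(argc, UIConst<2>{})`, which keeps the two-argument shape of the original function while still making the count a compile-time constant.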