
I want to write a function that is evaluated at compile time. It takes a pointer to a 4-byte array and returns an int that has the same bit pattern as that array. So I came up with:

constexpr int f(const char* p) {
     return *reinterpret_cast<int*>(p);
}

Then, I want to use f() like this:

switch(x) {
case f("GOOG"):
   // do something
case f("MSFT"):
   // do something
case f("NIKE"):
  // do something
}

However, I got a compiler error:

error: accessing value of ‘"GOOG"’ through a ‘int’ glvalue in a constant expression
    case f("GOOG")

  1. How to fix f() so it compiles?
  2. Is there a better way to accomplish the same goal?
ijklr
  • Even if it compiled, it would be a strict aliasing violation and UB. Use bit-shifts to make the integer from individual `char`s. – HolyBlackCat Sep 24 '20 at 06:53
  • @HolyBlackCat Thanks. btw what is UB? – ijklr Sep 24 '20 at 06:55
  • UB is undefined behavior. – Louis Go Sep 24 '20 at 06:56
  • Thanks. I feel that doing bit shifting 4 times is not as elegant as just treat it as an int. Is there no way of doing that safely? – ijklr Sep 24 '20 at 06:59
  • You can also create an `int` and `memcpy` the array into it, but `memcpy` is not `constexpr`. – HolyBlackCat Sep 24 '20 at 07:01
  • Bit shifting is one of the clean, standard-conforming ways to do it, not to mention that it gives the same output for the same input on all kinds of endianness, even the exotic ones (which are admittedly rarely seen in daily business). (I consider _elegant_ what is easy to understand and works reliably and robustly, now and later. Cast magic is neither.) ;-) – Scheff's Cat Sep 24 '20 at 07:01
  • C++20 adds `std::bit_cast` for this purpose (although it doesn’t work directly with string pointers like this). – Davis Herring Sep 24 '20 at 13:17

1 Answer


Congratulations, you have activated the strict aliasing trap card: your code would have undefined behaviour if it compiled.

There are a few errors in your code; the "correct" version is:

constexpr int f(const char* p) {
    return *reinterpret_cast<const int*>(p);
}
  • reinterpret_cast cannot cast away const.
  • cursor->p typo?

But since the const char* does not point to an int object, reading through the cast pointer breaks the strict aliasing rule. int is not one of the types that may alias other types; only char, unsigned char and std::byte may.
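
To illustrate the direction that is allowed, here is a minimal sketch. The helper name `count_zero_bytes` is made up for this example and is not part of the question. It inspects the bytes of an existing int through `const unsigned char*`, which the aliasing rules permit, whereas reading a char array through `int*` is not permitted.

```cpp
#include <cstddef>

// Hypothetical helper (not from the question): counts the zero bytes in the
// object representation of an int. Accessing any object through
// `const unsigned char*` is allowed by the aliasing rules.
inline std::size_t count_zero_bytes(const int& value) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < sizeof(int); ++i) {
        if (bytes[i] == 0) {
            ++zeros;
        }
    }
    return zeros;
}
```

Note that this only works at run time: the `reinterpret_cast` keeps it out of constant expressions, so it does not solve the `case` label problem.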

The cleanest would be this:

#include <cstring>

constexpr int f(const char* p) {
    int val = 0;
    static_assert(sizeof(val) == 4); // If the array is 4 bytes long.
    std::memcpy(&val, p, sizeof val);
    return val;
}

But std::memcpy is not constexpr, so this cannot be used in a constant expression. At run time it will probably not have any overhead: the compiler can recognize the pattern and reinterpret the bytes on its own.
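
For code that only needs the value at run time, the memcpy approach can be sketched without `constexpr`. The name `load_int` is made up for this example, and the sketch assumes a 4-byte int and at least 4 readable bytes behind the pointer. The resulting value depends on the byte order of the host.

```cpp
#include <cstring>

// Hypothetical run-time counterpart of f(): copies the first sizeof(int)
// bytes of p into an int. No aliasing violation, because the bytes are
// copied rather than read through an int*.
inline int load_int(const char* p) {
    static_assert(sizeof(int) == 4, "this sketch assumes a 4-byte int");
    int val = 0;
    std::memcpy(&val, p, sizeof val);
    return val;
}
```

Copying the int back into a char buffer with `std::memcpy` yields the original 4 bytes, which is an easy way to check the round trip.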

So go with bit-shifting:

#include <cstddef>

constexpr int f(const char* p) {
    int value = 0;
    using T = decltype(value);
    // p[0] ends up in the lowest byte, p[1] in the next one, and so on.
    for (std::size_t i = 0; i < sizeof(T); ++i)
        value |= (T)p[i] << (8 * i); // Cast to T before shifting.

    return value;
}

int main() {
    // @ == 64
    // 1077952576 = 01000000 01000000 01000000 01000000
    static_assert(f("@@@@") == 1077952576);
}

Just to be pedantic, the literal "@@@@" has 5 chars (including the terminating null), not 4.
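
A sketch of how the bit-shifting version fits the switch from the question. The function `dispatch` and its return codes are made up for illustration. The sketch assumes a 4-byte int, C++14 or later, plain ASCII tickers, and that every string passed in has at least 4 readable chars.

```cpp
#include <cstddef>

// Bit-shifting packer, restated so that this example is self-contained.
constexpr int f(const char* p) {
    static_assert(sizeof(int) == 4, "this sketch assumes a 4-byte int");
    int value = 0;
    for (std::size_t i = 0; i < sizeof(int); ++i)
        value |= static_cast<int>(p[i]) << (8 * i);
    return value;
}

// Hypothetical dispatcher: the run-time key is packed with the same f(),
// so it compares equal to the compile-time case labels.
inline int dispatch(const char* ticker) {
    switch (f(ticker)) {
    case f("GOOG"): return 1;
    case f("MSFT"): return 2;
    case f("NIKE"): return 3;
    default:        return 0;
    }
}
```

Because both the labels and the run-time key come from the same function, the byte order of the host does not matter here.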

Quimby
  • cursor was typo. fixed. TY. – ijklr Sep 24 '20 at 07:16
  • Can you explain what this `value|= (T)*p << (8*i);` is doing? It looks like it only uses the first char of p, shifts it left by i bytes, and does a bitwise or with value. I.e. is it using the rest of the chars, p[1], p[2] etc.? – ijklr Sep 24 '20 at 07:26
  • @ijklr Oops, sorry, of course it is missing the index, silly me for making a too easy test case. Fixed. – Quimby Sep 24 '20 at 07:28
  • thanks. and the reason you used `(T)` instead of `reinterpret_cast` is because inside `constexpr` functions, you can't reinterpret_cast, right? – ijklr Sep 24 '20 at 07:32
  • I wish I can just cast it to int and don't need to loop :( – ijklr Sep 24 '20 at 07:33
  • @ijklr I get it, but look up "strict aliasing rule", it exists for a reason: without it, almost any code with pointers would be slower, and that is just not the C++ way of doing things. Also, it's `constexpr`, so the loop won't be there. – Quimby Sep 24 '20 at 07:35
  • @ijklr: You cannot do a `reinterpret_cast` *or any equivalent* C-style cast. If you were going to be allowed to convert pointers like that, C++ wouldn't forbid `reinterpret_cast`. – Nicol Bolas Sep 24 '20 at 13:27
  • @ijklr Missed that comment. I use `(T)` because I was lazy writing `static_cast` and it is reasonably safe for integral types. But it is necessary if you use e.g. `long long value`. `p[i]` only gets promoted to `int` and it would be shifted out for larger `i`. This will cast it to the result type before shifting. – Quimby Sep 24 '20 at 17:52
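
The point about casting before shifting can be sketched for a wider type. The name `pack8` is made up for this example, and it assumes at least 8 readable chars behind the pointer. Without the cast, `p[i]` would only be promoted to int, and shifting an int by 32 bits or more is undefined.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 8-byte variant: each char is widened to the result type
// *before* the shift, so shifts of up to 56 bits are well defined.
constexpr std::uint64_t pack8(const char* p) {
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < sizeof(value); ++i)
        value |= static_cast<std::uint64_t>(static_cast<unsigned char>(p[i])) << (8 * i);
    return value;
}
```

Going through `unsigned char` first is an extra choice made in this sketch; it avoids sign extension if a char with the high bit set is passed on a platform where char is signed.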