18

Consider the following snippet:

static constexpr uint8_t a = 0;
static constexpr const int8_t *b = reinterpret_cast<const int8_t *>(&a);

This fails to compile with the error "a reinterpret_cast is not a constant expression", because the C++ standard forbids reinterpret_cast in constant expressions.

However, compilation succeeds if I instead store b in PROGMEM (for AVR microcontrollers), declaring it const rather than constexpr:

static constexpr uint8_t a = 0;
static const int8_t PROGMEM *const b = reinterpret_cast<const int8_t *>(&a);

In this case the compiler is able to prove that the expression reinterpret_cast<const int8_t *>(&a) is compile-time constant, since it inserts its result (an address pointing to some byte containing a zero) into program space in the binary:

_ZL1g:
  .zero   1
  .section        .progmem.data,"a",@progbits
  .type   _ZL1b, @object
  .size   _ZL1b, 2
_ZL1b:
  .word   _ZL1g

Also, my understanding is that reinterpret_cast is a compile-time directive. So how come it can't be used inside a constexpr?

Conor Taylor

2 Answers

15

At runtime, the C++ language has the concept of Undefined Behavior (UB). Under certain well-specified conditions a program has Undefined Behavior, which means it can exhibit any behavior: it can crash, hang forever, print gibberish, appear to work, or do anything else. A simplified explanation of why this exists is performance.

At runtime this is a tradeoff (a compromise, if you will), but at compile time it is unacceptable. If the standard allowed UB at compile time, not only would it be legal for the compiler to crash or loop forever while compiling the program, but you could never be sure of the validity of the compiled executable.

As such, any form of constexpr would have to be 100% free of Undefined Behavior. No exceptions about it. No leeway.
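As a small illustration of this rule (a sketch, not part of the original answer; the function name add_one is made up for the example): an expression whose evaluation would hit UB is simply not a constant expression, so the compiler has to reject it wherever a constant expression is required.

```cpp
#include <limits>

// Hypothetical helper, used only for this illustration.
constexpr int add_one(int x) { return x + 1; }

// Fine: evaluating add_one(41) involves no UB.
static constexpr int ok = add_one(41);
static_assert(ok == 42, "evaluated at compile time");

// Rejected if uncommented: INT_MAX + 1 is signed overflow (UB), so the
// initializer is not a constant expression and compilation fails.
// static constexpr int bad = add_one(std::numeric_limits<int>::max());
```

The same call made at runtime with a non-constexpr variable would compile without complaint and silently have UB.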

One notorious source of UB is reinterpret_cast. There are very few valid uses of reinterpret_cast, most of them result in UB. Plus it is practically impossible to check if the use is valid. So reinterpret_cast is not allowed during compilation, i.e. it is not allowed in constexpr.
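For the snippet in the question, one constexpr-friendly alternative is to convert the value instead of the pointer, at the cost of a second object. This is only a sketch: b_value is a name introduced here, and PROGMEM is left out so that the code compiles off-target.

```cpp
#include <cstdint>

static constexpr std::uint8_t a = 0;

// Convert the value, not the pointer: b_value is a separate int8_t object.
static constexpr std::int8_t b_value = static_cast<std::int8_t>(a);

// Taking the address of an object with static storage duration is a
// constant expression, so this pointer can be constexpr.
static constexpr const std::int8_t *b = &b_value;

static_assert(*b == 0, "b can be read at compile time");
```

Note that b points to the copy rather than to a, which is exactly the trade-off that matters when the real objects are large.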

bolov
  • Is it really impossible for the compiler to check if the `reinterpret_cast` of compile time *stuff* would result in UB? – Zereges Jan 25 '20 at 21:34
  • @Zereges yes. It would have to keep track of all the memory with the types of all the objects living in memory at every point in the execution. – bolov Jan 25 '20 at 21:37
  • The fact that at runtime `reinterpret_cast` may trigger UB does not mean a subset of uses couldn't be allowed at compile-time. – Acorn Jan 25 '20 at 21:39
  • @Acorn `reinterpret_cast` is very nasty. With `reinterpret_cast(adr)` you don't know if it is valid unless you know what kind of object actually lives in memory at `adr` – bolov Jan 25 '20 at 21:41
  • @bolov That is precisely the case OP is asking about: `adr` is fixed. – Acorn Jan 25 '20 at 21:42
  • @Acorn it would complicate the language too much to allow some uses, and those uses would anyway be just a subset of all the valid cases. The committee decided against it and instead went for constexpr containers to solve almost all of the problems that needed reinterpret_cast – bolov Jan 25 '20 at 21:45
  • @bolov Cases like OP describes would be easy to allow. There have been more complex things that have been progressively enabled under `constexpr` contexts over the years. I would be way more concerned about users complaining about strict aliasing issues if we allowed it. – Acorn Jan 25 '20 at 21:53
  • @Acorn: "*Cases like OP describes would be easy to allow.*" If you are not going to be the one to implement them in the compiler, then I would not be so quick to say that any such thing would be "easy". Doing the equivalent without using `reinterpret_cast` is simple enough, and it is much more readable. You static_cast the value as needed between signed and unsigned. That way, it's 100% clear to everyone when you're transferring the value from one type to the other, and the implementation doesn't have to constantly deal with the question of which interpretation should the compiler use. – Nicol Bolas Jan 25 '20 at 23:00
  • @NicolBolas _"Doing the equivalent without using `reinterpret_cast` is simple enough"_ actually it's impossible without making a copy of the data, since `error: static_cast from 'const uint8_t *' to 'const int8_t *' is not allowed`. And `static_cast` produces an rvalue which can't be converted to a pointer value without storing first. – Conor Taylor Jan 25 '20 at 23:10
  • @ConorTaylor: "*without making a copy of the data*" Who decided that was a *necessary* limitation to whatever you're doing? It's a byte; I think you can afford to have a copy of a byte. My point is that you can accomplish the same goal more directly: when you want to access the value in a different way, just copy it. – Nicol Bolas Jan 25 '20 at 23:22
  • @NicolBolas the snippets I posted are simplifications of the problem, the actual objects are much larger than a byte. And it's intended for an embedded environment. So it's a necessary limitation for my case, but not for the constexpr reinterpret_cast issue in general, where you can of course static_cast and store in an lvalue. – Conor Taylor Jan 25 '20 at 23:24
  • @ConorTaylor: "*the snippets I posted are simplifications of the problem*" That changes the nature of the problem, because the only reason your code is well-defined is that you're dealing with a pointer to a fundamental type that the standard specifically says you can do this kind of signed-to-unsigned reinterpretation on. If you're using some user-defined type, then this would be UB and therefore ill-formed at compile time regardless of whether `reinterpret_cast` would be allowed. – Nicol Bolas Jan 25 '20 at 23:26
  • @bolov is it the same case as static_cast, const_cast, etc.? –  Jan 26 '20 at 04:25
  • @NicolBolas First of all, I have not talked about compiler implementations, but yes, it would be quite easy given the state of `constexpr` in the major compilers. Second, I don't know what you mean about "*the implementation doesn't have to constantly deal with the question of which interpretation should the compiler use*". The interpretation would be defined, no "questions" about it. Third, in embedded like AVR, you *really* care about every single instruction and byte since memory sizes are measured in ***bytes*** for some of them. Finally, I find your tone unnecessarily aggressive. – Acorn Jan 26 '20 at 14:19
  • @Acorn: "*First of all, I have not talked about compiler implementations*" You did by implication. What gets allowed by the standard for `constexpr` is largely based on what can be implemented by compilers. We didn't get transient memory allocation in C++20 largely because implementers weren't sure how to implement it. – Nicol Bolas Jan 26 '20 at 14:33
  • @Acorn: "*The interpretation would be defined, no "questions" about it.*" Constexpr code must verify that UB isn't happening. To make reinterpret_cast work, the compiler's faux runtime must check, at every dereference, if the object being pointed to is the right type. And it must do aliasing conversions like signed to unsigned for each read/write as needed (constexpr runtime is not like regular runtime assembly). As for embedded, code that gets executed at compile time doesn't contribute to any memory issues at runtime so it's irrelevant. – Nicol Bolas Jan 26 '20 at 14:35
  • "*What gets allowed by the standard for constexpr is largely based on what can be implemented by compilers*" If that were true, we would have already had in C++11 what we have in C++20. We don't because it is way better to work piece by piece. And the case being discussed here is way easier than other things that have been allowed. – Acorn Jan 26 '20 at 23:51
  • "*Constexpr code must verify that UB isn't happening.*" There is no UB for something we have not allowed, though. "*the compiler's faux runtime must check, at every dereference, [...]*" I think you are discussing something we are not. From my point of view, no one has proposed to allow every single `reinterpret_cast`, so the rest of your argument does not apply. – Acorn Jan 27 '20 at 00:02
  • "*As for embedded, code that gets executed at compile time doesn't contribute to any memory issues at runtime so it's irrelevant.*" We are against having to make an unnecessary copy at runtime, which is what you claimed that we could afford. Yes, in many cases it does not matter, but in many others it does, especially in embedded and when using some memory maps. – Acorn Jan 27 '20 at 00:04
  • Unlucky wording: *'There are very few valid uses [...], most of them [the valid uses???] result in UB.'* My proposition (try, at least...): *'Only very few uses of [...] are valid, most of them [...].'* – Aconcagua Jul 19 '22 at 18:28
2

So how come it can't be used inside a constexpr?

Simply because the standard does not allow it. constexpr has been expanded with every standard since C++11, so it is natural to think that a subset of reinterpret_cast uses could work.

The question is whether allowing it would be actually useful or actively harmful. There are very few good uses of reinterpret_cast, especially if you program and compile your code assuming the strict aliasing rule holds: it would be easy to create pointers that break it.
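To illustrate the aliasing concern (a sketch with made-up names, leaving out PROGMEM): reading a uint8_t through a const int8_t * is fine at runtime, because on mainstream implementations these are unsigned char and signed char, and the aliasing rule permits access through the signed or unsigned counterpart of the object's type. The same cast to an unrelated type compiles just as readily but would be UB to dereference.

```cpp
#include <cassert>
#include <cstdint>

static constexpr std::uint8_t a = 0;

// Valid at runtime: int8_t is the signed counterpart of uint8_t.
// The pointer is const, not constexpr, so the cast is accepted.
static const std::int8_t *const b = reinterpret_cast<const std::int8_t *>(&a);

// Compiles as well, but dereferencing it would violate strict aliasing
// (and read out of bounds), so it is left commented out.
// static const float *const f = reinterpret_cast<const float *>(&a);

// Reads a through the int8_t alias.
inline std::int8_t read_b() { return *b; }

// Small runtime check of the valid alias.
inline void check_alias() { assert(read_b() == 0); }
```

Nothing in the syntax distinguishes the valid cast from the invalid one, which is why a blanket permission would be risky.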

On the other hand, it is clear that, for embedded users and specialized compilers/flags/environments, it could be useful to some degree.

Acorn