59

According to the last meeting of the ISO C++ Committee, std::bit_cast will be introduced in the C++20 standard.

I know that reinterpret_cast is not suitable for this job because of the type aliasing rules. My question is: why did they choose not to extend reinterpret_cast to treat the object as its bit-sequence representation, and instead provide this functionality as a new construct?

Tejendra
bogdan tudose

2 Answers

50

Well, there is one obvious reason: because it wouldn't do everything that bit_cast does. Even in the C++20 world where we can allocate memory at compile time, reinterpret_cast is forbidden in constexpr functions. One of the explicit goals of bit_cast is to be able to do these sorts of things at compile-time:

Furthermore, it is currently impossible to implement a constexpr bit-cast function, as memcpy itself isn’t constexpr. Marking the proposed function as constexpr doesn’t require or prevent memcpy from becoming constexpr, but requires compiler support. This leaves implementations free to use their own internal solution (e.g. LLVM has a bitcast opcode).

Now, you could say that you could just extend this specific usage of reinterpret_cast to constexpr contexts. But that makes the rules complicated. Instead of simply knowing that reinterpret_cast can't be used in constexpr code, period, you would have to remember which specific forms of reinterpret_cast can't be used.

Also, there are practical concerns. Even if you wanted to go the reinterpret_cast route, std::bit_cast is a library function. And it's always easier to get a library feature through the committee than a language feature, even if it would receive some compiler support.

Then there's the more subjective stuff. reinterpret_cast is generally considered an inherently dangerous operation, indicative of "cheating" the type system in some way. By contrast, bit_cast is not. It is generating a new object as if by copying its value representation from an existing one. It's a low-level tool, but it's not a tool that messes with the type system. So it would be strange to spell a "safe" operation the same way you spell a "dangerous" one.
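To make "as if by copying its value representation" concrete, this is roughly what people wrote by hand before C++20. It is only a sketch: bit_cast_sketch is a hypothetical name, it is not how any standard library implements std::bit_cast, and unlike the real thing it cannot be constexpr and requires To to be default-constructible:

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical pre-C++20 stand-in for std::bit_cast.
template <class To, class From>
To bit_cast_sketch(const From& src) noexcept {
    static_assert(sizeof(To) == sizeof(From),
                  "source and destination must have the same size");
    static_assert(std::is_trivially_copyable<From>::value,
                  "From must be trivially copyable");
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    static_assert(std::is_trivially_default_constructible<To>::value,
                  "this sketch needs a default-constructible To");

    To dst;                               // a brand-new object of type To
    std::memcpy(&dst, &src, sizeof(To));  // copy the bytes into it
    return dst;                           // src is never accessed as a To
}
```

Since the result is a separate object, nothing is ever accessed through a pointer or reference of the wrong type.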

Indeed, if you did spell them the same way, it would raise the question of why this is reasonably well-defined:

float f = 20.4f;
int i = reinterpret_cast<int>(f);

But this is somehow bad:

float f = 20.4f;
int &i = reinterpret_cast<int &>(f);

And sure, a language lawyer or someone familiar with the strict aliasing rule would understand why the latter is bad. But for the lay person, if it is fine to use reinterpret_cast to do a bit-conversion, it is unclear why it is wrong to use reinterpret_cast to convert pointers/references and interpret an existing object as a converted type.

Different tools should be spelled differently.

Nicol Bolas
  • 1
    The second of your latter forms should be defined in cases where all use of `i` occurs before the next time that storage is accessed or addressed in conflicting fashion (writes conflict with reads or writes; reads do not conflict with reads) via means not derived from `i`, or execution reaches the start of a function or bona fide loop body wherein that will occur. The "Strict Aliasing" rule was intended to avoid requiring compilers to be pessimistic about things they can't see--not to encourage them to ignore things they can. – supercat May 23 '19 at 14:42
  • Regarding *"`reinterpret_cast` is generally considered an inherently dangerous operation"*... I think this doesn't account for what's probably the most common use case I see for `reinterpret_cast`, which is something like `reinterpret_cast(ptr)`. That's no more dangerous or "cheating" of the type system than `static_cast(static_cast(ptr))`. – user541686 Nov 21 '21 at 09:13
  • @user541686: That isn't allowed in constexpr code either. You can `constexpr` `bit_cast` to a byte array though. And no, that's not more dangerous; it's just longer. – Nicol Bolas Nov 21 '21 at 14:17
-8

There is a fundamental mismatch between the high-level-language nature of the modern, strict interpretation of the C and C++ standards by compilers and the notion that you can use reinterpret_cast to reinterpret a bunch of bytes as another object. Note that the so-called "strict aliasing" rule in most cases cannot even be used to disqualify an attempt at reinterpreting bytes, as the code wouldn't have defined behavior in the first place: reinterpret_cast<float*>(&Int) isn't even a pointer to a float object, it's a pointer to an integer that happens to have the wrong type. You can't dereference it as there is no float object created at that place; if there was one, its lifetime wouldn't have started; and if its lifetime had started, it would be uninitialized.

Bytes that happen to represent a valid float bit pattern just can't be interpreted as such if you don't have a proper float object here.

A valid non-null pointer isn't just a typed value holding the start address of an area that happens to be properly aligned; a valid non-null pointer points to a particular object (or one past the end of an array or of a trivial "array" of one object).

I don't even see the scalar reinterpretation casts sanctioned by "strict aliasing" (the signed/unsigned mix) as possibly valid, as no signed (resp. unsigned) integer object exists at that address (and the compiler obviously cannot use the original unsigned (resp. signed) value either).

Either way, C++ has a broken design because it's a mix of different languages (some very low level, some very high level).

curiousguy
  • 6
    "*the so called "strict aliasing" rule in most cases cannot even be used to disqualify any attempt at reinterpreting bytes as the code wouldn't have defined behavior in the first place*" ... what? It's the strict-aliasing rule that *causes* that code to be UB. Without that rule, the standard would be incomplete, because there wouldn't be a statement about the behavior of what that would be. There's a difference between "this is UB" and "the standard is underspecified in this area." – Nicol Bolas May 23 '19 at 15:10
  • @NicolBolas Who said the std needed to be complete? How is it complete now? What does it mean to dereference a reinterpret_casted pointer in any case? If it's OK to cast to `char*` why not to `short*` or `float*`? If it's OK to use traditionally `memcpy` and recently `bit_cast` why isn't it OK to dereference a casted pointer? – curiousguy May 23 '19 at 21:03
  • @NicolBolas In C, where is the behavior of `memcpy` by a `FILE` over another one defined? Are all calls to `memcpy` either fully defined or explicitly made UB? – curiousguy May 24 '19 at 19:14
  • "`reinterpret_cast<float*>(&Int)` isn't even a pointer to a float object, it's a pointer to an integer that happens to have the wrong type. You can't dereference it as there is no float object created at that place". Despite all the downvotes, this was the answer that I actually understand, thank you, but I disagree with the rest of the statements you made – 0xB00B Dec 27 '21 at 02:28
  • U do be getting destroyed on this vote. Sorry m8 – Hunter Kohler Apr 05 '22 at 11:52