#include <stdio.h>

int main(){
 int a = -1;
 int b = -1;
 long c = *(long*)&b; // expect `c` to also be -1.
                      // assuming `int` is 4 bytes and `long` is 8 bytes.

 return 0;
}

Does the above code produce undefined behaviour according to the strict aliasing rule? (See: What is the strict aliasing rule?)

The above code runs fine even with compiler optimization on.

I have even used gcc's `-fsanitize=undefined` option but still got no errors at runtime.

Why will the above code cause UB, and how else can I accomplish such a thing? Is using memcpy and copying the memory content the only option?

SuperStormer
Dan
  • Why do you have to cast using pointers if `long c = b;` works perfectly fine? – SuperStormer May 09 '21 at 03:18
  • Because I'm trying to figure out why using pointers to the same memory location and dereferencing them causes UB. – Dan May 09 '21 at 03:19
  • @Dan The compiler may assume `a` is not used and optimize it out. – chux - Reinstate Monica May 09 '21 at 03:24
  • *"Above code runs fine even with compiler optimization on."* That proves nothing. UB cannot be experimentally determined. Code can appear to work correctly even when it has UB. That is the truly nasty part of UB. Often code seems to be working, when in fact it has bugs. Those bugs often manifest at the worst possible time when you make a seemingly innocuous change to unrelated code. – user3386109 May 09 '21 at 03:31
  • [Suggested reading](https://stackoverflow.com/questions/2397984). Summary: the only thing you need to know about UB is how to avoid it. – user3386109 May 09 '21 at 03:38

1 Answer

This is in fact undefined behavior, as you are attempting to read an `int` object through an lvalue of type `long`. With a 4-byte `int` and an 8-byte `long`, the read also extends past the end of `b`. Note too that there is no guarantee that `a` immediately follows `b` in memory.
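As for the `memcpy` route raised in the question: it is well defined, provided the source really holds as many bytes as are copied. A minimal sketch, assuming a 4-byte `int` and an 8-byte `long` as in the question; the helper name `copy_ints_to_long` is made up for illustration:

```c
#include <string.h>

/* Hypothetical helper: copy the bytes of two ints into a long.
   Assumes sizeof(long) <= 2 * sizeof(int), e.g. 4-byte int, 8-byte long. */
long copy_ints_to_long(int first, int second) {
    int src[2] = { first, second };
    long dst = 0;

    /* Never copy more bytes than either object actually has. */
    size_t n = sizeof dst < sizeof src ? sizeof dst : sizeof src;
    memcpy(&dst, src, n);
    return dst;
}
```

With those sizes, `copy_ints_to_long(-1, -1)` gives `-1`, since every byte of the result is set.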

The proper way to have multiple variables share memory is with a union:

union u {
    int x[2];
    long y;
};

In this case you can write to one member of the union and read from another to reinterpret the representation of one to the other.
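For instance, a small sketch of that write-one-member, read-the-other pattern, assuming a 4-byte `int` and an 8-byte `long` (the function name `ints_to_long` is hypothetical and not part of the question):

```c
/* Same layout as the union shown above. */
union u {
    int x[2];
    long y;
};

/* Hypothetical helper: store two ints through `x`, then read the
   same bytes back through `y`. The access is visibly through the
   union, which is what C permits for type punning. */
long ints_to_long(int first, int second) {
    union u v;
    v.y = 0;          /* start from a fully initialized object */
    v.x[0] = first;
    v.x[1] = second;
    return v.y;
}
```

Under those size assumptions, `ints_to_long(-1, -1)` returns `-1`, which is the result the question was after.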

dbush
  • but all I'm saying is `hey gcc, interpret this chunk of memory as a long`, why is this undefined? – Dan May 09 '21 at 03:25
  • @Dan Because the C standard says so. This also means that compilers can assume that programs will not have UB and will perform optimizations based on that fact. – dbush May 09 '21 at 03:26
  • They are saying in this comment section that using unions is also wrong: https://stackoverflow.com/a/98702/14940626 – Dan May 09 '21 at 03:30
  • @dan: at least in C, given dbush's union declaration, reading or writing `u.x[0]` or `u.y` is fine (unless you're on some mythical machine which has integer trap values). You can use a pointer *to the union*, and it's just the same. As long as the access is visibly through the union, it's not a strict aliasing violation. At least, that's my current understanding. On the other hand, you need to have a really good reason for type punning. "I just wanted to see if it works" is not a good reason because experimental evidence cannot prove that there is no UB. – rici May 09 '21 at 03:53
  • "*In this case you can write to one member of the union and read from another to reinterpret the representation of one to the other*" - **technically**, that is OK only in C but is [illegal in C++](https://stackoverflow.com/questions/11373203/), as you are only allowed to read from the *active* union member that was *last* written to. But **practically**, this will *usually* work fine, at least for trivial types. – Remy Lebeau May 09 '21 at 05:21