Why does my code for overwriting const int variable work? Is it safe?

```cpp
#include <iostream>
#include <cstring>

using namespace std;

int z = 5;
const int x = *(&z);

int main()
{
    cout << "A:" << x << ", " << &x << endl;
    int y = 7;
    cout << "B:" << y << ", " << &y << endl;
    memcpy((int*)&x, &y, sizeof(int));
    cout << "C:" << x << ", " << &x << endl;
}
```

Output would be:

A:5, 0x600f94
B:7, 0x7a7efb68019c
C:7, 0x600f94

I am not sure if this has been asked before since I don't know what to search for in this situation.

Nivs
  • No it's not safe, the behavior is undefined. Which means it may do anything. “Whatever you expect” being one of the possible outcomes of “anything”, it may happen to do what you expect. This time. Maybe not next time. – spectras Sep 07 '17 at 01:28
  • It works because it is undefined behavior. – Sam Varshavchik Sep 07 '17 at 01:28
  • `(int*)` casts away the const qualifier. The actual behavior here is undefined. – Chad Sep 07 '17 at 01:30
  • If I initialize const int x = 5; the program crashes at memcpy (expected). Why doesn't this implementation (use of *(&)) crash? – Nivs Sep 07 '17 at 01:31
  • @Nivs Undefined behavior is undefined. – Baum mit Augen Sep 07 '17 at 01:32
  • Undefined behaviour means the program could look like it works right up until your [bosses demonstrate the program in front of thousands of people at COMDEX](https://www.youtube.com/watch?v=73wMnU7xbwE). – user4581301 Sep 07 '17 at 01:36
  • @nivs> And by the way, if you are curious about what is happening under the hood, it turns out that gcc chooses to put your `x` variable into the `.bss` section when you initialize it this way. That means writing to it will not be detected as an invalid operation by the system. But this cannot be relied upon: it might do something different with another version, or even the same version with different compilation flags. A new version could even introduce some logic to detect this specific mistake and refuse to compile altogether. – spectras Sep 07 '17 at 02:01
  • I would expect `SEGFAULT` with this code... – Michał Fita Sep 07 '17 at 08:36
  • @MichałF> your expectation is incorrect. You have to account for the fact that the `z` symbol has default visibility, so it may be interposed by a shared library at dynamic linking. This means that the value to copy into `x` may not be `5`; it could be that of the interposed symbol. Therefore `x` cannot live in the `.rodata` section. This is why it is simply allocated a slot in the `.bss` section, with its value initialized at program startup. And since it lives in the `.bss` section, which is R/W, it won't segfault when executing the UB-inducing line. – spectras Sep 07 '17 at 14:25
  • @Nivs: You should read about the difference between static and dynamic initialization. – Ben Voigt Sep 08 '17 at 04:40
  • @spectras What you're saying makes sense. Do you know if the decision about "exposition" is made by the compiler or the linker? – Michał Fita Sep 11 '17 at 12:17
  • @MichałF> by the developer :). It can be changed using `__attribute__((visibility("hidden")))` on gcc for instance. Or `static` to make it local to the translation unit. The information is encoded in the object file by the compiler and used both at the link-editing step and at dynamic linking. I don't promise that making `x` and `z` hidden or static will be sufficient to make the program segfault. It is necessary though. – spectras Sep 11 '17 at 12:41
  • @MichałF> it may be of interest to you that when compiling with `clang`, which has a more liberal approach to ELF interposition semantics (see [this question](https://stackoverflow.com/q/45971091/3212865) for more in-depth examples), the program does segfault. Analyzing the generated program shows that `clang` violates interposition semantics in that case and short-circuits the initialization of `x`. – spectras Sep 11 '17 at 12:54

1 Answer

Answering your questions:

  1. It's not safe and should never be done; it's an example of how not to use C or C++.

  2. It relies on undefined behaviour, which means the language standard imposes no requirements on how such an attempt is handled.

  3. Why does it work? This answer applies to GCC only, since other compilers may optimise or treat const differently. Technically, const int x declares a variable with a qualifier. That means that, unless it is optimised away, it has its own place in memory (in some circumstances in a read-only section). When you cast away the qualifier and pass the variable's address to memcpy() (a plain library call that knows nothing about memory protection), it attempts to write the new data to that address. In your case the variable evidently ended up in writable memory, so the write went through. If the compiler had put the variable into a read-only section (I have run into that epic failure in the past), execution on any Unix would end in a segmentation fault, caused by a write instruction violating the memory protection of the read-only segment your program uses to hold constant data.

In C++, true compile-time constants are declared with constexpr, although that has other implications.

Michał Fita