From Stroustrup's book Programming: Principles and Practice Using C++: in §17.3, on memory, addresses, and pointers, it is supposed to be allowed to assign a char* to int*:

char ch1 = 'a';
char ch2 = 'b';
char ch3 = 'c';
char ch4 = 'd';
int* pi = &ch3;   // point to ch3, a char-size piece of memory
*pi = 12345;      // write to an int-size piece of memory
*pi = 67890;

Graphically, we have something like this:

[image: char address to int pointer]

Quoting from the source:

Had the compiler allowed the code, we would have been writing 12345 to the memory starting at &ch3. That would definitely have changed the value of some nearby memory, such as ch2 or ch4, or we would have overwritten part of pi itself.

In that case, the next assignment *pi = 67890 would place 67890 in some completely different part of memory.


I don't understand, why the next assignment would place it: in some completely different part of memory? The address stored in int *pi is still &ch3, so that assignment would overwrite the content at that address, i.e. 12345. Why isn't that so?

Please, can you help me? Many thanks!

François Andrieux
JB-Franco
  • `&ch3` results in a `char*`, not an `int*`, so that assignment is illegal without, for example, a `reinterpret_cast`, which I wouldn't do. But I think their point is that if `sizeof(int) > sizeof(char)`, then writing through `pi` will write past the allocated memory of `ch3`, all the way up to the first few bytes of `pi` itself (see the diagram). Therefore the next dereference will use some other address, since the `pi` pointer itself was modified accidentally. – Cory Kramer Aug 13 '20 at 17:26
  • This is simply an example of what happens when [demons fly out of your nose](https://en.wiktionary.org/wiki/nasal_demon). The first assignment clobbers (partially) the value of `pi` itself. – Sam Varshavchik Aug 13 '20 at 17:28
  • `reinterpret_cast` is a pathway to many abilities some consider to be... unnatural – user4581301 Aug 13 '20 at 17:28
  • @FrançoisAndrieux the book uses C++11 and C++14. – JB-Franco Aug 13 '20 at 19:06
  • @JB-Franco I've removed the C tag then. – François Andrieux Aug 13 '20 at 19:52

4 Answers

char ch3 = 'c';
int* pi = &ch3;

it is supposed to be allowed to assign a char* to int*:

Not quite - there is an alignment concern. The conversion is undefined behavior (UB) when:

If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. C17dr § 6.3.2.3 7

Example: some processors require an int * to hold an even address. If &ch3 were odd, storing the address might fail, and dereferencing it would certainly fail with a bus error.


The next statement is certainly UB, as the destination extends outside the memory of ch3.
ch1, ch2, ch4 might be nearby and make the outcome look reasonable, but the result is still UB.

// undefined behavior
*pi = 12345;      // write to an int-size piece of memory

When code attempts to write outside its bounds - it is UB, anything may happen, including writing into neighboring data.

The address stored in int *pi is still &ch3

Maybe, maybe not. UB has occurred.

why the next assignment would place it: in some completely different part of memory?

The abusive code suggests that pi itself is overwritten by *pi = 12345;. This might happen, it might not. It is UB. A subsequent use of *pi is simply more UB.


Recall with UB you might get what you hope for, you might not - it is not defined by C.

chux - Reinstate Monica

You seem to have skipped part of the explanation you quoted:

or we would have overwritten part of pi itself

Think of it this way: ints are larger than chars, so if an int* points to a location that stores a char, assigning an int through it overflows that location. Assuming a 4-byte int, only a single byte is allocated, but 4 bytes' worth of data is written. You cannot fit 4 bytes of data into one, so the other 3 bytes have to go somewhere.

Assume then that the overflowing bytes partially change the value stored in pi. Now the next assignment will go to a random memory location.

Makogan
  • "Assume then that the overflowing bytes partially change the value stored in `pi`." Therefore the address stored within will be corrupted. "Now the next assignment will go to a random memory location." If `*pi` can't resolve the above address because it is corrupted, will the content be inserted at a new address provided by the compiler? Is that correct? – JB-Franco Aug 13 '20 at 17:46
  • No, it is not. Addresses are just numbers, so as an analogy let's say the original address in pi was 125 Smith Street. Let's say the overflow corrupts pi, changing it to, say, 349 Smith Street. If you go to the new address, you may end up anywhere, including outside of the city. In actual terms, assume `pi = 8864`; the overflow corrupts that value, changing it to some random value, say `pi = 4264`. Then the next time you write through pi, you are writing to the modified address, which is a random value. – Makogan Aug 13 '20 at 17:51
  • @JB-Franco "Now the next assignment will go to a random memory location." is not specified by C. The compiler need not re-read `pi` and may instead use the same pointer value. There is much undefined behavior here. – chux - Reinstate Monica Aug 13 '20 at 20:35
  • Take into account that the reply must be framed by what the text quoted in OP's question says. The question is how or why it will go to a random location, so we must assume this is a case where the compiler does re-read pi. – Makogan Aug 13 '20 at 20:41

Let's assume the memory address layout is:

0 1 2 3 4 5 6 7

Addresses 0, 1, 2 and 3 hold the four chars. Addresses 4, 5, 6 and 7 hold the int* pi. The values in each byte, in hex, may be:

61 62 63 64 02 00 00 00

Note how the first four are ASCII values and the last four are the address of ch3 (2, stored in little-endian order). Writing *pi = 12345; changes the values like so:

61 62 39 30 00 00 00 00

Here the bytes 39 30 00 00 are 12345 (0x00003039) in little-endian order.

The next write *pi = 67890; would start from memory address 00 00 00 00, not 02 00 00 00 as one might expect.


Firstly, you have to understand that everything is a number, i.e. a char, an int and an int* all contain numbers. Memory addresses are also numbers. Let's assume the current example compiles and memory is laid out as follows:

--------------------------------
Address | Variable | Value
--------------------------------
0x01    | ch1      | a
0x02    | ch2      | b
0x03    | ch3      | c
0x04    | ch4      | d
0x05    | pi       | &ch3 = 0x03

Now let's dereference pi and reassign a new value to ch3:

*pi = 12345;

Let's assume int is 4 bytes. Since pi is an int pointer, it will write a 4-byte value to the location pointed to by pi. Now, a char can only contain one-byte values, so what would happen if we try to write 4 bytes to that location? Strictly speaking, this is undefined behaviour, but I will try to explain what the author means.

Since a char cannot contain values larger than 1 byte, *pi = 12345 will overflow ch3. When this overflow happens, the remaining 3 of the 4 bytes may get written to nearby memory locations. What memory locations do we have nearby? ch4 and pi itself! ch4 can only contain 1 byte as well, which leaves us with 2 bytes, and the next location is pi itself. Meaning pi will overwrite its own value!

--------------------------------
Address | Variable | Value
--------------------------------
0x01    | ch1      | a
0x02    | ch2      | b
0x03    | ch3      | 12    // 12 ended up here
0x04    | ch4      | 34    // 34 ended up here
0x05    | pi       | 5     // was &ch3 = 0x03; 5 gets written here

As you can see, pi now points to some other memory address, which is definitely not ch3.

Waqar
  • Good explanation, but one minor point -- `*pi = 12345` is assigning a decimal value, and you are breaking it into `12`, `34`, `5` as if it were hex – Human-Compiler Aug 13 '20 at 19:05
  • @Human-Compiler I know, but just making a point and being lazy lol. – Waqar Aug 13 '20 at 19:09