2

I'm seeing some strange behaviour from my compiler. If a particular string literal has already been defined and I then define another variable with the same string value, the new variable points to the original instance, e.g.

#include <stdio.h>
int main(void) {
    const char *a = "dog";
    const char *b = "dog";
    printf("a = %p\n", (void *)a);  /* %p expects a void * argument */
    printf("b = %p\n", (void *)b);
}

Example output is

a = 00404060
b = 00404060

It's not wrong, I guess, but I wonder whether it is compiler-specific or whether there are any rules about this. The question arose because someone working for me wrote a function that was supposed to compare and count string matches. In fact it compared and counted pointer values, yet still gave the correct answer.

I'm using gcc (or g++) 4.6.3

3 Answers

3

It is compiler-specific; the C standard leaves it unspecified:

C11 §6.4.5 String literals

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

Yu Hao
1

It's normal behavior. The first and second pointers point to the same string. String literals cannot be modified, so there is no reason for the compiler to store the same data in two different places; that's why a and b point to the same place :)

I don't think the standard requires this, although modern compilers generally behave this way.

DawidPi
1

Compilers are very much allowed to do that; it's even allowed in cases like:

const char *good = "initialised";
const char *bad = "uninitialised";

where good may simply point to the first i in bad. This basically allows for space optimisation.

The ISO C standard states, in C11 6.4.5 String literals /7, when discussing the arrays that are created for string literals:

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

Note the two slightly similar terms there, unspecified and undefined. The first simply means that the standard imposes no requirement on the point in question; an implementation is free to go one way or the other.

The latter is more serious: undefined behaviour is generally to be avoided if you want portable code.

paxdiablo