3

I was under the impression that comparison operators are not defined for C-style strings, which is why we use things like strcmp(). Therefore the following code would be illegal in C and C++:

if("foo" == "foo"){
    printf("The C-style comparison worked.\n");
}

if("foo" == "bob"){
   printf("The C-style comparison produced the incorrect answer.\n");
} else {
   printf("The C-style comparison worked, strings were not equal.\n");
}

But I tested it in both Codeblocks using GCC and in VS 2015, compiling as C and also as C++. Both allowed the code and produced the correct output.

Is it legal to compare C-style strings? Or is it a non-standard compiler extension that allows this code to work?

If this is legal, then why do people use strcmp() in C?

  • 2
    The code is legal. It just does pointer comparison, which may not be what you intend. `strcmp()` does what most people expect - it compares the contents of the strings, rather than only the address of their first element. For your code, the result of `"foo" == "foo"` is implementation defined. – Peter Jan 01 '16 at 23:16

3 Answers

12

The compiler is free to use string interning, i.e. to save memory by not duplicating identical data. In your case, the two "foo" literals that compare equal must be stored at the same memory location.

However, you should not take this as the rule. strcmp will work under all circumstances, whereas it is unspecified whether your observation will hold with another compiler, compiler version, set of compilation flags, etc.

rems4e
  • Okay, that makes perfect sense. But then say I define two 'vector' and fill them with identical strings. If I use the generic algorithm 'equal' to compare the two containers I still get the correct answer. equal compares the containers; the char * pointers can't possibly point to the same locations in memory if they are from two different vectors, can they? – Walid Beydoun Jan 01 '16 at 23:27
  • They totally can point to the same memory. They are only pointers after all. If you copy the pointer itself from one vector to another, then it will reference the same memory, right? The same goes for assigning string literals, as they decay to a pointer value upon assignment. Is it clearer now? – rems4e Jan 01 '16 at 23:31
  • Yeah thanks for the help. One day I will wrap my head around pointers and fully understand them. Cheers mate. – Walid Beydoun Jan 01 '16 at 23:34
11

The code is legal in C. It just may not produce the result you expected.

The type of a string literal is char[N] in C and const char[N] in C++, where N is the number of characters in the string literal, including the terminating null character. "foo" has type char[4] in C and const char[4] in C++. Basically, it's an array. An array is converted into a pointer to its first element when used in most expressions. So in the comparison if("foo" == "foo"), the string literals are converted into pointers; hence the "address comparison".

In the comparison,

if("foo" == "foo"){

the addresses of the string literals are compared, which may or may not be equal.

It is equivalent to:

const char *p = "foo";
const char *q = "foo";

if ( p == q) {
 ...
 }

The C standard doesn't guarantee that two string literals with the same content (the "foo"s here) are placed at the same location. In practice, though, most compilers place them at the same address, so the comparison seems to work. But you can't rely on this behaviour.

6.4.5, String literals (C11, draft)

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

Similarly, this comparison

if("foo" == "bob"){
 ...
}

is equivalent to:

const char *x = "foo";
const char *y = "bob";

if (x == y) {
  ...
}

In this case, the string literals are at different locations, so the pointers compare unequal. So in both cases, it looks as if the == operator actually works for comparing C-strings.

If you instead compare arrays, it will not work:

char s1[] = "foo";
char s2[] = "foo";

if (s1 == s2) {
  /* always false */
}

The difference is that when an array is initialized with a string literal, the literal is copied into the array. The arrays s1 and s2 have distinct addresses and will never compare equal. But in the case of string literals, both p and q point to the same address (assuming the compiler places them so, which is not guaranteed, as noted above).

P.P
  • Your `p` and `q` example MAY work or not work. `char pp[] = "foo"; char* p = pp;`, etc would guarantee that it will compare as different. – Mats Petersson Jan 01 '16 at 23:28
  • @MatsPetersson I am just about to add an example with *array.* But `p` and `q` is precisely equivalent to directly comparing string literals (both ways of comparison may or may not work). – P.P Jan 01 '16 at 23:30
  • @Mats Petersson - most importantly, it may or may not work in exactly the same way as in the original example. The p & q code is functionally equivalent. – simpleuser Jan 01 '16 at 23:55
  • I think [6.3.2 Other operands](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) is apropos here to explain why the string literals effectively "decay" to a pointer: Except when it is the operand of the `sizeof` operator, the `_Alignof` operator, or the unary `&` operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of *type* ’’ is converted to an expression with type ‘‘pointer to *type*’’ that points to the initial element of the array object and is not an lvalue. – Andrew Henle Jan 02 '16 at 00:25
  • @AndrewHenle I added a bit explanation for it. Hopefully, it's better now. – P.P Jan 02 '16 at 00:33
  • Thanks, these answers really helped me understand pointers and strings a bit better. – Walid Beydoun Jan 03 '16 at 13:57
0

It is copying/comparing the addresses of the strings, not the content of the strings.

Comparing the addresses is a valid operation.

user3629249