Context
Character string literals are and have always been non-const in C. The current standard draft n1570 says in 6.4.5/6:
The multibyte character sequence [resulting from concatenation of adjacent string literals, -ps] is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char [and not const char, -ps].
The reason is, of course, that originally they usually were writable. The program itself was writable; there was even self-modifying code. That matters here because the compiler produces the string literals and stores them "together with the program".
It is modern memory management -- i.e., a matter of advanced machine architecture -- which makes it at all possible to generate a hardware exception when the program's own memory is written to. Using that possibility is a matter of security. Not all architectures (can) do that, even today, and compilers may have options to control where strings go (e.g. -fwritable-strings with old gccs).
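The typing rule quoted from 6.4.5/6 can be checked directly. The following is only a sketch: the function names are made up for illustration, and it assumes a C11 compiler run without options that change the type of literals.

```c
#include <stddef.h>

/* Classifies the element pointer of a string literal with C11 _Generic.
   &"abc"[0] is used so the controlling expression is already a pointer. */
int literal_element_kind(void) {
    return _Generic(&"abc"[0],
                    char *: 1,        /* what 6.4.5/6 prescribes */
                    const char *: 2,  /* what a const-qualified literal would give */
                    default: 0);
}

/* The array is "just sufficient": three characters plus the terminating null. */
size_t literal_size(void) {
    return sizeof "abc";
}
```

Under those assumptions literal_element_kind() selects the char * branch, and literal_size() is 4 rather than the size of a pointer, because the literal is an array.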
This Code
Grammatically the code is compliant; semantically it is UB per 6.4.5/7 in n1570: "If the program attempts to modify such an array, the behavior is undefined."
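To make the distinction concrete, here is a minimal sketch. It assumes the code in question amounts to strcpy("destination", "Source"); the function name is hypothetical. The undefined variant is left in a comment so that the example itself stays well-defined.

```c
#include <string.h>

/* Returns 1 if the writable copy ends up holding "Source". */
int overwrite_copy_of_literal(void) {
    /* char *dst = "destination";
       strcpy(dst, "Source");       <- compiles, but modifies the literal's array: UB */

    char dst[] = "destination";  /* a separate 12-byte array initialized from the literal */
    strcpy(dst, "Source");       /* well-defined: 7 bytes including the null fit into 12 */
    return strcmp(dst, "Source") == 0;
}
```

The only difference is where the bytes live: the array dst belongs to the program and may be modified, while the array behind the literal may not.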
Compilers could warn when addresses of string literals are assigned to non-const variables (or used to initialize non-const parameters in function calls), but the common ones I tried don't warn, which puzzles me a bit -- a lot of implemented warnings seem less important and noisier.
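Until such a warning is on by default, the programmer can opt in by holding literals only through const-qualified pointers. The sketch below uses hypothetical names. GCC and Clang also document a -Wwrite-strings option that gives literals a const-qualified type in C; whether it behaves the same in the compiler versions tested for this answer is an assumption.

```c
#include <string.h>

/* A const-qualified view of the literal; nothing can write through it
   without an explicit cast. */
static const char *const label = "destination";

size_t label_length(void) {
    /* strcpy(label, "Source");   <- now draws a diagnostic, because passing
                                     label would discard the const qualifier */
    return strlen(label);  /* reading the literal is always fine */
}
```

This does not change the type of the literal itself. It only makes sure the pointer the rest of the program sees carries the qualifier.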
strcpy()
As to the specifics of strcpy(): Some comments said that "the compiler doesn't know what strcpy() does". That is more often than not misleading:
- The standard library functions are well-defined by the standard. This knowledge could be used in the compiler. For example, tools like lint usually know about such semantics.
- The compiler and the default standard library are often developed in close cooperation and come as a bundle. Because compiler and library interact a great deal (compiling the library itself, compiler bootstrapping, etc.), the two projects usually exchange information on a regular basis.
- Compilers are free to replace library functions with intrinsics, which would give them very intimate knowledge.
gcc
Indeed, gcc happens to replace strcpy and many other functions with built-ins, so it does have first-hand information that the first address will be written to. It just does not use it.
Another gcc intrinsic is printf(), and here the compiler uses its knowledge of printf's semantics to warn about format errors. That clearly demonstrates that a warning would be possible for strcpy() as well.
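As an illustration of the format checking, consider the sketch below. It uses snprintf() instead of printf() so that the result can be inspected; both belong to the same family of checked functions. The function name is hypothetical, and the mismatched call is left in a comment.

```c
#include <stdio.h>

/* Formats the number 42 into out and returns the number of characters written. */
int format_answer(char *out, size_t out_size) {
    /* snprintf(out, out_size, "%s", 42);   <- gcc -Wall reports the mismatch
                                               between %s and the int argument */
    return snprintf(out, out_size, "%d", 42);
}
```

The compiler can only flag the commented-out line because it knows what the format string means, which is exactly the kind of library knowledge that would be needed to flag a literal as the destination of strcpy().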
As an aside, gcc does warn about "abc"[1] = 0;. That is interesting because I had thought that the strcpy() intrinsic would be inlined (it must be short) so that with -O3 and possibly -flto at some point the equivalent of "destination"[i] = "Source"[i]; would actually be visible to the compiler and trigger that same warning.
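For reference, this is roughly what such an inlined copy amounts to. It is a hand-written sketch with a hypothetical name, not gcc's actual built-in expansion, and it is applied to a writable array here to keep it well-defined.

```c
#include <stddef.h>

/* Copies src, including its terminating null, into dst one element at a time. */
char *copy_bytes(char *dst, const char *src) {
    size_t i = 0;
    do {
        dst[i] = src[i];          /* the element-wise assignment from the text */
    } while (src[i++] != '\0');   /* stop once the null has been copied */
    return dst;
}
```

If dst were the address of a literal, every iteration would contain the same kind of store as "abc"[1] = 0;, just reached through a pointer instead of written out directly.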
Other Compilers
I tested VC 2013, gcc 5.3.0, gcc 4.7.2 and clang 3.7.1. None of them emits a warning for passing a string literal to strcpy(), but cremno pointed out that VC offers the /analyze option which catches the error.