
Let's say I have this C code

f((char []){ "hello" });
f((const char []){ "hello" });
f("hello");

In all three cases, hello is copied into the function. A pointer to the first character of the char array is initialized as the function parameter.

I know that in C, the string literal corresponds to char [] while in C++ the string literal corresponds to const char [], but will it create the same code as with char [] or const char []?

In a C program, could you exchange all occurrences of "string" with (char []){ "string" } and get the same result on the assembly level?

Vlad from Moscow
hgiesel
  • 3
    *In all three cases, hello is copied into the function.* No, the pointer to that object is copied into the function. The pointer in this case is always a pointer to a character. – 2501 Feb 21 '16 at 19:01
  • 2
    @iharob - Not true: `sizeof("foobarbaz")` is 10, not 4 (or 8). – Oliver Charlesworth Feb 21 '16 at 19:09
  • C is pass by value and you cannot pass an array to/from a function, but only a pointer to the first element. So your prerequisites are wrong already. And from a wrong prerequisite anything can follow. – too honest for this site Feb 21 '16 at 19:12
  • "I know that in C, the string literal corresponds to char [], while ..." - wrong! The standard does not explicitly state that a _string literal_ is `const char []`, but it does state that modifying it is undefined behaviour. Thus technically it **is** `const char []`. The reason it is not formally so is very likely to allow something like `char *ch = "Hello"` to provide an initial value for a pointer. – too honest for this site Feb 21 '16 at 19:14
  • Okay, would it be more correct to say: The parameters from the `f()` (whatever that may be) are copy constructed using the argument `"hello"`, and you cannot construct arrays this way, as they will always decay into pointers. – hgiesel Feb 21 '16 at 19:22
  • @henrikgiesel: I think my last comment clearly shows why that is not the same. And that is not related to arrays, but to compound literals in general (just the array/pointer conversion is specific). That is quite inefficient, btw, due to all the copying involved. I somehow do not get what you are up to. Is that a solution searching for a problem, or do you have an actual problem you are trying to solve? – too honest for this site Feb 21 '16 at 19:27
  • 2
    @Olaf technically it is NOT `const char []`. Otherwise modifying it (without a cast) would have to give a compilation error. "non-modifiable" and `const` have different meanings – M.M Feb 21 '16 at 19:31
  • @M.M: I still think I used the correct terms. A typical implementation stores string literals in the same memory area (e.g. Flash for MCU systems) in which it stores `const` variables. If you try to modify it, there is UB, much like for `const` variables. So the difference is more a formal one, as a pointer to a string literal is `char *`, not `const char *`. And the standard does not mention "non-modifiable", but clearly states it is UB to modify them - see 6.4.5p7. Much the same as for `const` qualified objects (6.7.3p6). – too honest for this site Feb 21 '16 at 19:42
  • @Olaf I'm trying to understand the inner workings of C and the differences to C++; I have no actual problem I want to solve in this case. I just came across `f(char *a) { … }; f("abc");` being able to compile in C, but not in C++, which really confused me. I know it is a really vague question (and maybe not a good question for stackoverflow), but this kind of question usually offers some really interesting discussions in my opinion. – hgiesel Feb 21 '16 at 19:54
  • @henrikgiesel maybe you should have asked exactly that as your question – M.M Feb 21 '16 at 19:55
  • @henrikgiesel: Problem is in the last part of your comment: SO is not for discussions. – too honest for this site Feb 21 '16 at 20:01
  • 2
    @M.M: It will not generate the same code, because the resulting arrays will have to be copied from the original literals and placed on the stack instead of the `.rodata` (or similar) section. It is just a bad idea imho. – too honest for this site Feb 21 '16 at 20:04
  • @Olaf string literals are not required by the C standard to be stored in rodata, that's just a popular implementation. I have removed my comment though as it is extremely theoretical (no real implementation would ever do things that way, even though it is possible) – M.M Feb 21 '16 at 20:10
  • @M.M: The generated assembler code is also implementation specific. You are basically correct, the standard does not enforce anything of the like. But if we strictly stick to the standard, we cannot even mention assembler code, the stack, or anything else. I just followed the track opened by OP asking for assembler code for a typical implementation on a target like ARM or x86 to keep things simple (which is uncommon for me - I know;-). – too honest for this site Feb 21 '16 at 20:15

1 Answer


I cannot say what code will be generated, but according to the C Standard (6.5.2.5 Compound literals):

7 String literals, and compound literals with const-qualified types, need not designate distinct objects.

And there is an example:

13 EXAMPLE 6 Like string literals, const-qualified compound literals can be placed into read-only memory and can even be shared. For example,

(const char []){"abc"} == "abc"

might yield 1 if the literals’ storage is shared.
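The comparison from EXAMPLE 6 can be tried directly. The following is a minimal sketch; the function names are hypothetical and not part of the question. Whether the two pointers compare equal depends on the compiler and its options, so only the contents can be relied on to match.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: reports whether a const-qualified compound literal
   and a string literal with the same contents occupy the same storage.
   The standard allows either answer, so the result is
   implementation-dependent. */
bool literals_share_storage(void)
{
    const char *compound = (const char []){"abc"};
    const char *literal = "abc";
    return compound == literal;
}

/* Hypothetical helper: the contents are equal regardless of whether the
   storage is shared. */
bool literals_have_equal_contents(void)
{
    const char *compound = (const char []){"abc"};
    const char *literal = "abc";
    return strcmp(compound, literal) == 0;
}
```

Calling `literals_share_storage()` on a few compilers and optimization levels is one way to see whether a given implementation actually merges the two objects.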

As you correctly mentioned, string literals in C have non-const character array types. However, they may not be modified. A compound literal without the const qualifier, on the other hand, can be modified, which can influence the generated assembly code. And if the const qualifier is used with a compound literal, the resulting type differs from the type of the corresponding string literal, which again can influence the generated assembly code.
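Both points, modifiability and the type difference, can be checked in a small C11 sketch. The macro and function names below are hypothetical, and `_Generic` is used here only as a convenient way to inspect the element type; it assumes a C11 or later compiler.

```c
#include <ctype.h>

/* Hypothetical macro: yields 1 if the array's elements are const char,
   0 if they are plain char. Taking the address of element 0 avoids
   relying on how _Generic treats array operands. */
#define ELEMENT_IS_CONST(arr) _Generic(&(arr)[0], const char *: 1, char *: 0)

/* In C, a string literal has type char[N], so this returns 0. */
int string_literal_element_is_const(void)
{
    return ELEMENT_IS_CONST("hello");
}

/* A const-qualified compound literal has type const char[N],
   so this returns 1. */
int compound_literal_element_is_const(void)
{
    return ELEMENT_IS_CONST((const char []){"hello"});
}

/* A compound literal without const is a modifiable object. Writing to it
   is well defined, whereas writing to a string literal is undefined
   behaviour. */
char capitalized_first_char(void)
{
    char *s = (char []){"hello"};
    s[0] = (char)toupper((unsigned char)s[0]);
    return s[0];
}
```

Because the compiler must give the modifiable compound literal its own writable storage, it typically cannot reuse the read-only copy of the string, which is one reason the generated code may differ.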

Another difference is that string literals always have static storage duration, whereas a compound literal can have automatic storage duration. This can also influence the generated assembly code.
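The storage duration difference matters most when a pointer outlives the block it was created in. The sketch below uses hypothetical function and variable names to contrast the three cases: a string literal, a compound literal outside any function body, and a compound literal inside one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Safe: a string literal has static storage duration, so the pointer
   stays valid after the function returns. */
const char *greeting_from_literal(void)
{
    return "hello";
}

/* A compound literal outside any function body also has static storage
   duration, so this pointer stays valid for the whole program. */
static char *const file_scope_greeting = (char []){"hello"};

char *greeting_from_file_scope(void)
{
    return file_scope_greeting;
}

/* Inside a function body the compound literal has automatic storage
   duration: it lives only until the enclosing block ends. Returning its
   address would leave a dangling pointer, so the contents are copied
   into a caller-supplied buffer instead. */
void copy_block_scope_greeting(char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    const char *tmp = (char []){"hello"};
    size_t len = strlen(tmp);
    if (len >= out_size) {
        len = out_size - 1;   /* truncate to fit, leaving room for '\0' */
    }
    memcpy(out, tmp, len);
    out[len] = '\0';
}
```

With automatic storage duration, a typical implementation has to build the array on the stack each time the enclosing block is entered, which is the copying cost mentioned in the comments on the question.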

Vlad from Moscow
  • The statement about modification resulting in UB is much the same for string literals and `const` qualified objects. The main difference is that the string literal is converted to a (not `const` qualified) `char *`, while e.g. `const char i; &i;` yields a `const char *`. – too honest for this site Feb 21 '16 at 20:20