Is '\0' set automatically if I provide an extra element for it, but leave it out of the initialization string?

Like:

char a[6] = {"Hello"};   // <- Is NUL set here automatically?

I did an experiment with C and C++:

C:

#include <stdio.h>

int main()
{
    char NEWYEAR[16] = {"Happy New Year!"};
    printf("%s\n",NEWYEAR);

    return 0;
}

Output:

Happy New Year! 

C++:

#include <iostream>

int main()
{
    char NEWYEAR[16] = {"Happy New Year!"};
    std::cout << NEWYEAR << std::endl;

    return 0;
}

Output:

Happy New Year! 

The compilers did not throw an error or warning, and the result is as desired. So it seems to work correctly. But is that really true?

  • Is everything correct by doing so?
  • Is this maybe bad programming style?
  • Does this cause any issues?
  • 3
    This is an area where C and C++ differ, so please pick *one* language. – Some programmer dude Jan 01 '20 at 11:04
  • @Someprogrammerdude I have tested it with both. The result seems to be equivalent. – RobertS supports Monica Cellio Jan 01 '20 at 11:05
  • 3
    "*Is this maybe bad programming style*" If you do not rely on the array having a specific size, it's much safer to just do `char a[] = "Hello";`. The compiler will select the size to fit, including the `0`-terminator. The curly braces are not needed, BTW. – alk Jan 01 '20 at 11:05
  • @RobertS-ReinstateMonica It "works" because you test only with code that happens to be valid in both languages. C and C++, while sharing common roots and some syntax, really are two different languages with different rules and semantics. – Some programmer dude Jan 01 '20 at 11:06
  • But it is equally useful to understand that the curly braces are *allowed*, because without braces you cannot do a compound literal in C... – Antti Haapala -- Слава Україні Jan 01 '20 at 11:07
  • 2
    *"`NEWYEAR[16]`"* More suitable with the current date to have `NEWYEAR[20]` ;) – Jarod42 Jan 01 '20 at 11:07
  • @Someprogrammerdude Ok, I understand that. But where especially is the difference? Is this valid in C, but not in C++? The output results *seem* to work. – RobertS supports Monica Cellio Jan 01 '20 at 11:09
  • There are *vast* numbers of things that *may* work in certain situations in both C and C++. That doesn't make them a good idea since they may stop working in a different implementation or even next Tuesday :-) – paxdiablo Jan 01 '20 at 11:12
  • @paxdiablo So my conclusion is: as opposed to C, in C++ it isn't a good idea to reserve a dedicated element for `\0` without initializing it, because the result is implementation-defined? – RobertS supports Monica Cellio Jan 01 '20 at 11:24
  • RobertS: That is not the difference between C and C++ here. The difference is that C allows you to write `char foo[3] = "foo";` and C++ does not. The C standard says "Successive bytes of the string literal including the terminating null character if there is room ..." (6.7.9/14), while C++ insists that "There shall not be more initializers than there are array elements." (11.6.2/2 with a clear example). Both of them consider the trailing `\0` to be part of the string literal, so it is guaranteed to be used as an initialization value (in the case of C, if there is room). – rici Jan 01 '20 at 21:07

2 Answers

7

It is more complex than that

char a[6] = "Hello";

will initialize the array of characters to Hello\0, because Hello has an implicit terminating zero.

char a[6] = "Hello\0";

would be valid in C, but invalid in C++ because the literal is 7 characters long, having both an implicit terminator and explicit embedded null character. C allows the literal to drop the implicit terminator. C11 6.7.9p14:

  An array of character type may be initialized by a character string literal or UTF-8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character *if there is room* or if the array is of unknown size) initialize the elements of the array.

(emphasis mine). It means that the implicit terminating null is added only if there is room for it in the array; it is not required to fit. So

char a[5] = "Hello";

would be valid C, resulting in a char array that does not contain a zero-terminated string. It is invalid in C++.
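To make the dropped terminator visible, here is a minimal C sketch. The helper and function names are made up for illustration, and it must be compiled as C, since the 5-element initializer is rejected in C++. It scans each array for a null byte within its own bounds only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the n-byte buffer contains a '\0'. */
static int has_terminator(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

/* char[6] has room for the implicit terminator, so it is stored. */
int exact_fit_is_terminated(void)
{
    char a[6] = "Hello";
    return has_terminator(a, sizeof a);
}

/* char[5] is valid C only; the implicit terminator is dropped. */
int too_small_is_terminated(void)
{
    char b[5] = "Hello";
    return has_terminator(b, sizeof b);
}
```

Called from a `main`, the first function returns 1 and the second returns 0. Some compilers may warn about the second initializer.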

And

char a[4] = "Hello";

in C would, with most compilers, leave the array holding Hell. This is a constraint violation in C (C11 6.7.9p2):

  No initializer shall attempt to provide a value for an object not contained within the entity being initialized.

but many compilers only issue a warning when an initializer supplies more elements than the array has, and programmers often ignore that warning. Paragraph 14 makes no exception for anything other than the implicit terminator.

And lastly

char a[7] = "Hello";

in both C and C++ would result in a character array of 7 elements containing the characters Hello\0\0, because in an array that has an initializer, the elements not explicitly initialized are zero-initialized, as if initialized by the literal 0. In this case the first 6 elements are initialized explicitly and the 7th implicitly.
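The zero fill can be checked byte by byte. This is a small sketch with an illustrative function name, comparing the array against the seven bytes it is expected to hold:

```c
#include <string.h>

/* Returns 1 if char a[7] = "Hello" holds exactly H e l l o \0 \0. */
int padded_array_is_zero_filled(void)
{
    char a[7] = "Hello";
    static const char expected[7] = { 'H', 'e', 'l', 'l', 'o', '\0', '\0' };

    /* memcmp compares all 7 bytes, including both trailing zeros. */
    return sizeof a == 7 && memcmp(a, expected, sizeof a) == 0;
}
```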


Given the possibility of silently truncating the terminator in C, it is better to just omit the array size and write

char a[] = "Hello";

This will declare a as an array of 6 elements, just like char a[6] = "Hello";, but you cannot mistype the array size.
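As a quick check of the deduced size, the following sketch (function names are illustrative) reports both the array size and the string length:

```c
#include <stddef.h>
#include <string.h>

/* Size the compiler deduces for the array: 5 characters plus the terminator. */
size_t deduced_size(void)
{
    char a[] = "Hello";
    return sizeof a;
}

/* Length of the string stored in it, not counting the terminator. */
size_t deduced_length(void)
{
    char a[] = "Hello";
    return strlen(a);
}
```

Here `deduced_size()` yields 6 and `deduced_length()` yields 5, so the array always has room for the terminator.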

  • That is exactly what I have been searching for: an explanation of the possibilities and the differences between C and C++ in this case. I have 2 questions left: 1. You said `char a[6] = "Hello\0";` and `char a[5] = "Hello";` are invalid in C++. How should I declare a char array in C++ then? Should I only do it like `char a[] = "Hello";`, or is it also allowed to initialize the characters separately, including the NUL, like: char a[6] = `H`,`e`,`l`,`l`,`o`,`\0`; 2. Should I *always* do it like `char a[] = "Hello";`? Do you only code this way? – RobertS supports Monica Cellio Jan 01 '20 at 11:47
  • @RobertS-ReinstateMonica I choose the option 3: "I don't do no C++". – Antti Haapala -- Слава Україні Jan 01 '20 at 11:47
  • @AntiiHaapala :-) Ok, I understand. But back to what is valid and what not: Can I do char a[6] = `H`,`e`,`l`,`l`,`o`,`\0`; in C++ or is there anything wrong about it? – RobertS supports Monica Cellio Jan 01 '20 at 11:51
  • @Roberts: `char a[6] = {'H', 'e', 'l', 'l', 'o', 0};` is valid in both C and C++. The braces and apostrophes are obligatory. You can (in both languages) use `'\0'` instead of 0 if you wish. – rici Jan 01 '20 at 21:11
3

If there's space for the null-terminator then it will be added.

In C (but not C++), if the size of the array equals the length of the string excluding the null-terminator, then the null-terminator will not be added. So e.g.

char a[5] = "Hello";

is valid, but there won't be a null-terminator in the array.
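Such an array must not be passed to functions that expect a terminated string, such as `printf` with a plain `%s`. One way to format it safely in C is to bound the read with a precision. This is a sketch with a hypothetical function name, not the only option:

```c
#include <stddef.h>
#include <stdio.h>

/* Copies the 5 unterminated characters into out as a proper C string.
   Returns the number of characters written, excluding the terminator. */
int format_unterminated(char *out, size_t out_size)
{
    char a[5] = "Hello";   /* valid C only; no '\0' stored */

    /* "%.*s" reads at most sizeof a bytes, so no terminator is needed. */
    return snprintf(out, out_size, "%.*s", (int)sizeof a, a);
}
```

With a 16-byte output buffer, this writes the terminated string Hello and returns 5.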

It's not valid to provide a smaller size than the string length.

Some programmer dude
  • Hmm, I wasn't *sure* about that last sentence, I thought it might just truncate. But you're right, C11 has the constraint: "No initializer shall attempt to provide a value for an object not contained within the entity". – paxdiablo Jan 01 '20 at 11:09