
I know how to initialize strings in C in two different ways.

char str[] = "abcd";

char str[] = {'a','b','c','d','\0' };

Clearly, writing "abcd" instead of {'a','b','c','d','\0' } saves time and is more comfortable.

Is there any particular reason why the language lets us initialize a string as in the second line? Is it useful in some context, or is it superfluous?

Infinity_hunter
  • No, it's pointless. Both lines you show are __strictly__ equivalent. – Jabberwocky Sep 07 '21 at 06:36
  • The second form works for any aggregate initialization. The first form is special for char arrays. – M.M Sep 07 '21 at 06:39
  • In this case they are equivalent, however you couldn't do `char* str = {'a','b','c','d','\0' };` but you could do `char* str = "abcd";` because the latter first creates an unnamed array, then assigns the `str` pointer to it, while the former is only a list of characters waiting to be initialized as an array. – George Sep 07 '21 at 06:41
  • Do you mean, "Is it pointless to use it?", or, "Was it pointless for the language to provide it?"? It was not pointless for the language to provide it, because it's obviously the ordinary and general way of initializing arrays of any type. Whether it's ever useful to initialize a true string is up to you. (Me, I rarely if ever use it.) – Steve Summit Sep 07 '21 at 11:55
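
The points raised in these comments can be checked with a short sketch. It is only an illustration, not part of the original thread; the function names (`forms_are_equivalent`, `int_array_length`, `pointer_to_literal`) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the two initializer forms produce identical arrays. */
int forms_are_equivalent(void)
{
    char a[] = "abcd";                       /* string-literal form */
    char b[] = { 'a', 'b', 'c', 'd', '\0' }; /* brace form, explicit terminator */

    /* Both arrays have five elements: four letters plus the terminator. */
    assert(sizeof a == 5 && sizeof b == 5);
    return memcmp(a, b, sizeof a) == 0;
}

/* The brace form is ordinary aggregate initialization,
   so it works for arrays of any element type. */
size_t int_array_length(void)
{
    int n[] = { 1, 2, 3 };
    return sizeof n / sizeof n[0];
}

/* A pointer can be initialized from a literal: it then points at an
   unnamed array with static storage, which must not be modified.
   A brace list of characters is not a valid initializer for a pointer. */
const char *pointer_to_literal(void)
{
    const char *p = "abcd";
    /* const char *q = { 'a', 'b', 'c', 'd', '\0' };   not valid C */
    return p;
}
```

Calling `forms_are_equivalent()` returns 1 and `int_array_length()` returns 3, which matches the claim that the two char-array forms differ only in notation.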

2 Answers


In general terms, no, there isn't a difference. There are a few advanced exceptions, though:

  • In some cases you want a character array which is not a valid C string, that is, one that is not null terminated. The latter form is the explicit way to write that. (C, unlike C++, also accepts char str[4] = "abcd";, which silently drops the terminator, but that is easy to misread.) Such arrays are only needed in a few very special cases though: very ancient Unix code and certain embedded systems implementations.

  • Another advantage of the latter form is better handling of escape sequences, which are a bit broken in string literals. Suppose I wish to print an extended symbol with code 0xAA and then the string "ABBA". puts("\xAAABBA"); is not valid, since every hex digit after the \x is taken as part of the escape, giving a single value that is out of range for a char. char str[] = {'\xAA','A','B','B','A','\0'}; puts(str); works fine though. (You can, however, also split the string literal into several pieces to achieve the same thing: "\xAA" "ABBA".)

  • Initialization with string literals also has the subtle but severe C language flaw discussed here:
    Inconsistent gcc diagnostic for string initialization
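
The first two points can be sketched as follows. This is a minimal illustration rather than code from the answer; the names `unterminated_size` and `hex_escape_forms_match` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Four characters and no terminator: a char array, but not a C string. */
size_t unterminated_size(void)
{
    char tag[] = { 'a', 'b', 'c', 'd' };

    /* memcmp takes an explicit length, so it is safe here;
       strlen(tag) or puts(tag) would read past the end of the array. */
    assert(memcmp(tag, "abcd", sizeof tag) == 0);
    return sizeof tag;
}

/* Returns 1 if the split literal and the brace form give the same bytes. */
int hex_escape_forms_match(void)
{
    /* Writing the escape and the letters in one literal would make the
       letters A, B, B, A part of the hex escape, so the literal is split
       right after the escape; adjacent literals are concatenated. */
    char s1[] = "\xAA" "ABBA";
    char s2[] = { '\xAA', 'A', 'B', 'B', 'A', '\0' };

    /* One extended byte, four letters and the terminator in both cases. */
    assert(sizeof s1 == 6 && sizeof s2 == 6);
    return memcmp(s1, s2, sizeof s1) == 0;
}
```

Here `unterminated_size()` returns 4, one less than the five elements that "abcd" would give an unsized array, and `hex_escape_forms_match()` returns 1.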

Lundin

The char str[] = {'a','b','c','d','\0' } form is rarely useful for initializing real strings, it's true.

I once wrote some code to filter out accidentally obscene words from generated text, and in a fit of self-censorship, I initialized the array of words to look for in a way that wouldn't offend the eyes of any future maintenance programmer:

char badwords[][5] = {
    { 's', 0x68, 105, 0164, '\0' },
    { 'p', 0151, 0x73, 115, '\0' },
    { 'c', 117, 0156, 0x74, '\0' },
    { 'f', 0x75, 0143, 107, '\0' },
    /* ... */
};
Steve Summit
  • Extra points for obfuscating it even more by mixing octal, decimal and hex. :-) – Ted Lyngmo Sep 07 '21 at 14:25
  • @SteveSummit: sorry for being so explicit, how about portability to non-ASCII character sets? I cannot think of anything simple achieving the same level of obfuscation. – chqrlie Sep 17 '21 at 18:53
  • @chqrlie No worries, I was only joking, of course. Perhaps I should apologize for not taking your question seriously! I can't think of anything simple, either. The best I can come up with is a scheme in which any upper-case letter is to be decrypted à la [Caesar](https://en.wikipedia.org/wiki/Caesar_cipher), taking care to steer clear of the discontinuities in EBCDIC. That is, something like `char badwords[][5] = { "shiW", "piVs", "cXnW", "fXFk" }`, to be decoded at runtime with something based on `if (isupper(*p)) *p = tolower(*p - 3);`. (Or maybe just use rot13.) – Steve Summit Sep 17 '21 at 19:15
  • @TedLyngmo I can't claim credit: I stole the idea from Sjörd Mullender's winning entry in the [First Annual IOCCC](https://www.ioccc.org/years.html#1984). – Steve Summit Sep 17 '21 at 19:34
  • @SteveSummit: that would work, but then still not portable to all potential character sets... If decoding at runtime is OK, then a simple permutation might do. In either case, no longer an illustration of the OP's question :) – chqrlie Sep 17 '21 at 19:39