
I have a hard time understanding sizeof's behaviour when given a ternary expression.

#include <stdio.h>

#define STRING "a string"

int main(int argc, char** argv)
{
  int a = sizeof(argc > 1 ? STRING : "");

  int b = sizeof(STRING);
  int c = sizeof("");

  printf("%d\n" "%d\n" "%d\n", a, b, c);

  return 0;
}

In this example (tested with gcc 4.4.3 and 4.7.2, compiled with -std=c99), b is 9 (8 characters + implicit '\0'), c is 1 (implicit '\0'). a, for some reason, is 4.

I would expect a to be either 9 or 1, based on whether argc is greater than 1. I thought maybe the string literals get converted to pointers before being passed to sizeof, causing sizeof(char*) to be 4.

I tried replacing STRING and "" by char arrays...

char x[] = "";
char y[] = "a string";
int a = sizeof(argc > 1 ? x : y);

... but I got the same results (a=4, b=9, c=1).

Then I tried to dive into the C99 spec, but I did not find any obvious explanation in it. Out of curiosity I also tried changing x and y to other types:

  • char and long long int: a becomes 8
  • both short or both char: a becomes 4

So there's definitely some sort of conversion going on, but I struggle to find any official explanation. I can sort of imagine that this would happen with arithmetic types (I'm vaguely aware there's plenty of promotions going on when those are involved), but I don't see why a string literal returned by a ternary expression would be converted to something of size 4.

NB: on this machine sizeof(int) == sizeof(foo*) == 4.

Follow-up

Thanks for the pointers, guys. Understanding how sizeof and ?: work led me to try a few more type mashups and see how the compiler reacted. I'm editing them in for completeness' sake:

foo* x = NULL; /* or foo x[] = {} */
int  y = 0;    /* or any integer type */

int a = sizeof(argc > 1 ? x : y);

Yields warning: pointer/integer type mismatch in conditional expression [enabled by default], and a == sizeof(foo*).

With foo x[], bar y[]; with foo* x, bar* y; or with foo* x, bar y[], the warning becomes pointer type mismatch. No warning when using a void*.

float x = 0; /* or any floating-point type */
int   y = 0; /* or any integer type */

int a = sizeof(argc > 1 ? x : y);

Yields no warning, and a == sizeof(x) (that is, the floating-point type).

float x = 0;    /* or any floating-point type */
foo*  y = NULL; /* or foo y[] = {} */

int a = sizeof(argc > 1 ? x : y);

Yields error: type mismatch in conditional expression.

If I ever read the spec completely I'll make sure to edit this question to point to the relevant parts.

Peniblec

3 Answers


You have to understand expressions, which are the core component of the language.

Every expression has a type. For an expression e, sizeof e is the size of the type of e.

The expression a ? b : c has a type. The type is the common type of the two operand expressions b and c.

In your example, both array-valued expressions decay to char * (a pointer to the first element) before the common type is determined, so the common type is char *. (In C++, the rules for string literals are different and there is a const everywhere.)

Kerrek SB
  • Could you point to the parts of the spec that say 1) "the type of expression `a?b:c` is the common type of `b` and `c`" 2) "the common type of two arrays is a pointer"? This does seem logical, but I'd really like to actually see that written in some official form. – Peniblec Jul 24 '14 at 11:49
  • @Peniblec This is already covered on SO @ http://stackoverflow.com/questions/8535226/return-type-of-ternary-conditional-operator – Captain Giraffe Jul 24 '14 at 11:51
  • Okay, 6.5.15 explains the "conversion" part (§6 indeed says "if both 2nd and 3rd operands are pointers then expression has type pointer"), but does not mention arrays. I *think* 6.3.2.1, §3 is what I'm after? ("expression of type "array of T" is converted to "pointer to T" except when given to sizeof, &, or when it's a string literal used to initialize an array") – Peniblec Jul 24 '14 at 12:14
  • @Peniblec: Yes, they both decay to `char *` (not to `const char *`, btw, although they should IMO). You cannot build the composite type of two non-variable arrays both with a different length given (but that's irrelevant here, an array as operand of `?:` decays to a pointer). – mafso Jul 24 '14 at 12:21
  • @mafso: Ah, of course, in C it's `char *`. In C++ it's `const char *` as far as I know. – Kerrek SB Jul 24 '14 at 13:55
  • Ah, yes, didn't think of C++, where it's indeed `const char *`… But I still think, your answer is a little misleading. I don't know, if “common type” is a C++ term, but for C, there is _compatible type_ and if two types are compatible, they have a _composite type_. `char [9]` and `char [1]` are incompatible, so there is no composite type. The expression is valid only because they both decay to `char *` before the types are compared (and `char *` is compatible to `char *`). Your answer could lead to the impression, that `sizeof(*(argc>1 ? &STRING, &""))` would be valid, which it isn't. – mafso Jul 24 '14 at 14:17
  • You make it sound as if the pointer decay happens after the determination of a common type, but it happens before. You could have said “the common type of char[9] and char[9] is char*”, and it would have been equally misleading. – Pascal Cuoq Jul 24 '14 at 14:58
  • @PascalCuoq: That's because I was originally thinking of C++, where `foo() ? "Hello" : "World"` does indeed have type `char const [6]`. – Kerrek SB Jul 24 '14 at 15:38

You need to understand that sizeof is a compile-time operator. With a VLA operand (C99) it can yield a run-time value; otherwise it is a compile-time constant.

What matters is the type of its argument.

So in sizeof(argc > 1 ? STRING : "") the condition is not evaluated. The type of the argument decays to char * (in C, string literals are arrays of char, not const char). And on your machine, a pointer is 4 bytes.

You should instead write (argc > 1) ? sizeof(STRING) : 1

Since STRING is macro-expanded to the "a string" literal, sizeof(STRING) is 9, much as if you had declared

const char STRING[] = {'a',' ','s','t','r','i','n','g','\0'};

Basile Starynkevitch

Both STRING and "" are array objects of types char[9] and char[1] respectively. In C language, when array objects are used in expressions, they get implicitly converted (decay) to pointer types in almost all contexts, with few well-known specific exceptions.

One such exception is the sizeof operator. When you use an array object as the immediate operand of sizeof, the array does not decay to a pointer type, and you get the size of the entire array in bytes. This is why sizeof(STRING) is equivalent to sizeof(char[9]) and evaluates to 9, while sizeof("") is equivalent to sizeof(char[1]) and evaluates to 1.

But when you use array objects as operands of the ?: operator, the context is no longer exceptional: there the arrays immediately decay to pointers. This means that your sizeof(argc > 1 ? STRING : "") is equivalent to sizeof(argc > 1 ? (char *) STRING : (char *) ""), and in turn equivalent to sizeof(char *). This evaluates to the pointer size on your platform, which just happens to be 4.

AnT stands with Russia
  • This nicely wraps everything I learned from reading the spec after Kerrek SB's answer! May I suggest including references when making such assertions? I think it would add educational value (as in, "This document is not a huge behemoth full of gibberish only language designers can hope to understand; here are the sections relevant to your problem. As you can see, it's plain English and describes precisely the mechanisms responsible for what you observed"). Give man a fish, and all that :) NB: I already dug up some paragraphs I thought were relevant, cf. comments below Kerrek SB's answer. – Peniblec Jul 25 '14 at 06:32