7

There appears to be a contradiction in the C standard concerning variably modified (VM) types used in the conditional operator. Assume:

int f(void) { return 42; }

Now, what is the type of the following expression?

1 ? 0 : (int(*)[f()]) 0

Its value must be equal to NULL, but I am not sure what its type is. The int(*)[f()] is a pointer to a VLA of size f(). However, to complete this VM type, the size expression f() must be evaluated. The problem is that it belongs to a branch of the conditional operator that is not evaluated. From 6.5.15p4:

The second operand is evaluated only if the first compares unequal to 0; the third operand is evaluated only if the first compares equal to 0

The rules of the conditional operator require that, when one operand is a null pointer constant, the result has the type of the other operand. From 6.5.15p6:

... if one operand is a null pointer constant, the result has the type of the other operand; ...

How can this contradiction be resolved? Possible solutions are:

  • int(*)[f()] - the f() is evaluated anyway
  • int(*)[] - the array type stays incomplete
  • undefined behavior
  • something else?

The rules for composite types suggest that this may be UB, but I am not sure whether those rules apply in this case. I am looking for an answer that cites the C17 specification, but wording from the upcoming C2X is fine as well.

tstanisl
  • I think this somehow falls under the general problem with VLA 6.7.6.2/6 "If the two array types are used in a context which requires them to be compatible, it is undefined behavior if the two size specifiers evaluate to unequal values." For example consider `_Generic (1 ? 0 : (int(*)[f()]) 0, int(*)[666]: puts("clearly the size 666"));`. Not only does the puts() get executed, f() was _not_ executed. Instead it treats everything as if they were pointers to arrays of incomplete size. – Lundin Nov 15 '22 at 10:52
  • @Lundin, I agree that it is something similar. However, here we have a combination of a null pointer constant with a pointer to an array, not a combination of two pointers to arrays. I just hope that some wording in *any* of the C standards gives a hint as to how this situation should be resolved. – tstanisl Nov 15 '22 at 11:01
  • @Lundin, on the other hand if VM types were used on both sides (i.e. `1 ? (int(*)[f()]) 0 : (int(*)[g()]) 0`) then *both* size expressions should be evaluated. Otherwise it would not be possible to apply the mentioned rule for VM types. Something that does not exist cannot be compatible with something that exists. – tstanisl Nov 15 '22 at 11:08
  • The type is `int(*)[f()]`. The expression (not its value) is a part of the type. – n. m. could be an AI Nov 15 '22 at 11:15
  • @tstanisl Hmm, that example caused gcc to melt. https://godbolt.org/z/78caETd8v "warning: right-hand operand of comma expression has no effect" Eeeh? Apparently if the 2nd operand of ?: is a pointer to VLA and only then, gcc diagnostics will go nuts. – Lundin Nov 15 '22 at 11:17
  • @Lundin, clang complains about "unused result" :) – tstanisl Nov 15 '22 at 11:23
  • @Lundin if two VM types are used in a context that requires them to be compatible, and the size expressions evaluate to different values, the behaviour is undefined. – n. m. could be an AI Nov 15 '22 at 11:23
  • @n.m. Yes I already quoted that but I don't quite see how it applies to for example `1 ? (int(*)[f()]) 0 : 0;`, which causes gcc diagnostics to go bananas. – Lundin Nov 15 '22 at 11:54
  • Hypothesis: There is no contradiction here; one statement says the third operand is not evaluated, and the other says the second operand has the type of the third operand. This is not a contradiction because it does not say the second operand has some impossible type. It merely leaves the second operand having an unknown type. That is just an omission, not a contradiction. All the information the C implementation has about the type after evaluation of the expression is the same as it has about the type before evaluation of the expression: It is a pointer to a variable length array. – Eric Postpischil Nov 15 '22 at 12:01
  • @Lundin it doesn't. It is simply a gcc bug. There is no comma expression in sight. – n. m. could be an AI Nov 15 '22 at 12:02
  • Heh, for `int foo(void); int bar(void) { return sizeof *(1 ? 0 : (char (*)[foo()]) 0); }`, [Clang 15 crashes and requests a bug report](https://godbolt.org/z/s98oGWMbK). It does not know the size either. – Eric Postpischil Nov 15 '22 at 12:04
  • Adding to my comment, I think this omission is unintentional. – Eric Postpischil Nov 15 '22 at 12:29

2 Answers

4

It looks like this is undefined behavior.

A similar question was put to the C standardization committee in the defect report "VLAs and conditional expressions". The question concerned the following code:

int r = (c1()
               ? (z1(), (int (*)[d1()])p1())
               : (z2(), (int (*)[])p2()))[a][b];

asking:

The type of the conditional expression involves the size expression d1() that's only evaluated in one part of the expression, and this information is needed to evaluate the array reference even when c1() returns false.

To my understanding, dereferencing (...)[a][b] requires the type of the conditional expression, which in turn requires evaluating d1() even when it does not come from the evaluated branch.

The committee's answer can be found on page 21 of the minutes of the London 2007 meeting:

There are no rules for specifying a composite type when it depends on an expression and the expression is not evaluated. The proposal makes an implicitly undefined behavior explicitly undefined.

General consensus is that implicitly undefined is satisfactory, and we should issue an RR stating this (therefore making it explicit but only in an external document).

Proposed Committee Response: The standard does not speak to this issue, and as such the behavior is undefined.

tstanisl
  • A simple question: what does RR stand for? – pmor Mar 03 '23 at 11:09
  • @pmor, I don't know for sure. I guess it is "record of response". One can find this term on the [WG14](https://www9.open-std.org/JTC1/SC22/WG14/) site. – tstanisl Mar 03 '23 at 22:39
2

I think this boils down to what 6.5.15p4 means by an operand not being evaluated. The paragraph has footnote 112:

112) A conditional expression does not yield an lvalue.

We don't really need the lvalue of the null pointer (int(*)[f()]) 0; its type is enough.

Given the above, if we don't count a partial evaluation that generates no lvalue as an evaluation, then there is no contradiction: f() can be evaluated and the type can be int(*)[42].

user694733