
I'm writing a C compiler, and when I came to implementing the switch statement, one constraint confused me a lot. Section 6.8.4.2p2 of the standard states:

If a switch statement has an associated case or default label within the scope of an identifier with a variably modified type, the entire switch statement shall be within the scope of that identifier.

With a footnote:

That is, the declaration either precedes the switch statement, or it follows the last case or default label associated with the switch that is in the block containing the declaration.

I can't really understand what this constraint means. Can someone give me an example?

dbush
  • Be aware of Duff's device when considering what that is saying. A switch statement can contain more than cases. https://turbofuture.com/computers/Computer-Programming-Advanced-C – Gem Taylor Sep 06 '19 at 12:51
  • 1
    Possible duplicate of [What does the ISO/IEC 9899 6.8.4.2 ->2 phrase mean?](https://stackoverflow.com/questions/18459972/what-does-the-iso-iec-9899-6-8-4-2-2-phrase-mean) – Sander De Dycker Sep 06 '19 at 12:53
  • It's the same thing as with `goto` not being allowed to wildly jump across VLA declarations. `switch` is kind of a glorified `goto`. – Lundin Sep 06 '19 at 13:33
  • 3
    @SanderDeDycker It's a dupe but the answers here look promising. I'd hold off the close votes for now - maybe we should close the old post as a dupe of this one instead. – Lundin Sep 06 '19 at 13:35
  • @Lundin : sounds fair – Sander De Dycker Sep 06 '19 at 14:00
  • @Lundin: So posting dups rather than looking for existing answers is being endorsed? As long you generate more traffic? – dhein Sep 09 '19 at 06:32
  • @dhein It happens fairly often that someone posts a dupe without realizing and gets good answers before anyone realizes it is a dupe. At that point we have to weigh older posts against the new one and keep the one with highest quality. The oldest post isn't by definition the best post. – Lundin Sep 09 '19 at 07:29

3 Answers

3

I think this quote from the C Standard about the goto statement helps in understanding the quote about the switch statement.

6.8.6.1 The goto statement

1 The identifier in a goto statement shall name a label located somewhere in the enclosing function. A goto statement shall not jump from outside the scope of an identifier having a variably modified type to inside the scope of that identifier.

In effect, the switch statement behaves like a goto statement that passes control to the selected label. So passing control to a case label shall not skip a declaration of an object with a variably modified type. That is, such a declaration should be placed either before the switch statement or inside the switch statement after all its labels.

And there is an example

goto lab3; // invalid: going INTO scope of VLA.
{
double a[n];
a[j] = 4.4;
lab3:
a[j] = 3.3;
goto lab4; // valid: going WITHIN scope of VLA.
a[j] = 5.5;
lab4:
a[j] = 6.6;
}
goto lab4; // invalid: going INTO scope of VLA.

That is, the first goto lab3; and the final goto lab4; bypass the declaration double a[n];, while the goto lab4; inside the block stays within the scope of a.
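As a minimal compilable sketch of the valid case only (the function name `fill_and_pick` and its parameters are hypothetical, not from the standard), the jump below stays entirely inside the scope of the VLA, so the goto constraint is not violated:

```c
#include <assert.h>

/* Hypothetical helper: uses goto only inside the scope of the VLA `a`. */
double fill_and_pick(int n, int j)
{
    /* A VLA needs a positive size, and j must be a valid index. */
    assert(n > 0 && j >= 0 && j < n);

    {
        double a[n];        /* the scope of a starts here */
        a[j] = 4.4;
        goto inside;        /* valid: source and target are both in a's scope */
        a[j] = 5.5;         /* skipped by the goto */
    inside:
        a[j] += 1.0;
        return a[j];
    }
    /* A goto placed before the block and targeting `inside` would jump
       into the scope of a and violate 6.8.6.1p1. */
}
```

Calling `fill_and_pick(3, 1)` should return approximately 5.4, since the assignment of 5.5 is skipped.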

Here is an example of a valid switch statement according to the footnote.

#include <stdio.h>

int main(void) 
{
    int n = 2;

    switch ( n )
    {
    case 0:
        break;

    case 1:
        break;

    default: ;
        int a[n];
        for ( int i = 0; i < n; i++ ) a[i] = i;
        int sum = 0;
        for ( int i = 0; i < n; i++ ) sum += a[i];
        printf( "sum = %d\n", sum );
    }

    return 0;
}

The program output is

sum = 1
Vlad from Moscow
2

What this is saying is that if one case is able to see a variably modified array, then the entire switch statement MUST be able to see it as well.

This means that the following code is legal:

void switch_test(int size)
{
    int array[size];
    ...
    // code to populate array
    ...
    switch (expr) {
    case 1:
        printf("%d\n", array[0]);
        break;
    case 2:
        // do something else
        break;
    }
}

But this code is not:

void switch_test(int size)
{
    switch (expr) {
    case 2:
        // do something else
        int array[size];   // ILLEGAL, VLA in scope of one label but not all
    case 1:
        ...
        // code to populate array
        ...
        printf("%d\n", array[0]);
    }
}

The latter is illegal because if the code were to jump to case 1, then array might not have been created properly, since the size of a VLA is determined at run time when its declaration is reached. Declaring the VLA before the switch statement avoids this issue.
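If the array is only needed by one case, another way to satisfy the constraint is to give the VLA its own block so that no case or default label lies within its scope. The following is only a sketch under that assumption; the function name `switch_block_vla` and its parameters are made up for illustration:

```c
#include <assert.h>

/* Hypothetical function: sums 0..size-1 for selector 1, returns size for
   selector 2, and -1 otherwise. */
int switch_block_vla(int selector, int size)
{
    assert(size > 0);       /* a VLA must have a positive size */
    int result = -1;

    switch (selector) {
    case 1: {
        /* The scope of array is this block only; no case or default
           label appears inside it, so 6.8.4.2p2 is satisfied. */
        int array[size];
        for (int i = 0; i < size; i++)
            array[i] = i;
        result = 0;
        for (int i = 0; i < size; i++)
            result += array[i];
        break;
    }
    case 2:
        result = size;
        break;
    default:
        break;
    }
    return result;
}
```

Here every jump from the switch lands outside the scope of `array`, so its declaration is always executed before the array is used.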

dbush
  • I have a question: does this constraint suggest that a VLA should be created on the heap instead of the stack? Because if the VLA stays on the stack, it will be destroyed when it goes out of scope –  Sep 06 '19 at 13:24
  • @reavenisadesk It's up to the implementation, but I would think that space would be allocated on the stack for a VLA, otherwise the implementation would need to perform heap management functions (i.e. `malloc`, `free`) behind the scenes which probably isn't a good idea. – dbush Sep 06 '19 at 13:28
  • Because I’m using LLVM and all values are abstract registers, I do not control the stack pointer. Does that mean this constraint does not matter in my situation? –  Sep 06 '19 at 13:31
  • In a conventional C implementation, VLA can never be created "on the heap". This would break properties with respect to signal handlers and would leak memory on `longjmp` out of the block the VLA's lifetime is tied to. You would need a fairly unconventional implementation model to overcome these issues. – R.. GitHub STOP HELPING ICE Sep 06 '19 at 14:07
1

If a switch statement has an associated case or default label within the scope of an identifier with a variably modified type, the entire switch statement shall be within the scope of that identifier.

void func(int a) {
    switch(a) {
        { // this is the scope of variable b
             // b is an identifier with variably modified type
             int b[a];
             // the following is the "associated case within a scope of an identifier"
             case 1: // this is invalid
                 break;
        }
        // the scope of switch statement has to be within the scope of that identifier
    }
}

This is because the compiler may need to emit cleanup instructions after allocating memory for a VLA. A switch can jump to any of its labels, so the instructions that allocate or clean up the memory for a variable length array could be skipped.
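A small sketch of why the declaration itself has to be executed: the size of a VLA is fixed at the moment control reaches its declaration, and `sizeof` on it is evaluated at run time. The helper name `vla_bytes` is hypothetical and only serves the illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reports the size in bytes of a VLA of n ints. */
size_t vla_bytes(int n)
{
    assert(n > 0);      /* a VLA must have a positive size */
    int b[n];           /* n is evaluated here, when the declaration is reached */
    n = 0;              /* later changes to n do not affect the size of b */
    return sizeof b;    /* computed at run time from the size recorded above */
}
```

If a jump skipped the line that declares `b`, there would be no recorded size for `sizeof b` or for any cleanup code to rely on.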

KamilCuk
  • 1
    More specifically: VLAs are implemented by adjusting the stack pointer by their size. Allocation decrements the stack pointer, deallocation reincrements it by the same amount. Now, what would happen if either the decrement or the increment is skipped? Pink elephants appear! (AKA undefined behavior via stack corruption, might also encrypt your hard drive...) – cmaster - reinstate monica Sep 06 '19 at 13:12
  • @cmaster what makes a VLA so special? Considering I have a struct allocated on stack, the stack pointer omitted situation looks the same for me –  Sep 06 '19 at 13:49
  • @cmaster: That's a rather naive model. It's statically known at each statement in the function whether any given VLA is live; there's no reason except for exceedingly bad implementation quality to end up with mismatched stack adjustments. The problem is just that you cannot consistently define the size of the current instant of the VLA when its declaration was skipped. – R.. GitHub STOP HELPING ICE Sep 06 '19 at 14:10
  • @reavenisadesk That their size is not known at function entry. All fixed-size automatic variables are collected by the compiler at compile time, the total amount of automatic memory needed by the function is determined, and then the stack pointer is manipulated exactly once at function entry to create a *stack frame* and once at function exit to deallocate that stack frame. VLAs cannot be part of the stack frame due to their unknown size, and thus require stack pointer manipulation at their point of definition and the respective scope end. – cmaster - reinstate monica Sep 06 '19 at 14:12
  • @cmaster though it sounds weird, what if I allocate structs in a for-loop? I think the compiler can’t determine the space which is needed at once. –  Sep 06 '19 at 14:18
  • @R.. Ok, you might phrase it differently. Yes, you can say that the key part is, that you cannot skip over the creation of the VLA if you need to use it, simply because its size would be undefined. And removal of the VLA from the stack does classify as a use. I did my best to make it obvious that skipped-over VLA definitions are ... *dangerous*. – cmaster - reinstate monica Sep 06 '19 at 14:23
  • @reavenisadesk The compiler has no problems with a `struct` inside a loop: If I say `for(...) { struct Foo foo = ...; ... }`, the compiler knows that there is only ever at most one variable `foo` alive, and will reserve `sizeof(struct Foo)` bytes for it at function entry. Every instance of `foo` will reuse these same bytes. – cmaster - reinstate monica Sep 06 '19 at 14:28
  • @reavenisadesk: You can't "allocate structs in a for loop" with automatic storage. For any object declared with automatic storage, there is exactly one instance of it live at any time per executing instance (possibly nested/recursive or different-thread) of the block scope it lives in. Running a loop multiple times does not create multiple objects for things declared inside it because the lifetime of the old one ends before the lifetime of the new one starts. Incidentally this is why VLA is a **strictly weaker** feature than the nonstandard `alloca`. – R.. GitHub STOP HELPING ICE Sep 06 '19 at 14:28
  • @cmaster So except for VLAs, the compiler will always determine a fixed amount of space before entering a function, right? –  Sep 06 '19 at 14:32
  • @R.. yeah, I thought a second time and feels this example is not very appropriate. –  Sep 06 '19 at 14:34
  • 1
    @reavenisadesk "so except the VLA, compiler will always determine a fixed size of space before enter a function, right?" A common implementation is for the function to determine its required stack space (apart from VLAs) on entry to the function, but it is an implementation detail. – Ian Abbott Sep 06 '19 at 14:54
  • 1
    @reavenisadesk As Ian Abbott said: Typical compilers do this, but the language standard does not require it. Of course, you can write a compiler that would emit code to decrement the stack pointer whenever a variable is defined, and then reincrement it appropriately at the respective scope end. However, the assembly produced by such a compiler would be *slow*. It's much faster to fuse all the automatic memory allocations at compile time and create a single, large stack frame once at function entry, so virtually all compilers do this. – cmaster - reinstate monica Sep 06 '19 at 15:01
  • @cmaster That’s what confused me. The language does not care about the implementation of VLAs, but all the special rules about VLAs make me feel that the standard expects the compiler to reserve all space at once, so that a VLA cannot be part of that space. If the standard does not care about the implementation, and I use an ‘alloc and dealloc when needed’ strategy, then as far as I can see the rules for VLAs should be no different from those for a struct. Is my understanding right? –  Sep 06 '19 at 15:16
  • @reavenisadesk While the standard is not worded like it cared about implementations, it does care a lot about being implementable. The special rules about VLAs are a mark of this. Each one is phrased in an implementation agnostic way, but I guess there is an excellent reason for each of these special rules that has to do with making the VLA feature actually implementable. In the case of the `switch`-VLA requirements the special rule means that no well-formed program can skip over a VLA definition, and thus no special handling is required on the side of the compiler to avoid stack corruption. – cmaster - reinstate monica Sep 06 '19 at 15:53