
See this test program:

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
  if (argc < 2)
    goto end;

  char s[strlen(argv[1]) + 1];
  strcpy(s, argv[1]);
  printf("s=%s\n", s);

end:
  return 0;
}

It fails to compile with the error "jump into scope of identifier with variably modified type" (see other question).

However it compiles fine if I change the declaration of s into this (and include alloca.h):

char *s = alloca(strlen(argv[1]) + 1);

Why does the C standard allow jumping into the scope of an object created with alloca, but not a variable length array? I thought they were equivalent.

Tor Klingberg
  • The hand-waving answer is because branching over the scope of the VLA will mess up the stack, but `sizeof(char*)` is fixed. I've added the lawyer tag to increase the chance of your getting a decent answer. – Bathsheba Jun 02 '17 at 15:33
  • Thanks, language-lawyer definitely applies. If I have to guess, it would be that using `s` after jumping past the `alloca` is already UB as `s` will be an uninitialized pointer, but using `s` after jumping past the VLA declaration would be ok if the jump itself wasn't forbidden. – Tor Klingberg Jun 02 '17 at 15:37
  • `alloca` isn't in the standard, so the compiler may treat it like any other function (though they [often don't](http://man7.org/linux/man-pages/man3/alloca.3.html#NOTES)), in which case it's all fine according to the standard. Of course, it may still lead to similar undefined behaviour. – Kninnug Jun 02 '17 at 15:38
  • @Kninnug: that's an important point. OP, could you recast the question without that function? – Bathsheba Jun 02 '17 at 15:40
  • I had forgotten that `alloca` is not a standard function. This question is really about the contrast between VLAs and `alloca`. The VLA-only question already exists [here](https://stackoverflow.com/questions/20654191/c-stack-memory-goto-and-jump-into-scope-of-identifier-with-variably-modified). – Tor Klingberg Jun 02 '17 at 15:43
  • Love the spelling "rally". That's how my exquisitely posh dear lady wife says the word. – Bathsheba Jun 02 '17 at 15:43

2 Answers

This is because the compiler must emit run-time code that sets up the stack frame of any scope containing a VLA. With the goto you tell it to jump to the label end, but in doing so you ask it to skip the setup code for that scope.

The code that allocates the space for the VLA runs at the point of the declaration, right after the expression giving the length is evaluated. If a goto skips that code, the array's storage and size are never set up, and any later use of the array is likely to crash the program.

Imagine something like:

if (cond) goto end;
...
char a[expr];
end:
a[i] = 20;

In this case the code would most likely segfault: you jump to a statement that writes to the VLA a, but a was never allocated. The code that allocates a VLA has to be executed at the point of its definition.

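Now about alloca. The same problem exists, but the compiler is not required to detect it: s is an ordinary pointer, and jumping over its initializer simply leaves it uninitialized.

So the following will most likely segfault (it is undefined behavior), and the compiler is not obliged to issue any warning or error.

The logic is the same as for the VLA.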

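#include <alloca.h>

int main(int argc, char *argv[])
{
    goto end;                 /* skips the initialization of s */
    char *s = alloca(100);
end:
    s[1] = 2;                 /* undefined behavior: s is uninitialized */
    return 0;
}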

This is why ISO 9899 contains the following constraint:

6.8.6.1 The goto statement -- Constraints

1 The identifier in a goto statement shall name a label located somewhere in the enclosing function. A goto statement shall not jump from outside the scope of an identifier having a variably modified type to inside the scope of that identifier.

The compiler cannot, in general, decide by static analysis whether an array whose declaration was skipped is actually used afterwards, since that amounts to the halting problem.
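
For illustration, here is a minimal sketch of how the program from the question could be restructured so that the goto no longer jumps into the scope of the VLA. The function name print_copy and its return value are hypothetical, introduced only for this example. The idea is to confine the VLA to an inner block that ends before the label, so the constraint quoted above is not violated.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints a copy of arg, or does nothing if arg is NULL.
   Returns the number of characters copied. */
size_t print_copy(const char *arg)
{
    size_t copied = 0;

    if (arg == NULL)
        goto end;   /* allowed: the label is outside the scope of s */

    {
        /* The scope of the VLA is limited to this inner block. */
        char s[strlen(arg) + 1];
        strcpy(s, arg);
        printf("s=%s\n", s);
        copied = strlen(s);
    }   /* the lifetime of s ends here, before the label */

end:
    return copied;
}
```

Calling print_copy(argv[1]) from main when argc >= 2 (and print_copy(NULL) otherwise) would reproduce the behavior the question was after.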

alinsoar
  • So there's no technical reason (the `alloca` might as well have been a `malloc`) for banning it, simply just that the C standard tries to make VLAs amenable to static analysis, probably on the basis of them being statically declared (but they aren't completely amenable to it anyway, given their dynamic size). It seems to me like standardizing `alloca` instead of VLAs would have been a much better decision. – Petr Skocik Jun 02 '17 at 16:09
  • @PSkocik In my opinion this restriction imposed by the definition of the language is a `brute-force` way to avoid the `halting-problem`. – alinsoar Jun 02 '17 at 16:11

Besides the issue with deallocating a VLA whose declaration the program has jumped over, there is also an issue with sizeof.

Imagine your program was extended with this:

end:
    printf("size of str: %zu\n", sizeof s);
    return 0;
}

For the alloca version, sizeof s == sizeof(char*), which can be computed at compile time, and all is well. For the VLA version, however, the length of s was never evaluated because the declaration was skipped, so sizeof s cannot be computed.
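
The difference can be seen in a small sketch. The helper names pointer_size and vla_size are hypothetical and exist only for this example; vla_size assumes n is greater than zero.

```c
#include <stddef.h>

/* Hypothetical helper: size of a pointer object, known at compile time. */
size_t pointer_size(void)
{
    char *s = NULL;
    return sizeof s;          /* always sizeof(char *) */
}

/* Hypothetical helper: size of a VLA with n elements (assumes n > 0). */
size_t vla_size(size_t n)
{
    char s[n];                /* n is evaluated here, at run time */
    return sizeof s;          /* evaluated at run time, yields n */
}
```

Here vla_size(10) returns 10, while pointer_size() returns the same value no matter how much memory the pointer would later refer to.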

Kninnug
  • Your point is good. I just want to add that the value of `sizeof` is in fact computed not at the place of the `sizeof` expression but at the place of the declaration, and it is stored somewhere on the stack or in a register. So jumping over the initialization code at the definition and landing just before the `sizeof` makes it print garbage. – alinsoar Jun 02 '17 at 16:20
  • @alinsoar: Can you provide any evidence for your claim about how the size of the VLA is stored? – Jonathan Leffler Jun 02 '17 at 17:41
  • `int a[m]; m++; sizeof(a);` The size is computed at the definition, so it must be stored somewhere. – alinsoar Jun 02 '17 at 17:53
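
The snippet in the last comment can be turned into a small compilable sketch. The function name vla_count_after_increment is hypothetical, and m is assumed to be greater than zero. It only demonstrates the observable behavior (the size of a VLA does not change during its lifetime); it says nothing about where a particular implementation keeps that size.

```c
#include <stddef.h>

/* Hypothetical helper: returns the element count of a VLA after the
   variable used in its declaration has been changed. Assumes m > 0. */
size_t vla_count_after_increment(size_t m)
{
    int a[m];                 /* size fixed here: m * sizeof(int) */
    m++;                      /* does not affect a */
    return sizeof a / sizeof a[0];
}
```

For example, vla_count_after_increment(5) returns 5, not 6.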