Variably-modified types compatibility and its security implications

Question

I'm going through a surge of interest in C99's variably-modified type system. This question was inspired by this one.

Checking the code from this question, I discovered something interesting. Consider this code:

int myFunc(int, int, int, int[][100]);

int myFunc(int a, int b, int c, int d[][200]) {
    /* Some code here... */
}

This obviously won't (and does not) compile. However, this code:

int myFunc(int, int, int, int[][100]);

int myFunc(int a, int b, int c, int d[][c]) {
    /* Some code here... */
}

compiles without even a warning (on gcc).

That seems to imply that a variably-modified array type is compatible with any non-variably-modified array type!

But that's not all. You'd expect a variably-modified type to at least bother with which variable is used to set its size. But it doesn't seem to do so!

int myFunc(int, int b, int, int[][b]);

int myFunc(int a, int b, int c, int d[][c]) {
    return 0;
}

Also compiles without any error.

So, my question is: is this correct standardized behaviour?

Also, if a variably-modified array type would really be compatible with any array that has the same dimensions, wouldn't this mean nasty security problems? For example, consider the following code:

int myFunc(int a, int b, int c, int d[][c]) {
    printf("%zu\n", sizeof(*d) / sizeof((*d)[0]));
    return 0;
}

int main(){
    int arr[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    myFunc(0, 0, 100, &arr);

    return 0;
}

Compiles and outputs 100, no errors or warnings, nothing. As I see it, that means easy out-of-bounds array write even if you are strictly checking the size of your array via sizeof, not doing a single cast and even have all warnings turned on! Or am I missing something?

If you haven't already, try adding -std=c99 -pedantic-errors to your gcc compile line and see if that makes any difference. — jschultz410, Feb 18 '15 at 18:46
@jschultz410: good idea, but no, it makes no difference at all =( — , Feb 18 '15 at 18:48
There are many instances where it would be impossible for the compiler to statically deduce the value of c (e.g. - c is input from stdin). Therefore, it would often be impossible to do any kind of meaningful static type checking on such a function definition's parameters. It seems if you do this, then the compiler is saying "OK, I'll allow you to pass whatever you want as d, so long as its type is a doubly indexed array of ints. Good luck!" — jschultz410, Feb 18 '15 at 18:53
In such a function, what happens for different invocations with different values of c that advance d??? Does it do the right thing by dynamically figuring out how far it should advance in memory based on c? — jschultz410, Feb 18 '15 at 18:56
@jschultz410: I'm not sure I understand what you mean... Can you give an example? — , Feb 18 '15 at 19:00

score 4 · Accepted Answer · answered Feb 18 '15 at 19:30

C99, section 6.7.5.2 seems to be where the relevant rules are given. In particular,

Paragraph 6:

For two array types to be compatible, both shall have compatible element types, and if both size specifiers are present, and are integer constant expressions, then both size specifiers shall have the same constant value. If the two array types are used in a context which requires them to be compatible, it is undefined behavior if the two size specifiers evaluate to unequal values.

A previous, now-deleted answer also referenced paragraph 6. Commentary on that answer argued that the second sentence was subject to the condition at the end of the first, but that seems an unlikely reading. Example 3 of that section may clarify (excerpt):

int c[n][n][6][m];
int (*r)[n][n][n+1];
r=c;   // compatible, but defined behavior only if
       // n == 6 and m == n+1

That seems comparable to the example in the question: two array types, one having a constant dimension and the other having a corresponding variable dimension, and required to be compatible. Behavior is undefined (per comment in example 3 and one reasonable reading of 6.7.5.2/6) when at runtime the variable dimension differs from the compile-time constant dimension. And isn't undefined behavior what you would expect anyway? Else why raise the question?
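To make that reading concrete, here is a minimal sketch of the same situation outside a parameter list. The function name row_len_via_vm is hypothetical and used only for illustration. The initialization compiles because the two pointed-to array types are compatible, but under the reading above the behavior is defined only when n is 100 at runtime.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
   `fixed` points to rows of exactly 100 ints; `vm` points to rows of n ints.
   The two array types are compatible, so no diagnostic is required,
   but the behavior is defined only if n == 100 when this runs. */
size_t row_len_via_vm(int n, int (*fixed)[100])
{
    int (*vm)[n] = fixed;                 /* compiles; UB unless n == 100 */
    return sizeof *vm / sizeof (*vm)[0];  /* row length as seen through vm */
}
```

Called as row_len_via_vm(100, a) with an array declared as int a[2][100], the two size specifiers agree and the result is 100. Any other value of n falls into the undefined case described by paragraph 6.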

Supposing we can agree that behavior is undefined when such a mismatch occurs, I observe that compilers are in general not required to recognize undefined or possibly-undefined behavior, nor to issue any kind of diagnostic whatsoever if they do recognize such. I'd hope in this case that the compiler would be capable of warning about the possibly-undefined behavior, but it must successfully compile the code because it is syntactically correct and satisfies all applicable constraints. Note that a compiler capable of warning about such uses might not do so by default.
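Since no diagnostic can be relied on, one defensive option is to never type the dimension by hand at the call site. The following is only a sketch under that assumption: sum_rows and SUM_2D are hypothetical names, and the macro works only when its argument is a true two-dimensional array, not a pointer.

```c
#include <stddef.h>

/* Hypothetical function: sums `rows` rows of `cols` ints each.
   The row length of d is whatever the caller passes as cols. */
long sum_rows(size_t rows, size_t cols, int d[][cols])
{
    long total = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            total += d[i][j];
    return total;
}

/* Hypothetical wrapper: takes both dimensions from the array's own type,
   so the size specifiers cannot disagree with the array actually passed.
   Only valid for real 2-D arrays; a pointer argument gives wrong sizes. */
#define SUM_2D(arr) \
    sum_rows(sizeof (arr) / sizeof (arr)[0], \
             sizeof (arr)[0] / sizeof (arr)[0][0], (arr))
```

This does not make a mismatch impossible, since sum_rows can still be called directly with a wrong cols. It only removes the most common way of getting the dimension wrong.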

Thanks, that's a great answer! I think you're right, my reading of this rule was probably incorrect. I was probably too used to type incompatibility being UB cause Number One... That leaves the C standard in the clear, but not gcc =) anyway, ranting about it not giving warnings will probably not do much good... — , Feb 18 '15 at 19:39
And if it was up to me, I'd explicitly declare VM types incompatible to everything just to be sure =) Makes sense, IMO. — , Feb 18 '15 at 19:48
But @Mints, VM types have to be compatible at least with themselves, else they would be unusable. You can still have undefined behavior with VM types having exactly the same declarator, however, when at runtime the variable dimensions differ. Given that issue, what's to be gained by making VM types automatically incompatible with everything else? This is another example of C giving the programmer powerful tools, with which he can wreak powerful damage. C is not for weenies. (Not implying anything there.) — John Bollinger, Feb 18 '15 at 20:16

jschultz410 · Answer 2 · 2015-02-18T19:30:28.687

#include <stdio.h>

void foo(int c, char d[][c])
{
  fprintf(stdout, "c = %d; d = %p; d + 1 = %p\n", c, (void *)d, (void *)(d + 1));
}

int main()
{
  char x[2][4];
  char y[3][16];
  char (*z)[4] = y;  /* Warning: incompatible types */

  foo(4, x);
  foo(16, y);
  foo(16, x);        /* We are lying about x. What can / should the compiler / code do? */
  foo(4, y);         /* We are lying about y. What can / should the compiler / code do? */

  return 0;
}

Outputs:

c = 4; d = 0x7fff5b295b70; d + 1 = 0x7fff5b295b74
c = 16; d = 0x7fff5b295b40; d + 1 = 0x7fff5b295b50
c = 16; d = 0x7fff5b295b70; d + 1 = 0x7fff5b295b80
c = 4; d = 0x7fff5b295b40; d + 1 = 0x7fff5b295b44

So, foo() does dynamically figure out how far to advance d based on c, as your code also demonstrates.

However, it is often impossible for the compiler to statically determine if/when you call foo() incorrectly. It seems that if you do this, then the compiler is saying "OK, I'll allow you to pass whatever you want as d, so long as its type is a doubly indexed array of chars. Operations on the pointer d will be determined by c. Good luck!"

That is, yes: the compiler often can't statically type-check these kinds of parameters, so the standard almost certainly does not require compilers to catch every case in which a type incompatibility could be determined statically.
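The runtime dependence on c can also be checked without printing addresses. This is a small sketch with hypothetical helper names (row_stride, row_size). It only restates what the output above shows: both the pointer step and sizeof(*d) are computed from c on each call.

```c
#include <stddef.h>

/* Hypothetical helper: byte distance between d and d + 1.
   The pointed-to type is char[c], so the step is c bytes. */
size_t row_stride(int c, char d[][c])
{
    return (size_t)((char *)(d + 1) - (char *)d);
}

/* Hypothetical helper: sizeof applied to the variably-modified row type.
   Evaluated at run time from the value of c, not from the caller's array. */
size_t row_size(int c, char d[][c])
{
    return sizeof *d;
}
```

With char x[2][4] and char y[3][16], the honest calls row_stride(4, x) and row_stride(16, y) give 4 and 16. Neither function can see the real row length of the argument, which is why a wrong c goes unnoticed.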

Yes, as `sizeof(*d)` is calculated using `c`, as it is shown in my question =) — , Feb 18 '15 at 19:06
"However, it is typically impossible for the compiler to determine if/when you call foo() incorrectly". That is exactly the problem. Why am I even allowed to call `foo` if this completely kills static type-checking? Also, dynamic type-checking could be used. A VLA implementation would have to keep its size as some form of metadata anyway for dynamic `sizeof` calls, so why not, say, segfault or go UB if this size doesn't match the size of the type of the parameter it was passed as? — , Feb 18 '15 at 19:26
@Mints97: C does not require dynamic type checking. If the size doesn't match, the behavior is undefined; it *does*, as you say, "go UB". Undefined behavior doesn't mean your program will crash. It means the behavior is undefined. That often includes the program appearing to "work". — Keith Thompson, Feb 18 '15 at 19:36
