When I teach C, I sometimes count on GCC to do the "convincing" part of some rules. For example, one should not assume that a local variable in a function retains its value between calls.

GCC has always helped me teach these lessons by putting garbage in local variables, so students understand what is happening.

Now, this piece of code is definitely giving me a hard time.

#include <stdio.h>

int test(int x)
{
        int y;
        if(!x)
                y=0;
        y++;
        printf("(y=%d, ", y);
        return y;
}

int main(void)
{
        int a, i;

        for(i=0; i<5; i++)
        {
                a=test(i);
                printf("a=%d), ", a);
        }
        printf("\n");
        return 0;
}

The output is:

(y=1, a=1), (y=2, a=2), (y=3, a=3), (y=4, a=4), (y=5, a=5),

But if I comment out the line:

       /* printf("(y=%d, ", y); */

Then the output becomes:

a=1), a=32720), a=32721), a=32722), a=32723),

I compile the code with the -Wall switch, but none of the warnings relate to using local variables without initializing them.

Is there any GCC switch that causes a warning, or at least explicitly shows some garbage? I tried the optimization switches, and that helped in that the output changed to this:

$ gcc test.c -o test -Wall -Os
$ ./test 
(y=1, a=1), (y=1, a=1), (y=1, a=1), (y=1, a=1), (y=1, a=1),
$ gcc test.c -o test -Wall -Ofast
$ ./test 
(y=1, a=1), (y=1, a=1), (y=1, a=1), (y=1, a=1), (y=1, a=1),
$ gcc test.c -o test -Wall -O0
$ ./test 
(y=1, a=1), (y=2, a=2), (y=3, a=3), (y=4, a=4), (y=5, a=5),
$ gcc test.c -o test -Wall -O1
$ ./test 
(y=1, a=1), (y=1, a=1), (y=1, a=1), (y=1, a=1), (y=1, a=1),
$ gcc test.c -o test -Wall -O2
$ ./test 
(y=1, a=1), (y=1, a=1), (y=1, a=1), (y=1, a=1), (y=1, a=1),
$ gcc test.c -o test -Wall -O3
$ ./test 
(y=1, a=1), (y=1, a=1), (y=1, a=1), (y=1, a=1), (y=1, a=1),

But y=1 in all cases is kind of tricky. Has the standard changed so that local variables are now initialized to zero?

Jonathan Leffler
DrBeco
  • I don't think C specifies a standard for initializing variables. In fact, I think the memory stays the way it is, so when you add to y, the value at y was the same as it was before, and then you increment it. – KrisSodroski Jun 13 '13 at 18:35
  • The coincidence is that the second call to the function is using the same block of RAM. – DrBeco Jun 13 '13 at 18:37
  • I don't think it's a coincidence, since the compiler will link y to a certain region of memory that is relative. It doesn't change y's relation to the first stack, so if the previous stack doesn't grow in size, y will always be at the same place! – KrisSodroski Jun 13 '13 at 18:38
  • Is the only way to make y change to use some `malloc` in function main? – DrBeco Jun 13 '13 at 18:41
  • @Magn3s1um: you are assuming `y` is on the stack. What if it's in a register instead? Undefined behaviour is undefined behaviour. You can't count on the results at all, even if you think you know why it might do what you want anyway. – Celada Jun 13 '13 at 18:41
  • Relying on the compiler to show obvious error behavior for programs with undefined behavior is a highly risky proposition. Especially when it's about uninitialized variables and use-after-free errors. – Sebastian Redl Jun 13 '13 at 18:43
  • To modify the stack on the fly you could alloca(). Then you will get different results assuming the variable is on the stack, because the stack layout changes. @SebastianRedl, I think for demonstrating the effects of an undefined behaviour that's ok IMO. – Devolus Jun 13 '13 at 18:44
  • While y could be in a register, in this case it's obviously not, since the memory stays the same until you increment it. Hence, y is mapped to a region of memory. You are correct though. Just check out the object file, and see if y gets leal'd or loaded. – KrisSodroski Jun 13 '13 at 18:45
  • @Devolus I disagree, actually. I have seen lots and lots of people say something to the point of, "but it worked when I tried it before" when confronted with a program with undefined behavior. I think getting C and C++ programmers into the habit of relying on trying things out to see if they work is a bad idea - these languages just don't work that way. – Sebastian Redl Jun 13 '13 at 18:47
  • @SebastianRedl, the problem is when people rely on "it works for me" instead of knowing what effects it will have. Many people think that "if it worked before" it was ok, which is not necessarily true. – Devolus Jun 13 '13 at 18:48
  • `-Wmaybe-uninitialized` gives no warning... – DrBeco Jun 13 '13 at 18:50
  • @Devolus Making people think that undefined behavior yields predictable effects of any kind is just another problem. OpenSSL tried to generate randomness by intentionally reading an uninitialized local variable - then some compiler came along and decided to just replace that read with a constant 0. Suddenly the randomness was gone. – Sebastian Redl Jun 13 '13 at 18:50
  • @SebastianRedl, that's a perfect example that people should NOT rely on undefined behaviour. :) – Devolus Jun 13 '13 at 18:52
  • You should use -Wextra as well as -Wall with GCC. -Wall was all warnings some years (decades?) ago and hasn't kept track with the newer warnings available; presumably because it would cause too much "noise" on older programs and the more important warnings would get missed. – Dipstick Jun 13 '13 at 20:23
  • It would obviously be a massive security hole to allocate memory to a process that has been used by another process without wiping the memory first. For various complicated reasons it is generally easier to wipe to all zeroes. This is obviously a problem for simple module tests where much of the memory is only exercised once with little reuse. There are normally ways, e.g. mallopt() on linux, to get malloc() to set the value of allocated memory to some non-zero value. You can make sure that most of the stack is non-zero by calling a function that declares a large array and fills it. – Dipstick Jun 13 '13 at 21:00

3 Answers


That's the problem with undefined behaviour: it's "undefined".

So, any set of results is entirely down to a combination of compiler/settings/what's in memory/interrupts.

You may chance upon some settings that output what you "expect", to demonstrate the problem - but that's just luck.

What you've discovered is actually more important - that the number of failure modes is wider than you can imagine (although luckily, none has reformatted your hard drive yet), and that the most pernicious and dangerous type of 'undefined behaviour' is that where the behaviour is actually 'as expected' for 99.99% of the time.

It's the 0.01% that gets you.

Roddy

Your program causes undefined behaviour. If you pass a non-zero value to test(), y never gets initialized. printf included or not, you can't rely on the results.

If you want a warning, clang will give you one with -Wsometimes-uninitialized:

example.c:6:12: warning: variable 'y' is used uninitialized whenever 'if'
      condition is false [-Wsometimes-uninitialized]
        if(!x)
           ^~
example.c:8:9: note: uninitialized use occurs here
        y++;
        ^
example.c:6:9: note: remove the 'if' if its condition is always true
        if(!x)
        ^~~~~~
example.c:5:14: note: initialize the variable 'y' to silence this warning
        int y;
             ^
              = 0
1 warning generated.

I tested with a couple of GCC versions I have on-hand, but none of them would produce a warning for me.
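
For contrast with the function in the question, here is a minimal sketch of the two well-defined alternatives. The names `test_fixed` and `test_counter` are hypothetical and chosen only for illustration. The first initializes `y` on every path, so the result no longer depends on leftover memory. The second uses `static`, which is what a program needs when it really does want a value to survive between calls.

```c
/* Hypothetical fixed version of test(): y has a value on every path. */
int test_fixed(int x)
{
        int y = 0;      /* initialized on every call, whatever x is */

        (void)x;        /* x no longer decides whether y is set */
        y++;
        return y;       /* well-defined: always 1 */
}

/* Hypothetical counter: static storage is what persists between calls. */
int test_counter(void)
{
        static int y = 0;   /* initialized once, before the first call */

        y++;
        return y;           /* 1, 2, 3, ... on successive calls */
}
```

Calling `test_fixed(i)` in the loop from the question should return 1 every time at any optimization level, while successive calls to `test_counter()` count upwards by design instead of by accident.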

Carl Norum
  • I know. I want to reinforce this lesson using GCC (so students can see for themselves some diagnostic produced by the compilation) – DrBeco Jun 13 '13 at 18:36
  • I just changed the code loop to `for(i=1; i<5; i++)` and the result was: `(y=1, a=1), (y=2, a=2), (y=3, a=3), (y=4, a=4),`. This is terrible in a class full of jokers. :) – DrBeco Jun 13 '13 at 18:39
  • I tested with the GCC switch `-Wmaybe-uninitialized`, but got no warning. Thank you anyway. Good tip. – DrBeco Jun 13 '13 at 18:49

Another possible approach to this idea might be to call another function in between calls to the test function. If the other function uses stack space, then it will likely end up changing the stack values. For example, perhaps add a function like this:

int changeStack( int x )
{
    int y2 = x + 100;
    return y2;
}

And then add a call to it:

    for(i=0; i<10; i++)
    {
            a=test(i);
            printf("a=%d), ", a);
            changeStack( i );
    }

It of course depends on optimization levels, but with the default gcc compile (gcc test.c), I got the following results after changing it to do that:

(y=1, a=1), (y=101, a=101), (y=102, a=102), (y=103, a=103), (y=104, a=104), (y=105, a=105), (y=106, a=106), (y=107, a=107), (y=108, a=108), (y=109, a=109),
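
A variation on the same idea, along the lines suggested in the comments on the question, is a helper that declares a large array and fills it with a recognizable non-zero pattern. This is only a sketch: `scribbleStack`, `SCRIBBLE_SIZE` and the `0xAA` pattern are hypothetical choices, and whether `test()` later sees any of these bytes is still undefined behaviour that depends on the compiler and the optimization level.

```c
#include <stddef.h>

#define SCRIBBLE_SIZE 1024   /* hypothetical size, chosen arbitrarily */

/* Hypothetical helper: writes 0xAA over a large local array, so the
 * stack region it occupied holds non-zero bytes after it returns. */
unsigned long scribbleStack(void)
{
        /* volatile keeps the compiler from discarding the stores */
        volatile unsigned char junk[SCRIBBLE_SIZE];
        unsigned long sum = 0;
        size_t i;

        for (i = 0; i < SCRIBBLE_SIZE; i++)
                junk[i] = 0xAA;
        for (i = 0; i < SCRIBBLE_SIZE; i++)
                sum += junk[i];

        return sum;   /* SCRIBBLE_SIZE * 0xAA */
}
```

It would be called in the loop in place of `changeStack( i )`; the return value exists only so the array is actually read back.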
Mark Wilkins
  • This approach may well work given GCC's propensity to store everything in memory if no optimizations are requested. Adding a `-O` flag will almost certainly make it go away, though. – Carl Norum Jun 13 '13 at 18:48
  • @CarlNorum: Yes, it is definitely a very compiler-specific and optimization-specific demonstration. The use of any optimization does indeed change the result. Tricky business trying to demonstrate undefined behavior. – Mark Wilkins Jun 13 '13 at 18:55
  • Mark, you got the point. I think I can use this example in class. Thanks! – DrBeco Jun 13 '13 at 19:02