1

Suppose I have the following functions:

char* allocateMemory() 
{
    char str[20] = "Hello world.";
    return str;
}

int* another()
{
    int x = 5;
    return &x;
}

int _tmain(int argc, _TCHAR* argv[])
{
    char* pString = allocateMemory();
    printf("%s\n", pString);

    int* blah = another();
    printf("%d %d \n", blah, *blah);

    return 0;
}

The first printf prints random values, because str has local scope.

The second printf prints the expected values: blah is the address of x, and *blah is 5.

Why does local scope seem to affect only allocateMemory, which deals with an array, and not the function that returns a pointer to an integer?

Why does the first printf (using the returned char*) print random values, while the second one (using the returned int*) does not?

Mantracker

4 Answers

5

Both functions return a pointer to a local variable whose lifetime ends when the function returns. Dereferencing such a pointer is undefined behavior in both cases. These are some valid alternatives:

char* allocateMemory() 
{
    char* str = malloc(sizeof(char) * 20); // assuming C; the caller must free() the result
    if (str != NULL)
        strcpy(str, "Hello World.");
    return str; // valid: heap storage outlives the function call
}

const char* allocateMemory() 
{
    return "Hello world."; // valid: a string literal has static storage duration and must not be modified
}

int* another()
{
    int *x = malloc(sizeof(int)); // assuming C; the caller must free() the result
    if (x != NULL)
        *x = 5;
    return x; // valid: heap storage outlives the function call
}
Sadique
  • The second `allocateMemory` must return `char const*` to reflect the fact that it points to *"read only location"* as your comment correctly says! **Compilers don't read comments.**. – Nawaz Feb 02 '15 at 06:45
  • 2
    `Compilers don't read comments.` Loool. Makes sense. – Sadique Feb 02 '15 at 06:47
  • Umm really weird, I ran the code in VS2012 many times, always got the correct returns, which is why I was asking. In general though, the proper way to modify a string for example, is to do allocateMemory(char * blah), and modify blah correct? – Mantracker Feb 02 '15 at 07:31
  • Actually nevermind, you can do what I did, just have to store it on the heap, not on the stack – Mantracker Feb 02 '15 at 07:34
1

Change the first function to:

char* allocateMemory() 
{
    static char str[20] = "Hello world.";
    return str;
}

and see the difference.

And now the explanation:

When you return the address of local data (a variable or an array, it does not matter: both are automatic variables), you risk losing the data or corrupting memory. It was just luck that the integer was still correct after the second function call. If you return the address of a static variable instead, there is no such problem. You can also allocate memory on the heap for the data and return its address.
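One caveat with the static approach is that every call hands back the same storage, so a later call overwrites what an earlier caller is still pointing at. A minimal sketch of that effect, where `formatNumber` is a hypothetical helper and not part of the question's code:

```c
#include <stdio.h>

/* Hypothetical helper: formats n into a static buffer and returns it.
   The buffer outlives the call, but there is only ONE buffer, shared
   by all calls. */
const char *formatNumber(int n)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "%d", n);
    return buf;
}
```

If you call `formatNumber(1)` and then `formatNumber(2)`, both returned pointers are equal and both now read "2". That is well defined, unlike the original code, but it means the result should be used or copied before the next call.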

bolov
VolAnd
1
char str[20] = "Hello world.";

str is local to the function allocateMemory() and is no longer valid once the function returns, so accessing it after that point is undefined behavior.

int x = 5;

The same applies here also.

You can put your data on the heap instead; returning a pointer to heap memory is valid.

char *allocatememory()
{
   char *p = malloc(20); /* Now the memory allocated is on heap and it is accessible even after the exit of this function */
   return p; 
}
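The function above returns uninitialized memory, so here is a slightly fuller sketch that also copies the string in and handles allocation failure. The name `makeGreeting` is made up for illustration; the point is only that the caller takes ownership of the block and has to release it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of the greeting,
   or NULL if malloc fails. The caller owns the result and must free() it. */
char *makeGreeting(void)
{
    const char text[] = "Hello world.";
    char *p = malloc(sizeof text);      /* sizeof includes the terminating '\0' */
    if (p != NULL)
        memcpy(p, text, sizeof text);
    return p;
}
```

A caller would typically write `char *s = makeGreeting(); if (s != NULL) { puts(s); free(s); }`.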
Gopi
1

These are both, of course, UB, as the other answerers said. They also gave some good ways to do what you want to do in a proper fashion. But you were asking why does this actually happen in your case. To understand it, you need to understand what happens in the stack when you call a function. I'll try to provide a really simplified explanation.

When a function is called, a new stack frame is created on top of the stack. The function's local variables are stored in that stack frame. So, for the function

char* allocateMemory() 
{
    char str[20] = "Hello world.";
    return str;
}

The stack frame for allocateMemory will contain, besides some other stuff, the 20 elements of the string (char array) str.

For this function:

int* another()
{
    int x = 5;
    return &x;
}

The stack frame for another will contain the contents of the variable x.

When a function returns, the stack pointer, which marks the top of the stack, drops back down to where it was before the function was invoked. The memory is still there on the stack, though: it doesn't get erased, because that would be a costly and pointless process. What changes is that nothing protects this memory from being overwritten anymore: it has been marked "unneeded".

Now, what's the difference between your calls to printf? Well, when you call printf, it gets its own stack frame, which overwrites what was left of the previously called function's stack frame.

In the first case, you just pass pString to printf. Then printf overwrites the memory that once was the stack frame of allocateMemory, and the memory that was once str gets covered with stuff printf needs to work with string output, like iteration variables. Then it proceeds to try and get memory pointed to by the pointer you passed to it, pString... But it has just overwritten this memory, so it outputs what looks like garbage to you.

In the second case, you first got the value of the pointer blah, which resides in your local scope. Then you dereferenced it with *blah. Now comes the fun part: you've done the dereferencing before you've called another function which could overwrite the contents of the old stack frame. Which means the memory that was once the variable x in the function another is sort of still there, and by dereferencing the pointer blah, you get the value of x. And then you pass it to printf, but now, it doesn't matter that printf will overwrite another's stack frame: the values you passed to it are now sort of "safe". That's why the second call to printf outputs the values you expect.
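The ordering argument (arguments are evaluated and copied before the callee starts running) can be shown in a well-defined way. This is only an analogy: it uses a global variable in place of the dead stack slot, and the names `slot` and `clobberAndReport` are invented for the sketch.

```c
/* Stands in for the memory that used to hold x. */
static int slot = 5;

/* Hypothetical callee: overwrites the shared storage, the way printf's
   stack frame may overwrite another()'s old frame, then returns the copy
   it received as an argument. */
int clobberAndReport(int snapshot)
{
    slot = -1;          /* the "old" storage is now clobbered */
    return snapshot;    /* but the copy made before the call is unaffected */
}
```

Calling `clobberAndReport(slot)` returns 5 even though `slot` is -1 afterwards, because the argument was read before the callee ran. With a dangling pointer the read itself is still undefined behavior; it merely happens early enough that it often appears to work.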

I've heard of people who dislike using the heap so much that they use this "trick" in the following way: they form a stack array in a function and return a pointer to it, then, after the function returns, they copy its contents to an array in the caller's scope before calling any other function, and then proceed to use it. Never do this, for the sake of all the people who may read your code.
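If the goal is to avoid the heap, the well-defined way to do it is to let the caller own the array and pass it in, which is also what the asker suggests in the comments. A minimal sketch, assuming a hypothetical `fillGreeting` function:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the greeting into a buffer owned by the caller.
   Returns 1 if the whole string fit, 0 if it was truncated or an error occurred. */
int fillGreeting(char *buf, size_t size)
{
    int n = snprintf(buf, size, "%s", "Hello world.");
    return n >= 0 && (size_t)n < size;
}
```

The caller declares `char str[20];` in its own scope and calls `fillGreeting(str, sizeof str)`. The array lives as long as the caller's scope does, so no pointer ever outlives the object it refers to.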