I am trying to solve an exercise that asks for a function returning a struct with the first whitespace-separated word of a given string and that word's length. Example: "Test string" returns {"Test", 4}.

To solve this problem I have implemented the following function:

struct string whitespace(char* s){
    char* t = s;
    size_t len = 0;
    while(*t != ' '){
        len++;
        t++;
    }
    char out[len+1];
    strncpy(out, s, len);
    if(len>0){
        out[len] = '\0';
    }
    //printf("%d\n",len);
    struct string x = {out, len};
    return x;
}

with the struct defined as follows:

struct string{
    char* str;
    size_t len;
};

If I run the following main function:

int main(){
    char* s = "Test string";
    struct string x = whitespace(s);
    printf("(%s, %d)\n", x.str, x.len);
    return 0;
}

I get this output:

(, 4)

whereas when I uncomment the line //printf("%d\n",len); I get:

4
(Test, 4)

In fact, the correct output (Test, 4) appears whenever I print any variable inside the function whitespace(char* s). Also, with gcc optimization flags such as -O3 or -Ofast, the result is correct even without printing anything inside the function.

Did I bump into some kind of undefined behavior? Can somebody explain what is happening here?

pr0f3ss

1 Answer

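The struct you're returning contains a char * that points to the local array out. That array's lifetime ends when the function returns, so the pointer is left dangling and dereferencing it invokes undefined behavior. The old stack memory may or may not still hold "Test" afterwards, which is why adding a printf or changing the optimization level changes what you see.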

Rather than using a VLA, declare out as a pointer and allocate memory for it with malloc. You can then safely store that address in the struct member, and the memory stays valid until you free it.

char *out = malloc(len+1);

Also, be sure to free this memory before exiting your program.
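Putting this together, here is a minimal sketch of how the whole function might look with heap allocation. The name first_word is hypothetical (not from the question), and the sketch assumes words are delimited by a single space character only. It uses strcspn so that the scan also stops at the end of the string. The loop in the question runs past the terminator when the input contains no space.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct string {
    char *str;   /* heap-allocated, NUL-terminated copy of the first word */
    size_t len;  /* length of that word, excluding the terminator */
};

/* Hypothetical replacement for whitespace(): copies the first
   space-delimited word of s into heap memory.
   On allocation failure, str is NULL and len is 0.
   The caller owns the returned str and must free() it. */
struct string first_word(const char *s) {
    assert(s != NULL);

    /* Count characters up to the first ' ' or the end of the string. */
    size_t len = strcspn(s, " ");

    struct string x = {NULL, 0};
    char *out = malloc(len + 1);
    if (out == NULL) {
        return x;
    }

    memcpy(out, s, len);
    out[len] = '\0';   /* always terminate, even when len is 0 */

    x.str = out;
    x.len = len;
    return x;
}
```

In main you would call struct string x = first_word(s);, print it with printf("(%s, %zu)\n", x.str, x.len); (size_t needs %zu rather than %d), and then call free(x.str);.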

dbush