0

This subprogram takes three user inputs: a text string, a path to a file, and a 1 digit flag. It loads the file into a buffer, then appends both the flag and the file buffer, in that order, to a char array that serves as a payload. It returns the payload and the original user string.

I received a bug where some of my string operations on the file buffer, flag, and payload appeared to corrupt the memory that the user_string was located in. I fixed the bug by swapping strcat(flag, buffer) to strcpy(payload, flag), (which is what I intended to write originally), but I'm still perplexed as to what caused this bug.

My guess from reading the documentation (https://www.gnu.org/software/libc/manual/html_node/Concatenating-Strings.html , https://www.gnu.org/software/libc/manual/html_node/Concatenating-Strings.html) is that strcat extends the to string strlen(to) bytes into unprotected memory, which the file contents loaded into the buffer copied over in a buffer overflow.

My questions are:

  1. Is my guess correct?

  2. Is there a way to reliably prevent this from occurring? Catching this sort of thing with an if(){} check is kind of unreliable, as it doesn't consistently return something obviously wrong; you expect a string of length filelength+1 and get a string of filelength+1.

  3. bonus/unrelated: is there any computational cost/drawbacks/effects with calling a variable without operating on it?

/*
user inputs:
argv[0] = tendigitaa/four
argv[1] = ~/Desktop/helloworld.txt
argv[2] = 1

helloworld.txt is a text file containing (no quotes) : "Hello World"
*/
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>

int main (int argc, char **argv) {

    char user_string[100] = "0";
    char file_path[100] = "0";
    char flag[1] = "0";

    strcpy(user_string, argv[1]);
    strcpy(file_path, argv[2]);
    strcpy(flag, argv[3]);

    /*
    at this point printfs of the three declared variables return the same as the user inputs.

    ======
    ======
    a bunch of other stuff happens...
    ======
    ======
    and then this point printfs of the three declared variables return the same as the user inputs.
    */

    FILE *file;
    char * buffer = 0;
    long filelength;

    file = fopen(file_path, "r");

    if (file) {
        fseek(file, 0, SEEK_END);
        filelength = ftell(file);
        fseek(file, 0, SEEK_SET);
        buffer = malloc(filelength);
        printf("stringcheck1: %s \n", user_string);
        if (buffer) {
            fread(buffer, 1, filelength, file);
        }
    }

    long payloadlen = filelength + 1;
    char payload[payloadlen];
    printf("stringcheck2: %s \n", user_string);
    strcpy(payload, flag);
    printf("stringcheck3: %s \n", user_string);
    strcat(flag, buffer);
    printf("stringcheck4: %s \n", user_string); //bug here
    free(buffer);
    printf("stringcheck5: %s \n", user_string);

    payload; user_string; //bonus question: does this line have any effect on the program or computational cost?

    return 0;
}

/*
printf output:

stringcheck1: tendigitaa/four
stringcheck2: tendigitaa/four
stringcheck3: tendigitaa/four
stringcheck4: lo World
stringcheck5: lo World
*/

note: taking this section out of the main program caused stringcheck 4 to segfault instead of returning "lo World". The behavior was otherwise equivalent.

chqrlie
  • 131,814
  • 10
  • 121
  • 189
garter_snake
  • 11
  • 1
  • 2

3 Answers3

2

strcat does exactly what documentation says:

char *strcat(char *restrict s1, const char *restrict s2); The strcat() function shall append a copy of the string pointed to by s2 (including the terminating null byte) to the end of the string pointed to by s1. The initial byte of s2 overwrites the null byte at the end of s1. If copying takes place between objects that overlap, the behavior is undefined.

s1 has to have enough memory allocated to accommodate both strings plus the terminating nul

The linked article is about programming own string concatenating functions. How to write such a function depends on the application - which is stated there. There are many ways.

In your program the destination char array is not big enough and the result is an Undefined Behaviour and it is not even big enough to accommodate a single character string.

I strongly advice to learn some C strings basics.

If you want safer strcat you can write your own one for example:

char *mystrcat(const char *str1, const char *str2)
{
    char *dest = NULL;
    size_t str1_length, str2_length;

    if(str1 && str2)
    {
        dest = malloc((str1_length = strlen(str1)) + (str2_length = strlen(str2)) + 1);
        if(dest)
        {
            memcpy(dest, str1, str1_length);
            memcpy(dest + str1_length, str2, str2_length);
        }
    }
    return dest;
}

But for the safety we always pay the price - the code is longer and less efficient. C language was designed to be as efficient as possible sacrificing the safety and introducing the idea if the Undefined Behaviour.

0___________
  • 60,014
  • 4
  • 34
  • 74
1

You can't store a non-empty string in a 1-character array. A string needs room for the string contents and a null terminator.

So when you declare

char flag[1] = "1";

you've only allocated one byte, which contains the character 1. There's no null terminator.

Using this with any string functions will result in undefined behavior, because they look for the null terminator to find the end of the string.

strcat(flag, buffer) will search for the null terminator, which will be outside the array, and then append buffer after that. So this clearly causes a buffer overflow when writing.

strcpy(payload, flag) is also wrong. It will look for a null terminator after the flag bytes to know when to stop copying to payload, so it will copy more than just flag (unless there happens to be a null byte after it).

You can resolve the strcpy() problem by increasing the size:

char flag[2] = "1";

You can also leave the size empty, the compiler will make it large enough to hold the string that initializes it, including the null byte:

char flag[] = "1";
Barmar
  • 741,623
  • 53
  • 500
  • 612
0

The line that causes the problem is because strcat() is trying to cram buffer into flag which is only one character long and you haven't allocated any more space to fit buffer.

If you want to put buffer into flag, I recommend using realloc() to increase the length of flag to include the length of buffer.

Also the only thing you ever print is user_string. I'm not sure if you're trying to print the other string you're working with.

  • 1
    You can't use `realloc()` on an array, it has to be a pointer. – Barmar Jun 12 '20 at 20:49
  • As pointed out in Barmar's answer, the buffer `flag` is not even large enough to store a string with a single character. Due to this, the contents of `flag` is not null-terminated. Therefore, the problem is not so much that `flag` needs more space to fit `buffer`, but rather that it is not a string in the first place (because strings must be null-terminated). For this reason, even if `buffer` were an empty string, the `strcat` operation would cause [undefined behavior](https://en.wikipedia.org/wiki/Undefined_behavior). – Andreas Wenzel Jun 12 '20 at 20:53