This subprogram takes three user inputs: a text string, a path to a file, and a 1 digit flag. It loads the file into a buffer, then appends both the flag and the file buffer, in that order, to a char array that serves as a payload. It returns the payload and the original user string.
I received a bug where some of my string operations on the file buffer, flag, and payload appeared to corrupt the memory that the user_string was located in. I fixed the bug by swapping strcat(flag, buffer)
to strcpy(payload, flag)
, (which is what I intended to write originally), but I'm still perplexed as to what caused this bug.
My guess from reading the documentation (https://www.gnu.org/software/libc/manual/html_node/Concatenating-Strings.html , https://www.gnu.org/software/libc/manual/html_node/Concatenating-Strings.html) is that strcat
extends the to
string strlen(to)
bytes into unprotected memory, which the file contents loaded into the buffer copied over in a buffer overflow.
My questions are:
Is my guess correct?
Is there a way to reliably prevent this from occurring? Catching this sort of thing with an
if(){}
check is kind of unreliable, as it doesn't consistently return something obviously wrong; you expect a string of lengthfilelength+1
and get a string offilelength+1
.bonus/unrelated: is there any computational cost/drawbacks/effects with calling a variable without operating on it?
/*
user inputs:
argv[0] = tendigitaa/four
argv[1] = ~/Desktop/helloworld.txt
argv[2] = 1
helloworld.txt is a text file containing (no quotes) : "Hello World"
*/
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
int main (int argc, char **argv) {
char user_string[100] = "0";
char file_path[100] = "0";
char flag[1] = "0";
strcpy(user_string, argv[1]);
strcpy(file_path, argv[2]);
strcpy(flag, argv[3]);
/*
at this point printfs of the three declared variables return the same as the user inputs.
======
======
a bunch of other stuff happens...
======
======
and then this point printfs of the three declared variables return the same as the user inputs.
*/
FILE *file;
char * buffer = 0;
long filelength;
file = fopen(file_path, "r");
if (file) {
fseek(file, 0, SEEK_END);
filelength = ftell(file);
fseek(file, 0, SEEK_SET);
buffer = malloc(filelength);
printf("stringcheck1: %s \n", user_string);
if (buffer) {
fread(buffer, 1, filelength, file);
}
}
long payloadlen = filelength + 1;
char payload[payloadlen];
printf("stringcheck2: %s \n", user_string);
strcpy(payload, flag);
printf("stringcheck3: %s \n", user_string);
strcat(flag, buffer);
printf("stringcheck4: %s \n", user_string); //bug here
free(buffer);
printf("stringcheck5: %s \n", user_string);
payload; user_string; //bonus question: does this line have any effect on the program or computational cost?
return 0;
}
/*
printf output:
stringcheck1: tendigitaa/four
stringcheck2: tendigitaa/four
stringcheck3: tendigitaa/four
stringcheck4: lo World
stringcheck5: lo World
*/
note: taking this section out of the main program caused stringcheck
4 to segfault instead of returning "lo World". The behavior was otherwise equivalent.