
I'm updating my question; very sorry for asking it the wrong way.

I have now distilled my problem into a single self-contained piece of code:

#include <stdio.h>
#include <stdlib.h>

static __inline__ char* fileRead(char* file){
     FILE* fp;
     long fileSize;
     char* fileContents;

     fp = fopen ( file , "rb" );
     if(!fp){
          perror(file);
          exit(1);}

     /* this block writes the size of the file in fileSize */
     fseek( fp , 0L , SEEK_END);
     fileSize = ftell( fp );
     rewind( fp );

     /* allocate memory for entire content */
     fileContents = malloc(fileSize+1);
     if(!fileContents){
          fclose(fp);
          fputs("memory alloc fails",stderr);
          exit(1);}

     /* copy the file into the buffer */
     if(fread(fileContents, fileSize, 1, fp) != 1){
          fclose(fp);
          free(fileContents);
          fputs("entire read fails",stderr);
          exit(1);}

     /* close the file */
     fclose(fp);
     return fileContents;}

int main (){
     char* head10 = "";
     char* fileName = "testhtml.html";
     FILE* out = fopen(fileName, "w");

     head10 = fileRead("head10.html");
          printf("%s\n", head10);

     out = fopen(fileName, "wb");
          fprintf(out, "%s\n", head10);
          fclose(out);

     free(head10);
return 0;}

Here is the head10.html file.

I'm compiling it with -fsanitize=address, and I'm getting a heap-buffer-overflow. The error seems to be triggered at the line fprintf(out, "%s\n", head10);. head10 is the only malloc'd variable, so that makes sense.

I can print it without problems with printf, but when I try to write it to a file with fprintf, a heap-buffer-overflow is reported.

===EDIT=== Looks like the problem came from using fprintf with a malloc'd var, as fprintf itself uses malloc under the hood, so the original alloc gets lost, and memory leaks.

So I rewrote my functions without using malloc:

#define _POSIX_C_SOURCE 200809L /* for getline() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static __inline__ void fileReset(char* fileName){
     FILE* out = fopen(fileName, "w");
     fwrite("" , sizeof(char) , strlen("") , out );
     fclose(out);}

static __inline__ void fileAppend(char* fileName, char* string){
     FILE* out = fopen(fileName, "a"); /* using "a" to APPEND */
     if(fwrite(string , sizeof(char) , strlen(string) , out ) != strlen(string)){
               printf("==file write error\n");
               exit(EXIT_FAILURE);}
     fclose(out);}

static __inline__ void fileAppendFile(char* source, char* dest){
     FILE* in = fopen(source, "r");
     char *line = NULL;
     size_t len = 0;
     ssize_t read; /* getline() returns ssize_t: -1 on EOF or error */

     if(!in){
          perror(source);
          exit(EXIT_FAILURE);}

     while ((read = getline(&line, &len, in)) != -1) {
          fileAppend(dest, line);}

     free(line);
     fclose(in);}

int main (){
     char* fileName = "testhtml.html";
     char* theme = "dark";

     fileReset(fileName);
     fileAppendFile("head10.html", fileName);
     fileAppend(fileName, theme);
return 0;}

Thanks a lot for all the help. I'm very new to this and didn't know what -lasan was; now I know what an invaluable tool it is!

==EDIT-2== As pointed out by EmployedRussian, the problem in the original code was NOT fprintf but the missing terminating '\0'. See their answer below; it does fix my original code :)

nff
  • I doubt anyone here will debug your program for you. Does it happen in a debug build? If yes, did you try running it under a debugger? Also try `-fsanitize=address` (on Linux, it doesn't work on Windows). – HolyBlackCat Jun 06 '20 at 21:58
  • @HolyBlackCat the program runs fine on Linux. It's my first post here, can you suggest how I can make it easier for people to have a look at the project and help? – nff Jun 06 '20 at 22:10
  • I am missing something here. What html file is less than 9 characters? Even `` is more than that. The usual reason (apart from files not existing) for this symptom is that the code contains *undefined behaviour* somewhere, which means that it might or might not work. Concatenating a string is a favourite place for this to happen. – Weather Vane Jun 06 '20 at 22:15
  • @nff [Questions seeking debugging help](https://stackoverflow.com/help/on-topic) are required to contain the [minimum code necessary to reproduce the problem](https://stackoverflow.com/help/minimal-reproducible-example) in the question itself. I understand that you attempted to shrink the code to focus on the problem, but if the code still isn't small enough to post here, then I'm afraid that the question just isn't a good fit for stack overflow. – user3386109 Jun 06 '20 at 22:22
  • *"the program runs fine on Linux"* If you don't see any problems, it doesn't mean there are none. Some UB can make your code crash on Windows, and not (yet) crash on Linux. Run with the sanitizer and see if it reports anything. – HolyBlackCat Jun 06 '20 at 22:24
  • @HolyBlackCat uhhh ok, just ran it with the sanitizer on linux, got a bunch of memleaks, I'm going to have a look, thanks! – nff Jun 06 '20 at 22:48
  • @WeatherVane my html files are actually html segments, but besides that, I tried shrinking that one to see if I was somehow exceeding some array size, so I got to 9 chars, no error. But yeah, that is not the problem, I do indeed have LOTS of string concatenations, I'll investigate, thanks! – nff Jun 06 '20 at 22:51
  • Please don't present status codes as negative values when seeking help. If a program or shell gives a status code to you as a negative decimal value, convert it to a positive hexadecimal value. "-1073741819" means nothing to me until I convert it to 0xC0000005, which is instantly recognized as [`STATUS_ACCESS_VIOLATION`](https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-erref/596a1078-e883-4972-9bbc-49e60bebca55) because it's such a common error. – Eryk Sun Jun 07 '20 at 14:17
  • I have uploaded your code to our cloud IDE, here is the [link](https://cee.studio/?bucket=200608-PfB&name=FV3EM), but I don't have the data files to reproduce this problem. You can use the link and follow this [instruction](https://www.cee.studio/segfault.html) to debug this memory error. – stensal Jun 08 '20 at 04:09
  • thanks @stensal, I will add my head10.html data file and debug the memory error. – nff Jun 08 '20 at 09:07
  • let me know if you need any help. It should report the exact cause. – stensal Jun 08 '20 at 14:34
  • @stensal ok, the problem is fprintf does use a form of malloc under the hood, so the actual allocation of "head10" in my program is lost. In the end I rewrote my fileRead and fileWrite functions without using malloc. If it's appropriate I can paste the new version here. Thanks for all the help. – nff Jun 08 '20 at 16:14
  • Yes, you can paste the new version, please make sure your code can compile. – stensal Jun 08 '20 at 18:20
  • ok, @stensal, thanks, here is the working code, compiled and ran with libasan, no errors/leaks reports. Compiles and runs as expected on Linux and Win10. I'll edit my question with the new code. – nff Jun 08 '20 at 22:23
  • Can you please check if the answer fixed your code? if not, can you please also post the data file? You can use pastebin.com to create a link with your data. – stensal Jun 09 '20 at 05:39
  • @stensal both my new code and the fix suggested by EmployedRussian are solving the problem. I marked it as accepted answer. Also, I'll add a pastebin of the html to my original question, for reference. – nff Jun 09 '20 at 10:01
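
Eryk Sun's comment above about status codes can be checked with a few lines of C. This is only a sketch: the helper names `status_to_u32` and `format_status_hex` are made up for illustration and are not part of any Windows or C library API. It reinterprets the signed 32-bit value as unsigned and formats it as hexadecimal.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: reinterpret a signed 32-bit status as unsigned.
   Converting to an unsigned type is well defined (modulo 2^32). */
static uint32_t status_to_u32(int32_t status)
{
    return (uint32_t)status;
}

/* Hypothetical helper: write the status into buf as 0xXXXXXXXX.
   buf needs room for at least 11 bytes including the terminator. */
static void format_status_hex(int32_t status, char *buf, size_t size)
{
    snprintf(buf, size, "0x%08lX", (unsigned long)status_to_u32(status));
}
```

Calling `format_status_hex(-1073741819, buf, sizeof buf)` with a 16-byte `buf` yields the string `0xC0000005`, the value the comment identifies as `STATUS_ACCESS_VIOLATION`.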

1 Answer


> Looks like the problem came from using fprintf with a malloc'd var, as fprintf itself uses malloc under the hood, so the original alloc gets lost, and memory leaks.

I am afraid you learned the wrong lesson here.

While fprintf may indeed use malloc under the hood, your problem doesn't have anything to do with that.

I created a head10.html file containing abc\n (4 characters). Running your program with that input file produced:

==10173==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000015 at pc 0x7fb5db2c7054 bp 0x7ffd44e74de0 sp 0x7ffd44e74590
READ of size 6 at 0x602000000015 thread T0
    #0 0x7fb5db2c7053  (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x4d053)
    #1 0x5654101dd435 in main /tmp/foo.c:43
    #2 0x7fb5db0dde0a in __libc_start_main ../csu/libc-start.c:308
    #3 0x5654101dd199 in _start (/tmp/a.out+0x1199)

0x602000000015 is located 0 bytes to the right of 5-byte region [0x602000000010,0x602000000015)
allocated by thread T0 here:
    #0 0x7fb5db381628 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x107628)
    #1 0x5654101dd2db in fileRead /tmp/foo.c:20
    #2 0x5654101dd425 in main /tmp/foo.c:42
    #3 0x7fb5db0dde0a in __libc_start_main ../csu/libc-start.c:308

So the problem is that you allocated 5 bytes (as expected), but fprintf tried to read a 6th byte from that buffer.

Why would it do that? Because the format you used, %s, expects to find a terminating NUL character (i.e. it expects a properly terminated C-string), and you gave it a pointer to a non-terminated string with the following bytes:

a b c \n X

What value does the fifth byte contain? It's undefined (it came from malloc, and no value was written into it). Since that value is not NUL, fprintf tries to read the next (6th) byte, and that's when Address Sanitizer signals the error and aborts your program.
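
To make the role of the terminator concrete, here is a sketch of two ways to write a buffer whose length is already known, neither of which scans for a NUL. The helper names `write_bytes` and `print_bounded` are hypothetical. The first passes the length to `fwrite`; the second gives `%s` a precision, which caps how many bytes it may read. These are alternatives for raw data and do not replace terminating a buffer that is later used as a C-string.

```c
#include <stdio.h>

/* Hypothetical helper: write exactly len bytes of buf to out.
   No terminator is needed because the length is explicit.
   Returns 0 on success, -1 on a short write. */
static int write_bytes(FILE *out, const char *buf, size_t len)
{
    return fwrite(buf, 1, len, out) == len ? 0 : -1;
}

/* Hypothetical helper: same idea through fprintf. With a precision,
   %s reads at most that many bytes, so buf need not be terminated.
   Assumes len fits in an int. Output stops early at an embedded NUL. */
static int print_bounded(FILE *out, const char *buf, size_t len)
{
    return fprintf(out, "%.*s", (int)len, buf) < 0 ? -1 : 0;
}
```

With the 4-byte input `abc\n`, either call writes exactly those 4 bytes, whether or not a fifth byte exists.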

The correct fix is to NUL-terminate the string, like so:

 if (fread(fileContents, fileSize, 1, fp) != 1){ ... handle error
 fileContents[fileSize] = '\0';  // NUL-terminate the string.
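
Put together, a terminated version of the reader could look like the sketch below. It is not the asker's exact function: the name `read_file_text` is hypothetical, it returns NULL instead of calling `exit`, and it uses an element size of 1 in `fread` so that an empty file (where `fread` returns 0) is not treated as a failure.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole file into a malloc'd,
   NUL-terminated buffer. Returns NULL on failure; the caller
   frees the result. */
static char *read_file_text(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (!fp)
        return NULL;

    /* Determine the file size. */
    if (fseek(fp, 0L, SEEK_END) != 0) { fclose(fp); return NULL; }
    long size = ftell(fp);
    if (size < 0) { fclose(fp); return NULL; }
    rewind(fp);

    /* One extra byte for the terminator. */
    char *buf = malloc((size_t)size + 1);
    if (!buf) { fclose(fp); return NULL; }

    /* With element size 1, fread returns the number of bytes read. */
    size_t got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);
    if (got != (size_t)size) { free(buf); return NULL; }

    buf[size] = '\0';  /* NUL-terminate so %s stops inside the buffer */
    return buf;
}
```

A caller would check the result for NULL, pass it to `fprintf(out, "%s\n", ...)`, and `free` it afterwards.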
Employed Russian
  • oh wow. wow. That's amazing! Thanks a lot for taking the time. Of course, "fileContents" is NOT a 0-terminated string, it's just a raw file-read! I think I always assumed it was 0-terminated, or didn't even think about it. Tried it myself, it compiles and runs as expected, libasan doesn't raise any error. Thanks! – nff Jun 09 '20 at 09:56