
Currently I'm working on a small program that reads large files and sorts them. After some benchmarking I stumbled upon a weird performance issue: when the input file got too large, writing the output file took longer than the actual sorting. So I went deeper into the code and eventually realized that the fputs function might be the problem. So I wrote this little benchmarking program.

#include "stdio.h"
#include "ctime"

int main()
{
    int i;
    const int linecount = 50000000;
    // Test line of 184 bytes (including \r\n)
    const char* dummyline = "THIS IS A LONG TEST LINE JUST TO SHOW THAT THE WRITER IS GUILTY OF GETTING SLOW AFTER A CERTAIN AMOUNT OF DATA THAT HAS BEEN WRITTEN. hkgjhkdsfjhgk jhksjdhfkjh skdjfhk jshdkfjhksjdhf\r\n";
    clock_t start = clock();
    clock_t last = start;

    FILE* fp1 = fopen("D:\\largeTestFile.txt", "w");
    if(fp1 == NULL){
        perror("fopen");
        return 1;
    }
    for(i=0; i<linecount; i++){
        fputs(dummyline, fp1);
        if(i%100000==0){
            printf("%i Lines written.\r", i);
            if(i%1000000 == 0){
                clock_t ms = (clock()-last) * 1000 / CLOCKS_PER_SEC;
                printf("Writing %i lines took %ld ms\n", i, (long)ms);
                last = clock();
            }
        }
    }
    printf("%i Lines written.\n", i);
    fclose(fp1);
    clock_t ms = (clock()-start) * 1000 / CLOCKS_PER_SEC;
    printf("Writing %i lines took %ld ms\n", i, (long)ms);

    return 0;
}

When you execute the program, you can see a clear drop in performance after about 14 to 15 million lines, which is about 2.5 GB of data. The writing then takes about three times as long as before. The roughly 2 GB threshold suggests a 64-bit issue, but I haven't found anything about that on the web. I also tested whether there is a difference between binary and character mode (i.e. "wb" and "w"), but there is none. I also tried to preallocate the file size (to avoid file fragmentation) by seeking to the expected end and writing a zero byte, but that also had little to no effect.
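
For reference, the preallocation attempt looked roughly like this (a minimal sketch only, assuming MSVC's _fseeki64 for the 64-bit offset; the helper name preallocateFile is just for illustration):

#include <stdio.h>

/* Seek to the last byte of the expected file size and write a single
   zero byte, so the file system reserves the space up front. */
int preallocateFile(const char* path, long long expectedSize)
{
    FILE* fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    if (_fseeki64(fp, expectedSize - 1, SEEK_SET) != 0 || fputc(0, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return 0;
}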

I'm running a Windows 7 64-bit machine, but I've tested it on a Windows Server 2008 R1 64-bit machine as well. Currently I'm testing on an NTFS file system with more than 200 GB of free space. My system has 16 GB of RAM, so that shouldn't be a problem either. The test program only uses about 700 KB. The page faults, which I suspected earlier, are also very low (~400 page faults during the whole runtime).

I know that for such large amounts of data the fwrite() function would suit the task better, but at the moment I'm interested in whether there is another workaround and in why this is happening. Any help would be highly appreciated.
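
For comparison, here is roughly what the fwrite() variant of the write loop would look like (a sketch only, not my actual sorting program; the line length is computed once outside the loop instead of being rescanned by fputs on every call):

#include <stdio.h>
#include <string.h>

int main()
{
    int i;
    const int linecount = 50000000;
    /* placeholder: use the same 184-byte test line as in the benchmark above */
    const char* dummyline = "...";
    size_t len = strlen(dummyline);

    FILE* fp1 = fopen("D:\\largeTestFile.txt", "wb");
    if (fp1 == NULL)
        return 1;

    for (i = 0; i < linecount; i++)
        fwrite(dummyline, 1, len, fp1);   /* replaces fputs(dummyline, fp1) */

    fclose(fp1);
    return 0;
}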

Aceonline
  • What file system are you writing to? And how much memory does the box you are running this code on have? – m0ntassar Nov 10 '11 at 08:40
  • As I added above, the file system is NTFS. But I would really be interested in whether this code has the same issues on other file systems/OSes. – Aceonline Nov 10 '11 at 08:58
  • Did you defragment your disk prior to running the test? Btw: is there something like /dev/null on Windows? – alk Nov 10 '11 at 09:04
  • Yes, the fragmentation shouldn't be the bottleneck. – Aceonline Nov 10 '11 at 09:06

1 Answer


The main reason for all this is the Windows disk cache. When your program eats up all the RAM for it, swapping begins, and with it the slowdown. To fight this you need to:

1) Open the file in commit mode using the c flag:

FILE* fp1 = fopen("D:\\largeTestFile.txt", "wc");

2) Periodically write the buffer to disk using fflush:

if(i%1000000 == 0)
{
    // write content to disk
    fflush(fp1);

    clock_t ms = (clock()-last) * 1000 / CLOCKS_PER_SEC;
    printf("Writing %i lines took %ld ms\n", i, (long)ms);
    last = clock();
}

This way you will use a reasonable amount of disk cache, and speed will basically be limited by the speed of your hard drive.
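
Putting both points together, the write loop from the question would look roughly like this (a sketch of the above advice, not the exact code from either post; the c flag is a Microsoft-specific fopen extension, and the final fflush before fclose follows the suggestion in the comments below):

#include <stdio.h>
#include <time.h>

int main()
{
    int i;
    const int linecount = 50000000;
    /* placeholder: use the same 184-byte test line as in the question */
    const char* dummyline = "...";
    clock_t last = clock();

    /* "c" = commit mode: fflush writes the buffered data through to disk */
    FILE* fp1 = fopen("D:\\largeTestFile.txt", "wc");
    if (fp1 == NULL)
        return 1;

    for (i = 0; i < linecount; i++) {
        fputs(dummyline, fp1);
        if (i % 1000000 == 0) {
            fflush(fp1);   /* flush periodically so the disk cache never balloons */
            printf("Writing %i lines took %ld ms\n",
                   i, (long)((clock() - last) * 1000 / CLOCKS_PER_SEC));
            last = clock();
        }
    }
    fflush(fp1);           /* final flush so fclose does not stall on cached data */
    fclose(fp1);
    return 0;
}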

Petr Abdulin
  • Sounds reasonable. I also suspected something like that. But I've tried your solution and it makes the performance even worse: now the performance is the same even for the first 14 million lines. – Aceonline Nov 10 '11 at 09:25
  • Performance is not worse; it's limited by the speed of your hard drive. In your original source code, your data was never actually written to the file right away (so the final write will take quite some time). Your "first 14 million lines" speed is also limited by your hard drive, since swapping is used. – Petr Abdulin Nov 10 '11 at 09:28
  • OK, I understand what you're saying. But the fact is that it took 100 sec before I included the fflush and now it takes 156 sec. – Aceonline Nov 10 '11 at 09:42
  • So adding more RAM would help as the cache could grow larger? Or is the cache limited to 2GBytes? – alk Nov 10 '11 at 09:52
  • Now, I can only assume, but I think that's because Windows writes data from the cache to the file in the background, after you close the file. If you insert `fflush(fp1);` just before closing the file handle, you will get the same time in both cases. Also be sure to turn off any antivirus software. – Petr Abdulin Nov 10 '11 at 10:00
  • @alk yes, the cache is not limited to 2 GB. But Windows manages the disk cache size on its own; it can't be adjusted to some limit, AFAIK. So it's not necessarily true that it will use additional RAM for the cache. Cache size management is also different on desktop and server Windows. – Petr Abdulin Nov 10 '11 at 10:05
  • @Petr Abdulin: I think you're right. The time used is roughly the same when I add an fflush before the fclose (in "wc" mode). I will try to increase the disk cache and see what happens. – Aceonline Nov 10 '11 at 10:26
  • I also tried the program on an SSD drive on a Windows Server 2008 R1 machine. The problem doesn't show up there. I assume that the writing process is faster, so the cache won't fill up as fast. Or the caching behaviour of the server is different. – Aceonline Nov 10 '11 at 10:27
  • @user1039287 I've tested your program on Server 2008 R2 x64 (6 GB RAM), and I have the same symptoms (the only difference is the number of lines at which the slowdown starts), so Server is not free of it. I also monitored memory usage, which led me to my answer. The SSD must be much faster, so the problem is not as obvious. – Petr Abdulin Nov 10 '11 at 11:38
  • I have tried to increase the disk cache to the maximum with the tool AnalogX CacheBooster, but there seems to be a limit of 2 GB (at least in the tool). The increase to 2 GB also didn't change the performance. That seems to correlate with my observation of the 2.5 GB threshold (the additional ~500 MB are written while the cache fills up). But I haven't found anything on a 2 GB limit of the Windows disk cache on the web. Strange. – Aceonline Nov 15 '11 at 09:12
  • I have tried the test program on several other systems and configurations now. With Windows Server 2003 the behaviour seems totally different; it seems that the caching strategy has changed. On Windows Server 2003, the program takes long pauses at intervals of a few million lines. If I add the flush as suggested by Petr Abdulin, the writing rate seems stable, but also very slow. Since it's a 32-bit system, its performance for such a large file is very poor. – Aceonline Nov 22 '11 at 13:45