
My web service writes several thousand transactions per minute, and we save them to the hard disk.

I was testing different ways to save these files, and I ran some tests comparing standard IO with memory-mapped files. In my results, writing the files (20 k text files) with memory-mapped files is about 4x faster than standard IO, and I was not able to find any disadvantages.

As I don't have much experience with this technology, do you think I may face any problems using them, or do you see any disadvantages?

Thanks!

EDIT 1: here is the source:

using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;

namespace FileWritingTests.Writers {
    public class MappedFileWriter : ITestWriter {
        public void Write(string content, string path, string id) {
            byte[] data = Encoding.UTF8.GetBytes(content);

            using (var fileStream = new FileStream(path, FileMode.Create, FileAccess.ReadWrite, FileShare.None))
            // Map the file at exactly the payload size; leaveOpen is true,
            // so the outer using still disposes the FileStream.
            using (MemoryMappedFile memoryMapped = MemoryMappedFile.CreateFromFile(fileStream, id, data.Length,
                MemoryMappedFileAccess.ReadWrite, new MemoryMappedFileSecurity(), HandleInheritability.Inheritable, true))
            // Dispose the view stream too, so the view is flushed and unmapped.
            using (var viewStream = memoryMapped.CreateViewStream()) {
                viewStream.Write(data, 0, data.Length);
            }
        }
    }
}
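
For reference, the standard-IO writer that the memory-mapped version was compared against is not shown in the question; a plain counterpart against the same ITestWriter interface could look like this (a hypothetical reconstruction, not the original code):

    using System.IO;

    namespace FileWritingTests.Writers {
        // Hypothetical baseline writer; the question does not include the original.
        public class StandardFileWriter : ITestWriter {
            public void Write(string content, string path, string id) {
                // File.WriteAllText creates (or overwrites) the file and
                // writes the string as UTF-8 in one call.
                File.WriteAllText(path, content);
            }
        }
    }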

and this is the tester:

    public TimeSpan Run(int iterations, Writers.ITestWriter tester, string subfolder) {
        Console.WriteLine(" -- Starting test {0} with {1} file writings", subfolder, iterations);

        // Time only the write loop itself.
        Stopwatch stopWatch = Stopwatch.StartNew();
        for (int i = 1; i <= iterations; i++) {
            tester.Write(transaction, this.path + "\\" + subfolder + "\\file_" + i + ".txt", i.ToString());
        }
        stopWatch.Stop();
        TimeSpan ts = stopWatch.Elapsed;

        Console.WriteLine(" -- Finished test {0} with {1} file writings. Time elapsed: {2} ms",
            subfolder, iterations, ts.TotalMilliseconds);
        return ts;
    }

The tester is called several times, and several types of writer are run for comparison.

Rafa
  • I would worry about what happens to the contents of the file if the program or the computer crashes. – AdrianHHH Mar 25 '14 at 10:38
  • Just an idea/hint/thought: I recently ran some tests on MMFs, but besides the crash problem I disliked the idea of presetting the size of the file up front (Hans wrote that it's not required in every case - I guess I never got to that point). In the end I used a PersistentDictionary backed by an ESENT database and fast serialization (protobuf). Combining the two, the application is now able to deal with 20000+ records per second synchronously, and the speed came pretty close to MMFs. – Linky Mar 25 '14 at 11:10
  • I have already had the issue where the computer crashed and the file was incomplete; that is even worse in my case. I need a good file or no file (see the sketch after these comments). With no file I can roll back previous transactions using the db... but a broken file is bad for me. – Rafa Mar 25 '14 at 12:24
  • 1
    Same for me, but in my opinion the chance of file corruption in memory mapped files is higher than with PersistentDictionary/Esent db as this one writes its journal. It may be worth a look: http://managedesent.codeplex.com/ – Linky Mar 25 '14 at 14:10
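
A common way to get the "good file or no file" behavior mentioned in the comments above is to write to a temporary file, flush it, and only then rename it into place; here is a minimal sketch (the AtomicFileWriter name and the .tmp suffix are made up, not from the thread):

    using System.IO;
    using System.Text;

    public static class AtomicFileWriter {
        // Hypothetical sketch: a crash mid-write leaves only the .tmp file,
        // never a truncated file at the final path.
        public static void Write(string content, string path) {
            byte[] data = Encoding.UTF8.GetBytes(content);
            string temp = path + ".tmp";
            using (var fileStream = new FileStream(temp, FileMode.Create, FileAccess.Write, FileShare.None)) {
                fileStream.Write(data, 0, data.Length);
                fileStream.Flush(true); // push the data through the OS cache to disk
            }
            // Same-volume rename: either the complete file appears at the
            // final path, or (on a crash before this line) no file does.
            File.Move(temp, path);
        }
    }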

1 Answer


The main disadvantage of MMFs is that they consume RAM, making the file system cache less effective. Not an issue with such small files.

Another disadvantage, although surely intentional, is that you can no longer measure the cost of writing the file. That's now a job that's done by the kernel, not your program anymore. It is still being done of course, there's no such thing as a free lunch. It is concurrent with the rest of your program's execution, free threading so to speak. Keep an eye on the CPU utilization of the "System" process in Task Manager. Again, very unlikely to be a problem with such small files.

It is a micro-optimization, blown away by the cost of creating the file, which hovers between 20 and 50 msec on a spindle disk drive. Don't forget to include that in your measurement. Writing the file data operates at memory-bus speeds, upwards of 5 gigabytes/sec depending on the kind of RAM the machine has. What you cut out are just the low-level WriteFile() calls; they are now done by the kernel. You could try testing with a FileStream, using the constructor that takes a bufferSize argument. The default is 4096 bytes; increase it to 32 KB so there's only one WriteFile() call, as sketched below. The chief advantage of doing it this way is that you don't have to guess the size of the MMF up front, which gets very ugly when you guess too low.
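
A minimal sketch of that suggestion, reusing the question's ITestWriter interface (the BufferedFileWriter name is made up here):

    using System.IO;
    using System.Text;

    namespace FileWritingTests.Writers {
        public class BufferedFileWriter : ITestWriter {
            public void Write(string content, string path, string id) {
                byte[] data = Encoding.UTF8.GetBytes(content);
                // A 32 KB buffer holds a whole 20 KB file, so the kernel sees
                // a single WriteFile() call when the stream flushes on dispose.
                using (var fileStream = new FileStream(path, FileMode.Create, FileAccess.Write,
                        FileShare.None, 32 * 1024)) {
                    fileStream.Write(data, 0, data.Length);
                }
            }
        }
    }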

Hans Passant
  • Thank you for the hints. I will add to my question the code I'm using for my tests and that I plan to use on my server. – Rafa Mar 25 '14 at 12:27