What is the best option for writing (appending) records to a file in a highly parallel web environment (.NET 4, IIS 7)? I use an ashx HTTP handler to receive small portions of data that should be written to a file quickly. First I used:

    using (var stream = new FileStream(fileName, FileMode.Append, FileAccess.Write, FileShare.ReadWrite, 8192))
    {
        stream.Write(buffer, 0, buffer.Length);
    } 

But I noticed that some records were broken or incomplete, probably because of FileShare.ReadWrite. Next I tried changing it to FileShare.Read. There were no broken records then, but from time to time I got this exception: System.IO.IOException: The process cannot access the file ... because it is being used by another process.

Ideally I would like the operating system to queue concurrent write requests so that all the records would be written eventually. What file access API should I use?

PanJanek

2 Answers

There are two options, depending on the size of the data. If the records are small, probably the best option is to simply synchronize access to the file with a shared lock. If possible, it would also be a good idea to keep the file open (flushing occasionally), rather than constantly opening and closing it. For example:

class MeaningfulName : IDisposable {
    FileStream file;
    readonly object syncLock = new object();
    public MeaningfulName(string path) {
        file = new FileStream(path, FileMode.Append, FileAccess.Write,
           FileShare.ReadWrite, 8192);
    }
    public void Dispose() {
        if(file != null) {
           file.Dispose();
           file = null;
        }
    }
    public void Append(byte[] buffer) {
        if(file == null) throw new ObjectDisposedException(GetType().Name);
        lock(syncLock) { // only 1 thread can be appending at a time
            file.Write(buffer, 0, buffer.Length);
            file.Flush();
        }
    }
}

That is thread-safe, and a single instance could be shared by all the ashx handlers without issue.
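If you want to convince yourself that the lock prevents torn or lost records, a quick self-contained check along these lines works (the Appender class is just a trimmed copy of the one above; the path, thread count, and record size are arbitrary illustrative choices):

```csharp
using System;
using System.IO;
using System.Threading;

class AppendCheck
{
    class Appender : IDisposable
    {
        FileStream file;
        readonly object syncLock = new object();
        public Appender(string path) {
            file = new FileStream(path, FileMode.Append, FileAccess.Write,
                FileShare.Read, 8192);
        }
        public void Append(byte[] buffer) {
            if (file == null) throw new ObjectDisposedException(GetType().Name);
            lock (syncLock) { // only 1 thread appends at a time
                file.Write(buffer, 0, buffer.Length);
                file.Flush();
            }
        }
        public void Dispose() {
            if (file != null) { file.Dispose(); file = null; }
        }
    }

    static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "append-demo.bin");
        File.Delete(path);
        byte[] record = new byte[100]; // fixed-size record for easy counting
        using (var appender = new Appender(path))
        {
            var threads = new Thread[8];
            for (int i = 0; i < threads.Length; i++)
            {
                threads[i] = new Thread(() => {
                    for (int j = 0; j < 50; j++) appender.Append(record);
                });
                threads[i].Start();
            }
            foreach (var t in threads) t.Join();
        }
        // 8 threads x 50 records x 100 bytes, with no torn or lost writes
        Console.WriteLine(new FileInfo(path).Length); // prints 40000
    }
}
```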

However, for larger data, you might want to look at a synchronized reader-writer queue - i.e. all the writers (the ashx hits) throw data onto the queue, with a single dedicated writer thread dequeuing them and appending. That removes the IO time from the ashx request, although you might want to cap the queue size in case the writer can't keep up. There's a sample here of a capped synchronized reader/writer queue.
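A sketch of that queued approach using BlockingCollection, which is available in .NET 4 (the capacity of 1000, the class name, and the file path are illustrative assumptions, not prescribed values):

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

// Bounded producer-consumer appender: ashx handlers call Append, a single
// background task owns the FileStream and does all the actual IO.
class QueuedAppender : IDisposable
{
    readonly BlockingCollection<byte[]> queue =
        new BlockingCollection<byte[]>(boundedCapacity: 1000);
    readonly Task writer;

    public QueuedAppender(string path)
    {
        writer = Task.Factory.StartNew(() =>
        {
            using (var file = new FileStream(path, FileMode.Append,
                FileAccess.Write, FileShare.Read, 8192))
            {
                // GetConsumingEnumerable blocks until data arrives and
                // ends when CompleteAdding has been called.
                foreach (byte[] buffer in queue.GetConsumingEnumerable())
                {
                    file.Write(buffer, 0, buffer.Length);
                }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Called from the ashx handlers; blocks only when the queue is full,
    // which applies back-pressure if the writer can't keep up.
    public void Append(byte[] buffer) { queue.Add(buffer); }

    public void Dispose()
    {
        queue.CompleteAdding(); // let the writer drain the remaining items
        writer.Wait();
    }
}
```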

Marc Gravell
  • OK, adding the lock and keeping the file open seems to do the trick without decreasing performance, but there is another issue now. There is more than one file to write to, and it is switched each hour (a new file is created), so I need a thread-safe collection for storing the opened FileStreams and adding a new one when required. For now I use ConcurrentDictionary. Wondering if there is something faster? – PanJanek Dec 09 '11 at 08:14
  • @PanJanek are you ever writing to anything other than the current hour? I'm not sure why you need a dictionary... – Marc Gravell Dec 09 '11 at 08:21
  • Apart from changing the file from hour to hour, I write requests from different business clients (identified by one of the request parameters) to different files. There are about 2-3 different files opened each hour. – PanJanek Dec 09 '11 at 08:50

Unless you're using a web garden or web farm, I'd suggest using process-local locking (lock(){}), and performing as much processing as possible outside of the lock.

If you have multiple files you're writing to, see Better solution to multithreading riddle? for a good solution.
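For the multiple-file case, one process-local approach is a map from file path to lock object, so writes to different files don't block each other. This is only a sketch under the assumption of a single worker process; the class and method names are hypothetical:

```csharp
using System.Collections.Concurrent;
using System.IO;

// One lock object per target file; GetOrAdd creates it on first use.
static class PerFileWriter
{
    static readonly ConcurrentDictionary<string, object> locks =
        new ConcurrentDictionary<string, object>();

    public static void Append(string path, byte[] buffer)
    {
        object gate = locks.GetOrAdd(path, _ => new object());
        lock (gate) // serializes writers of this one file only
        {
            using (var stream = new FileStream(path, FileMode.Append,
                FileAccess.Write, FileShare.Read, 8192))
            {
                stream.Write(buffer, 0, buffer.Length);
            }
        }
    }
}
```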

Lilith River