
I am working on a project involving data acquisition. One very important requirement is described like this:

  1. At the beginning of the recording, a file must be created and its headers must be written;
  2. As soon as data acquisition starts, the application should keep saving collected data to the file periodically (typically once per second);
  3. Writing consists of appending data blocks to the file, atomically if possible;
  4. Should any error occur (program error, power failure), the file must remain valid up to the last write before the error.

So I plan to use a thread to watch for received data and write it to the file, but I don't know which practice is best (the code below is not real, just to give the feeling):

First option: Single open

using (var fs = new FileStream(filename, FileMode.CreateNew, FileAccess.Write))
    fs.Write(headers, 0, headers.Length);

using (var fs = new FileStream(filename, FileMode.Append, FileAccess.Write))
{
    while (acquisitionRunning)
    {
        Thread.Sleep(100);
        if (getNewData(out _someData))
        {                        
            fs.Write(_someData, 0, _someData.Length);
        }
    }
}

Second option: Multiple open

using (var fs = new FileStream(filename, FileMode.CreateNew, FileAccess.Write))
    fs.Write(headers, 0, headers.Length);

while (acquisitionRunning)
{
    Thread.Sleep(100);
    if (getNewData(out _someData))
    { 
        using (var fs = new FileStream(filename, FileMode.Append, FileAccess.Write))
        {                       
            fs.Write(_someData, 0, _someData.Length);
        }
    }
}

The application is supposed to run on a client machine, and file access by other processes should not be a concern. What I am most concerned about is:

  1. Does repeatedly opening and closing the file impact performance (mind that the typical write interval is once per second)?
  2. Which one is best for keeping the file's integrity safe in the event of failure (explicitly including power failure)?
  3. Is either of these forms considered a particularly good or bad practice, or can either be used depending on the specifics of the problem at hand?
heltonbiker
  • Closely related, but using perl (would it be different for C#?): http://stackoverflow.com/questions/4867468/should-i-keep-a-file-open-or-should-i-open-and-close-often – heltonbiker Mar 05 '15 at 14:58
  • Good question. 1. Multiple openings and closings of files do impact performance, e.g. looking up files on HDDs takes time. 2. I would reckon that spreading the disk access (read/write) over multiple opens increases the chance that the file gets corrupted by a power failure. 3. Either can be used depending on the specific problem at hand. – Blaatz0r Mar 05 '15 at 15:01

2 Answers


A good way to preserve file content in the event of a power outage etc. is to flush the file stream after each write. Note that plain FileStream.Flush() only pushes the .NET buffer to the operating system; to also ask the OS to write its cache through to the device, call Flush(true).
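A minimal sketch of that approach, built on the single-open variant from the question (the file name, header bytes, and 4-byte blocks are illustrative stand-ins, not part of the original code):

```csharp
using System;
using System.IO;

class FlushDemo
{
    static void Main()
    {
        string filename = "acquisition.dat"; // illustrative file name

        // Open once, write headers, then keep the stream open for appends,
        // forcing a flush after every block so a later failure cannot lose it.
        using (var fs = new FileStream(filename, FileMode.Create, FileAccess.Write))
        {
            byte[] headers = { 1, 2, 3, 4 };   // stand-in for real headers
            fs.Write(headers, 0, headers.Length);
            fs.Flush(true); // flushToDisk: true also empties the OS cache

            for (int i = 0; i < 3; i++)        // stands in for the acquisition loop
            {
                byte[] block = BitConverter.GetBytes(i); // 4-byte stand-in block
                fs.Write(block, 0, block.Length);
                fs.Flush(true); // a power cut after this point loses nothing
            }
        }

        Console.WriteLine(new FileInfo(filename).Length); // prints 16 (4 + 3 * 4)
    }
}
```

The trade-off is latency: Flush(true) forces a round trip to the device on every block, which is negligible at one write per second but would matter at high rates.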


As you've mentioned, other processes won't be accessing the file, so keeping it open won't complicate things, and it will also be faster. But keep in mind that if the app dies mid-write, the file may end with a partially written block, and you will probably need to handle that according to your scenario.
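One hedged way to handle that trailing-partial-block case (and so honor requirement 4) is to truncate the file back to a whole-block boundary when reopening it. This sketch assumes fixed-size headers and blocks; the sizes, file name, and helper name are all illustrative:

```csharp
using System;
using System.IO;

class RecoveryDemo
{
    const int HeaderSize = 4; // assumption: fixed-size header
    const int BlockSize = 8;  // assumption: fixed-size data blocks

    // Drop any trailing partial block left by a crash or power loss,
    // so the file ends on a whole-block boundary again.
    static void TruncatePartialBlock(string filename)
    {
        using (var fs = new FileStream(filename, FileMode.Open, FileAccess.ReadWrite))
        {
            long dataBytes = fs.Length - HeaderSize;
            if (dataBytes <= 0) return; // header incomplete or no data: leave as-is
            long wholeBlocks = dataBytes / BlockSize;
            fs.SetLength(HeaderSize + wholeBlocks * BlockSize);
        }
    }

    static void Main()
    {
        string filename = "recovery.dat";
        // Simulate a file whose last write was cut short: a header,
        // one full block, and 3 stray bytes of a second block.
        File.WriteAllBytes(filename, new byte[HeaderSize + BlockSize + 3]);

        TruncatePartialBlock(filename);
        Console.WriteLine(new FileInfo(filename).Length); // prints 12 (4 + 8)
    }
}
```

With variable-size blocks you would instead need a per-block length prefix or checksum to find the last complete record.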

pangular
  • Besides the locking issue (which I will research a bit on how to handle), would the _file content's_ integrity up to the last write be preserved? – heltonbiker Mar 05 '15 at 16:24
  • It might leave the file in a corrupted state, but even that may be acceptable in certain situations. You should probably check, e.g., how log4net implements file locking; that will give you some additional options and ideas. – pangular Mar 05 '15 at 19:36