
I am working on a C# project in which I have several configuration files. Each of these files contains a JSON Object. During the whole lifecycle of the program, these files can be read or be written at various moments.

This program manages an industrial machine which, for various reasons, can be turned off at any moment. Turning off the machine instantly cuts power to the computer on which my program is running. The computer runs Windows 10 Pro x64 on an NTFS-formatted SSD.

When the machine is turned back on and my program restarts, it sometimes throws an exception while reading a configuration file, saying that the file does not contain a JSON object. When I open the file with Notepad, the file really is "empty". For example, instead of having a JSON object:

{ "key": value }

I have the following content:

NULNULNULNULNULNULNULNUL etc.

The file properties show the same file size whether the file contains a JSON object or is "empty". The same goes for the "size on disk" property.

I have other configuration files that are read and written but as plain text, which are not affected.

This issue does not arise at each power off / power on, and does not affect each configuration file. It mostly appears with the same file but not always.

I've checked if the configuration files are correctly closed whenever I read or write them:

Read file:

JObject jsondata = JObject.Parse(File.ReadAllText(Path));

Write file:

File.WriteAllText(Path, jsondata.ToString());

Both methods (ReadAllText and WriteAllText) specify that they open, read and close the file.

These calls are wrapped in try/catch blocks, and I have never had an issue with a malformed JSON structure or a null object. If I'm correct, even an empty JSON object would write at least the braces {} into the file.

I've tried to back up my configuration files programmatically into another folder. The backup is done without reading the files (using the File.Copy() method):

  • Periodically (every 10 minutes), update the backup files with the latest configuration files.

  • If a configuration file is "empty" (by checking if all bytes in file equal 0), replace it with the corresponding backup file.

        // Check if any file has been modified since last check.
        // The directory is listed once instead of on every loop iteration.
        foreach (string FilePath in Directory.GetFiles(_FolderToBackup))
        {
            // Name of the file to check and path of its backup copy
            string FileName = Path.GetFileName(FilePath);
            string BackupFilePath = Path.Combine(_BackupFolder, FileName);

            // Check if backup file with same name exists in Backup folder
            if (BackupFileExist(FileName))
            {
                // If backup file is empty
                if (isFileEmpty(BackupFilePath))
                {
                    Log.Write("Backup file " + BackupFilePath + " is empty");

                    // Copy file to backup folder. The source is not checked here,
                    // because the destination is already empty.
                    File.Copy(FilePath, BackupFilePath, true);
                }

                // If file to backup is empty
                if (isFileEmpty(FilePath))
                {
                    Log.Write("File " + FilePath + " is empty");

                    // Copy backup file back to folder to backup
                    File.Copy(BackupFilePath, FilePath, true);
                }

                // If no file is empty, update only files that have been modified since last check
                if (new FileInfo(FilePath).LastWriteTime > new FileInfo(BackupFilePath).LastWriteTime)
                {
                    File.Copy(FilePath, BackupFilePath, true);
                }
            }

            // If backup file does not exist
            else
            {
                File.Copy(FilePath, BackupFilePath);
            }
        }
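
The `isFileEmpty` helper called above is not shown in the question. A minimal sketch of the "all bytes equal 0" check it describes could look like the following. The class and method names (`FileChecks`, `IsAllZeroBytes`) are hypothetical, and treating a zero-length file as "empty" is an assumption.

```csharp
using System.IO;

static class FileChecks
{
    // Hypothetical helper: returns true when the file is zero-length
    // or consists solely of NUL (0x00) bytes.
    public static bool IsAllZeroBytes(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            byte[] buffer = new byte[4096];
            int read;

            // Read the file in chunks so large files are not loaded at once
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; ++i)
                {
                    if (buffer[i] != 0)
                    {
                        return false; // found a non-NUL byte: real content
                    }
                }
            }
        }

        return true; // no bytes at all, or only NUL bytes
    }
}
```
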
    

This workaround works well when only a configuration file is "empty". However, sometimes after I turn the machine off and on, both the configuration file and its backup file are empty.

I also once ended up with an empty configuration file on machine restart even though the power-off happened while my code wasn't running.

At this point, I don't know whether my issue is related to the power off/on or to the way I read and write my files:

  • Why does it happen when the computer is shut down / turned on?

  • Why does it affect only my JSON configuration files?

  • Why does it empty the files rather than corrupt them?

  • Why does it happen even if the file is not open in my program?

Thank you very much for your time.

Nicola
  • What filesystem are these files being written to/read from, and on what version of Windows? – Tao Jan 07 '19 at 17:05
  • Windows 10 Pro x64 running on a NTFS formatted SSD. – Nicola Jan 07 '19 at 17:15
  • When an IO method says it closes the file, it does not necessarily mean **immediately**. Similarly for writes: IO is expensive and these operations are either deferred or buffered. – Ian Kemp Jan 07 '19 at 18:11
  • BTW, there is a simple solution to your problem that doesn't involve any code... it's called a UPS. Also probably cheaper, and will prevent any other filesystem corruption on your SSD. – Ian Kemp Jan 07 '19 at 18:13
  • @Ian Kemp, I have thought about a UPS, but it has several drawbacks: 1) battery life, 2) increased cost of our machines, 3) dust and heat. – Nicola Jan 07 '19 at 18:26
  • @Nicola A UPS that you have to replace every 6 months will almost certainly end up being **cheaper** than the dev time you're spending to try to solve this issue! Sometimes pure economics is the simplest solution. :) – Ian Kemp Jan 09 '19 at 07:23
  • Not really, the answer Jesse C. Slicer provided works and only took me 5 minutes to put in place. Putting a UPS in place (redesigning our machine, configuring the computer to shut down, extra maintenance) would be far more expensive. – Nicola Jan 09 '19 at 12:45

2 Answers


Looking at the source for File.WriteAllText(), it seems that your data could be the victim of buffering (the buffer size appears to be 1K). If you want the data written through to disk immediately, you'll need your own method:

    using (Stream stream = File.Create(yourPath, 64 * 1024, FileOptions.WriteThrough))
    using (TextWriter textWriter = new StreamWriter(stream))
    {
        textWriter.Write(jsonData);
    }
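
As a hedged variation on the snippet above, the write can also be followed by an explicit flush. This sketch assumes .NET Framework 4 or later, where `FileStream.Flush(true)` asks the OS to flush its intermediate buffers as well. The `DurableWriter` class and `WriteAllTextDurably` method are hypothetical names. It only narrows the window in which a power cut can damage the file: the write is still not atomic, because `FileMode.Create` truncates the existing file first.

```csharp
using System.IO;
using System.Text;

static class DurableWriter
{
    // Hypothetical replacement for File.WriteAllText that bypasses
    // write caching where possible and flushes before closing.
    public static void WriteAllTextDurably(string path, string contents)
    {
        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write,
                                           FileShare.None, 64 * 1024, FileOptions.WriteThrough))
        using (var writer = new StreamWriter(stream, new UTF8Encoding(false)))
        {
            writer.Write(contents);
            writer.Flush();      // push the StreamWriter's buffer into the FileStream
            stream.Flush(true);  // ask the OS to flush its buffers to the device
        }
    }
}
```

Usage would mirror the original call: `DurableWriter.WriteAllTextDurably(Path, jsondata.ToString());`
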
Jesse C. Slicer
  • Thank you for your answer. If I understand correctly, WriteAllText() would not write all the content into the file when the method is called, but places the data in a temporary buffer ? – Nicola Jan 07 '19 at 18:16
  • That is correct - and I'm not referring to C# or .NET here, but rather at the OS level. The `FileOptions.WriteThrough` indicates it is to be written immediately. – Jesse C. Slicer Jan 07 '19 at 18:49
  • @JesseC.Slicer - are you saying that with "WriteThrough", writes become atomic? Or just that the window of errors/problems is likely to become (much) smaller? – Tao Jan 08 '19 at 07:24
  • @Tao the latter. – Jesse C. Slicer Jan 08 '19 at 14:35
  • I've tested this solution a couple dozen times and it seems to work. – Nicola Jan 08 '19 at 18:25

Non-authoritative answer, but while googling "non-atomic writes windows" I stumbled across a really interesting article that suggests what you're experiencing is reasonably normal even on NTFS: https://blogs.msdn.microsoft.com/adioltean/2005/12/28/how-to-do-atomic-writes-in-a-file/

If I've understood correctly, then for your use case it recommends the following:

  • Do your writes (your JSON config file write) to a temporary file
    • (if power fails here, you've just lost this round of changes, the original file is fine)
  • "Flush the writes" (not sure what the right way to do that is in your environment, but the question "How to ensure all data has been physically written to disk?" explores exactly that), or do the write with FileOptions.WriteThrough as outlined by @JesseC.Slicer
    • (if power fails here, you've just lost this round of changes, the original file is fine)
  • Rename the original file to an "I know I'm doing something dangerous" naming format, eg with a specific suffix
    • (if power fails here, you don't have a main config file, you've lost this round of changes, but you can still find the backup)
  • Rename the temporary file to the final/original name
    • (if power fails here, you have a main updated config file AND a redundant outdated "temporarily renamed" file)
  • Delete the temporarily renamed file
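
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the article's code: the `.tmp` and `.old` suffixes, the `SafeConfigWriter` class, and the `Save` method are hypothetical names, and `Flush(true)` is used as one possible way to "flush the writes".

```csharp
using System.IO;
using System.Text;

static class SafeConfigWriter
{
    // Hypothetical suffixes for the temporary and "temporarily renamed" files
    const string TempSuffix = ".tmp";
    const string OldSuffix = ".old";

    public static void Save(string path, string json)
    {
        string tempPath = path + TempSuffix;
        string oldPath = path + OldSuffix;

        // Steps 1 and 2: write the new content to a temporary file and flush it
        using (var stream = new FileStream(tempPath, FileMode.Create, FileAccess.Write,
                                           FileShare.None, 4096, FileOptions.WriteThrough))
        {
            byte[] bytes = new UTF8Encoding(false).GetBytes(json);
            stream.Write(bytes, 0, bytes.Length);
            stream.Flush(true);
        }

        // Step 3: rename the original file out of the way
        if (File.Exists(path))
        {
            // A leftover renamed file is redundant while the main file exists
            File.Delete(oldPath);
            File.Move(path, oldPath);
        }

        // Step 4: rename the temporary file to the final name
        File.Move(tempPath, path);

        // Step 5: delete the temporarily renamed original
        File.Delete(oldPath);
    }
}
```
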

All this of course assumes you're able to ensure the temp file is fully written before you start renaming things. If you've managed that, then at startup your process would be something like:

  • If a "temporarily renamed" file is found, then either delete it (if there is also a "main file"), or rename it to the main file name
  • Load the main file (should never be corrupted)
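
A minimal sketch of that startup check, assuming the "temporarily renamed" file is the main file name plus a hypothetical `.old` suffix (the `SafeConfigLoader` class and `Load` method are also hypothetical names):

```csharp
using System.IO;

static class SafeConfigLoader
{
    // Hypothetical suffix marking the "temporarily renamed" original file
    const string OldSuffix = ".old";

    public static string Load(string path)
    {
        string oldPath = path + OldSuffix;

        if (File.Exists(oldPath))
        {
            if (File.Exists(path))
            {
                // The main file is present, so the renamed copy is redundant
                File.Delete(oldPath);
            }
            else
            {
                // Power failed between the two renames: restore the original
                File.Move(oldPath, path);
            }
        }

        // Load the main file
        return File.ReadAllText(path);
    }
}
```
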
Tao
  • If I understand correctly, there will always be a rotation around 3 files which will be copied and renamed. I will give it a try but I am a bit afraid that multiplying write and copy / renaming files will increase the risk of errors. – Nicola Jan 07 '19 at 18:30
  • There are only 2 files involved - the original/previous one, and the new one. They do get renamed a couple times, so there are 3 filenames involved in a given cycle, yes. The point of this arrangement is to ensure that you *always* have a single file with fully written and consistent data, and you know which one it is. That said, there IS additional complexity here as you noted - @JesseC.Slicer's suggestion is *much* simpler, and probably/presumably drops the rate of problems low enough for most purposes... – Tao Jan 08 '19 at 16:34