0

I have a file that I am copying to some location. Here is the code snippet:

//Document Status = Pending
var triggerFileWriter = new StringWriter();
triggerFileWriter.WriteLine("Only for test");
System.IO.File.WriteAllText(fullTriggerFilename, triggerFileWriter.ToString());
triggerFileWriter.Dispose();

if (System.IO.File.Exists(fullTriggerFilename))
{
    // Document Status = Processed
}

Is a `File.Exists` check sufficient to update the document status?
I am not worried if the file is not copied and the document status is not updated, because a timer job runs every 10 minutes and 'Pending' items will be picked up automatically in the next run.

Is there any possibility of the file copy being interrupted, leaving a file that exists but was not copied completely? What changes can I make to my code to handle that case?

Thank you!

inutan
  • There is a *write cache*, so you have to [flush](http://stackoverflow.com/q/383324/1997232) to ensure that it's really safe. Apart from not getting exceptions during `WriteAllText`, of course. – Sinatr Jul 01 '15 at 12:26
  • @Sinatr `Flush` doesn't wait for the HDD to physically save the data either (anymore). On modern systems, there simply isn't a way to be sure unless you have a way of checking the file integrity - or relying on file systems that allow atomic file creation (like NTFS). More importantly, it doesn't solve the OP's problem - he doesn't need to ensure the file is properly written, he just needs to know whether the file he's about to process now does or doesn't need to be processed. – Luaan Jul 01 '15 at 12:29
  • @Luaan, did you check the linked question in my comment? – Sinatr Jul 01 '15 at 12:30
  • @Sinatr Yup. `FlushFileBuffers` *still* doesn't ensure the file being physically written. There's no way even for the *OS* to force the HDD to save the internal buffer. True, it forces Windows to *send* the data to the HDD, but the HDD has its own internal caches. The uses for `FlushFileBuffers` are quite niche - and for people who want to feel more in control than they really are :D – Luaan Jul 01 '15 at 12:32
  • @Sinatr Or, to be more exact, it *might* work, or it *might not*. And in any case, you're killing your performance - caching exists for a reason. A nice article on the topic is http://blogs.msdn.com/b/oldnewthing/archive/2010/09/09/10059575.aspx - and I'm pretty sure there are devices that ignore even those messages nowadays (though I can't find the reference to *that*). Of course, if you want to force this in an application like this, you wouldn't use `FlushFileBuffers` - you'd simply disable the buffering with `FILE_FLAG_WRITE_THROUGH`. – Luaan Jul 01 '15 at 12:40
  • @Luaan, *HDD internal cache* is the key. The OS function probably doesn't go that far (and/or some manufacturers will cheat it anyway to get a higher benchmark). You are right. I wouldn't consider performance when data integrity might be of the highest priority, but flushing simply won't work. And a UPS may fail too (it may fail simultaneously with a power failure ^^). The only way left is to use `File.Exists`, but provide another means of data integrity, e.g. [making backups](http://stackoverflow.com/q/7957544/1997232). – Sinatr Jul 01 '15 at 12:45

3 Answers

1

Well, the only way to know for sure is to compare the whole file, byte-by-byte, to the file you're trying to write. This is not exactly cheap, of course - you could have just as easily overwritten the file anyway.
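As a rough illustration of the byte-by-byte comparison described above, here is a minimal sketch. The class and method names (`TriggerFileCheck`, `ContentMatches`) are hypothetical, and it assumes the file was written with `File.WriteAllText(path, text)`, which uses UTF-8 without a byte order mark. Note that a read-back on the same machine may be served from the OS cache, so this checks the logical content, not that the bytes physically reached the disk.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class TriggerFileCheck
{
    // Hypothetical helper: returns true when the file holds exactly
    // the bytes that the given text encodes to.
    public static bool ContentMatches(string path, string expectedText)
    {
        // Assumption: the file was written by File.WriteAllText(path, text),
        // which encodes as UTF-8 without a byte order mark.
        byte[] expected = new UTF8Encoding(false).GetBytes(expectedText);
        byte[] actual = File.ReadAllBytes(path);

        // Compares lengths and then every byte in order.
        return actual.SequenceEqual(expected);
    }
}
```

In the question's code this could be called as `TriggerFileCheck.ContentMatches(fullTriggerFilename, triggerFileWriter.ToString())` before the writer is disposed, in place of the `File.Exists` check.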

On NTFS, files that weren't properly "committed" are basically deleted, so the `File.Exists` check is fine. This may not be the case when using e.g. FAT32, or when saving over a networked file system.

File size might help in that case - unless you pre-allocate the file in advance (which is quite a good practice for performance). Even without pre-allocating, it's quite possible for the file to be sized properly, but still missing data.

Luaan
  • Thanks for your reply. Actually the file is being copied over a networked file system - to another server. Can I rely on File.Exists? – inutan Jul 01 '15 at 12:33
  • @iniki That mainly depends on whether you have control over the remote side as well. If it's a Windows server with NTFS, you should be safe. – Luaan Jul 01 '15 at 12:34
0

You can apply a hash function such as SHA or MD5 to the original file and store the result. Then apply the same hash function to the copied file and compare the two hashes. They must be identical.
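The hash comparison could look something like the sketch below. It picks SHA-256 as one possible choice; the names `FileHashCheck`, `ComputeSha256` and `SameContent` are made up for illustration, and both paths are assumed to be readable from the machine running the check.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class FileHashCheck
{
    // Hypothetical helper: streams the file through SHA-256 and returns the digest.
    public static byte[] ComputeSha256(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return sha.ComputeHash(stream);
        }
    }

    // Returns true when both files produce the same SHA-256 digest.
    public static bool SameContent(string sourcePath, string copiedPath)
    {
        return ComputeSha256(sourcePath).SequenceEqual(ComputeSha256(copiedPath));
    }
}
```

When the content only exists in memory, as in the question, the same idea applies by hashing the bytes of the string instead of a source file and comparing that digest with the one computed from the written file.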

Cheshire Cat
0

You're calling the `File.WriteAllText` method, which either completes the write or throws an exception. So if the call returns normally, the .NET I/O API has already told you that the file was written.

However, you never have a guarantee that the file still exists on some external resource afterwards, so the `File.Exists` call adds nothing. Don't rely on it; anything can happen between the write and the check.
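One way to act on this is to let the outcome of `WriteAllText` itself decide the status. The following is only a sketch: `TriggerFileWriter` and `TryWrite` are hypothetical names, and it treats I/O and permission failures as "leave the item Pending" while letting other exceptions propagate.

```csharp
using System;
using System.IO;

public static class TriggerFileWriter
{
    // Hypothetical helper: returns true only if WriteAllText returned without throwing.
    public static bool TryWrite(string path, string content)
    {
        try
        {
            File.WriteAllText(path, content);
            return true;
        }
        catch (IOException)
        {
            // Covers missing directories, sharing violations, network errors, etc.
            return false;
        }
        catch (UnauthorizedAccessException)
        {
            // No permission, or the path points to a read-only file or a directory.
            return false;
        }
    }
}
```

With this, the document status would be set to Processed when `TryWrite` returns `true`, and left as Pending otherwise so the timer job retries it on the next run.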

astef