
Is there a way in .NET to back up a directory containing multiple subdirectories, each holding potentially 10,000 or more files of roughly 100 KB–500 KB each, without enumerating every file? The use case here is incrementally backing up files to USB storage and a NAS, but due to the file count it can take a really long time. I'm familiar with VSS and have created some custom backup applications utilizing it, but I was wondering if there is a way to snapshot a volume containing these files and save just the snapshot, without having to expose the snapshot as a mounted image and copy each file. The end game is to shorten the amount of time the copy operation takes.

Smitty
  • 1,765
  • 15
  • 22
  • Did you check if performance of `ZipFile.CreateFromDirectory` is acceptable? – Eugene Komisarenko Apr 21 '17 at 20:53
  • @EugeneKomisarenko - I had thought about that actually and am still mulling it over. I could archive everything to the zip and even append to it from what I understand. I'd have to re-create the archive at some frequency to make sure I had the latest versions of altered files. I left out in my question that this would likely take the form of an incremental backup. – Smitty Apr 21 '17 at 20:55
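
For reference, a minimal sketch of the ZipFile suggestion from the comments above, assuming .NET 4.5+ with a reference to System.IO.Compression.FileSystem; the paths are placeholders:

using System.IO.Compression;

class ZipBackup
{
    static void Main()
    {
        // One-shot archive of the whole tree. Throws if the zip already
        // exists, so delete and re-create it to pick up altered files.
        ZipFile.CreateFromDirectory(@"C:\Data", @"E:\Backup\data.zip",
            CompressionLevel.Fastest, includeBaseDirectory: false);
    }
}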

3 Answers


An MFT (master file table) record is only 1 KB (1,024 bytes), so retrieving a file's name, attributes, etc. takes a tiny amount of I/O compared to the reads required to move the file data itself. So enumerating the directory records is not the problem, and eliminating that part of the processing won't cause more than a tiny flicker on your speedometer, so to speak.
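
As a quick way to confirm this on your own data, here is a small timing sketch (an illustration of mine, not part of the original answer; the root path is a placeholder):

using System;
using System.Diagnostics;
using System.IO;

class EnumerationTimer
{
    static void Main()
    {
        string root = @"C:\Data"; // placeholder root directory
        var sw = Stopwatch.StartNew();
        long count = 0, bytes = 0;

        // EnumerateFiles streams directory records lazily rather than
        // buffering the whole listing, so memory stays flat.
        foreach (string file in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            count++;
            bytes += new FileInfo(file).Length;
        }

        sw.Stop();
        Console.WriteLine($"Enumerated {count} files ({bytes / (1024 * 1024)} MB) in {sw.Elapsed}");
    }
}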

I suppose you could try a low-level, sector-by-sector "clone" (not copy) of the disk itself, using cloning software. This may be a bit faster because the disk head won't have to zigzag around. On the other hand, this option will buy you nearly nothing if your drive is solid state, since random access is as fast as sequential access.

Another option is to back up only the files that have changed, by inspecting each file's last-modified timestamp. An easy way to do this is the XCOPY command with the /D switch (with no argument), which compares each source file's date/time against the copy from the last backup and only performs the copy if the file on your hard disk is newer than the file on the external drive. A .NET sketch of the same check follows.
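
If you'd rather keep the check in .NET instead of shelling out to XCOPY, here is a minimal sketch of that last-modified comparison. It assumes a modern .NET (Core 2.0 or later) for Path.GetRelativePath, and the source/target paths are placeholders:

using System;
using System.IO;

class IncrementalCopy
{
    static void Main()
    {
        const string source = @"C:\Data";   // placeholder source root
        const string target = @"E:\Backup"; // placeholder backup root

        foreach (string src in Directory.EnumerateFiles(source, "*", SearchOption.AllDirectories))
        {
            string dst = Path.Combine(target, Path.GetRelativePath(source, src));

            // Copy only when the backup is missing or older, like XCOPY /D.
            if (!File.Exists(dst) ||
                File.GetLastWriteTimeUtc(src) > File.GetLastWriteTimeUtc(dst))
            {
                Directory.CreateDirectory(Path.GetDirectoryName(dst));
                File.Copy(src, dst, overwrite: true);
            }
        }
    }
}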

John Wu
  • 50,556
  • 8
  • 44
  • 80
  • So it's really the copy operation that is taking time, not the enumeration. Thanks for that clarification - I had heard of xcopy and robocopy, but had developed something using FileSystemWatcher to catch change events in a directory and its subdirectories. I'll see if I can not reinvent the wheel. – Smitty Apr 21 '17 at 21:00

You can launch a separate process from .NET and use RoboCopy to move the files across. It has a lot of parameters, such as the number of threads to use (/MT) or timestamp checking.

using System;
using System.Diagnostics;

public class Backup
{
    public static void Main()
    {
        Process myProcess = new Process();

        try
        {
            myProcess.StartInfo.UseShellExecute = false;
            // The executable and its arguments must be set separately;
            // cramming both into FileName makes Start() fail.
            myProcess.StartInfo.FileName = "C:\\Windows\\System32\\robocopy.exe";
            // /MT:8 uses 8 copy threads; Source and Destination are placeholders.
            myProcess.StartInfo.Arguments = "Source Destination /MT:8";
            myProcess.StartInfo.CreateNoWindow = true;
            myProcess.Start();
            myProcess.WaitForExit();
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
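
One caveat worth adding: robocopy's exit codes are bit flags. An exit code of 1 simply means files were copied, and only values of 8 or higher indicate failures, so don't treat a nonzero ExitCode as an error.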
loneshark99
  • 706
  • 5
  • 16

The Windows Server Backup feature might be a good alternative to custom-built solutions. If you are running Windows Server 2016, you may find this article explaining the new features interesting too.

Eugene Komisarenko
  • 1,533
  • 11
  • 25
  • I'm going to seriously look at this. It's block level apparently so that's better than individual file level ops. – Smitty Apr 21 '17 at 21:58
  • It looks like I'm going to go this route. I've found that I can write a little .NET app to invoke some of the [Windows Server Backup PowerShell](https://msdn.microsoft.com/en-us/library/gg241214(v=vs.85).aspx) commands that can run scheduled and 'one-time' backups. The reason this is an attractive solution is that Windows Server Backup utilizes Volume Shadow Copy, which is a good fit since some files will definitely be read-locked. Plus, I'm only having to deal with a VHD image rather than individual files being copied from an exposed snapshot. – Smitty Apr 23 '17 at 18:18
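
For anyone taking the same route, below is a rough, untested sketch of launching a one-time backup through the Windows Server Backup cmdlets from a .NET app. It assumes the Windows Server Backup feature and its PowerShell module are installed; the volume letters are placeholders:

using System;
using System.Diagnostics;

class WsbBackup
{
    static void Main()
    {
        // Build a one-time backup policy for C: targeting D:, then run it.
        string script =
            "$policy = New-WBPolicy; " +
            "Add-WBVolume -Policy $policy -Volume (Get-WBVolume -VolumePath 'C:'); " +
            "Add-WBBackupTarget -Policy $policy -Target (New-WBBackupTarget -VolumePath 'D:'); " +
            "Start-WBBackup -Policy $policy";

        var psi = new ProcessStartInfo
        {
            FileName = "powershell.exe",
            Arguments = "-NoProfile -Command \"" + script + "\"",
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process proc = Process.Start(psi))
        {
            proc.WaitForExit();
            Console.WriteLine("Backup finished with exit code " + proc.ExitCode);
        }
    }
}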