
Is there an open-source project or best-practices guide that shows the fastest way to copy files on a local machine and across a LAN, SAN, or WAN, and that can rival the speed of the built-in xcopy of Windows 7 (or 8) or of a Windows Explorer copy?

To be blunt, not all file I/O is created equal. Different protocols and techniques carry different overheads. Some libraries don't take advantage of asynchronous operations or of the line speed of the hardware.

I'm taking inventory of the large data transfers we use and trying to rate the effectiveness of our client applications and of the applications from external vendors. Certain server applications are the worst offenders (Java-based ones being the worst of the worst).

I'm limiting the scope of this research to SMB 2 and 3 (CIFS on Windows 7 and 8).

  • Is there a speed disadvantage to using the C standard I/O functions (fopen, fread, fseek, etc.)?
  • Is there any advantage to using Win32 calls (CopyFile2, ReadFileEx)?
Ben L
    It looks as if xcopy uses an undocumented API function, PrivCopyFileExW. However, the documented CopyFileEx probably performs very similarly. – Harry Johnston Jul 17 '12 at 22:32

1 Answer


xcopy is actually not the fastest way to copy files, especially across disks or over a local network. A commercial product called TeraCopy is much faster. It is closed-source, so I don't know exactly how it works, but one of the main differences is that, instead of a single loop that reads a chunk of data into a memory buffer and then writes that buffer to the new location, it uses two threads and a producer/consumer queue.

The producer reads chunks of the source file and puts them into a queue. The consumer takes chunks from the queue and writes them to the target. The advantage is that reading and writing can happen concurrently. You do need to be careful, though: the producer should watch the queue size and stop reading while the queue is full, so that it does not use too much memory. Reading is usually faster than writing, but that also depends on the source and destination locations.
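To make the idea concrete, here is a minimal Python sketch of a two-thread copy with a bounded queue. It is not TeraCopy's implementation (that is closed-source). The function name `threaded_copy`, the 1 MiB chunk size, and the queue depth of 8 are all assumptions chosen for illustration, and the best values will depend on the source and destination media. The bounded `queue.Queue` handles the memory concern: `put()` blocks the producer whenever the queue is full.

```python
import queue
import threading

CHUNK_SIZE = 1024 * 1024  # assumed 1 MiB per read; tune for the medium
MAX_QUEUED_CHUNKS = 8     # caps buffered data at about CHUNK_SIZE * MAX_QUEUED_CHUNKS


def threaded_copy(src_path, dst_path, chunk_size=CHUNK_SIZE, max_chunks=MAX_QUEUED_CHUNKS):
    """Copy src_path to dst_path using one reader thread and one writer thread."""
    chunks = queue.Queue(maxsize=max_chunks)  # put() blocks while the queue is full
    errors = []

    def producer():
        try:
            with open(src_path, "rb") as src:
                while True:
                    data = src.read(chunk_size)
                    if not data:  # end of file
                        break
                    chunks.put(data)
        except Exception as exc:
            errors.append(exc)
        finally:
            chunks.put(None)  # sentinel: tells the consumer there is no more data

    def consumer():
        saw_sentinel = False
        try:
            with open(dst_path, "wb") as dst:
                while True:
                    data = chunks.get()
                    if data is None:
                        saw_sentinel = True
                        break
                    dst.write(data)
        except Exception as exc:
            errors.append(exc)
        finally:
            # After a write failure, keep draining so the producer never
            # stays blocked on a full queue.
            while not saw_sentinel:
                saw_sentinel = chunks.get() is None

    reader = threading.Thread(target=producer)
    writer = threading.Thread(target=consumer)
    reader.start()
    writer.start()
    reader.join()
    writer.join()

    if errors:
        raise errors[0]
```

Calling `threaded_copy("source.bin", "target.bin")` (hypothetical file names) copies the file. Whether this beats a plain read/write loop depends on the devices involved, so measure it against your own SMB shares before drawing conclusions.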

Samuel Neff