While writing a backup application, I benchmarked file copying performance on Windows.
I have several questions and would appreciate your opinions.
Thank you!
Lucas.
Questions:
Why is the throughput so much lower when copying the 10 GiB file than when copying the 1 GiB file?
Why is shutil.copyfile so slow?
Why is win32file.CopyFileEx so slow? Could the flag win32file.COPY_FILE_RESTARTABLE be the cause? I also tried COPY_FILE_NO_BUFFERING, which is recommended for large files, but the int 1000 is not accepted as a flag: http://msdn.microsoft.com/en-us/library/aa363852%28VS.85%29.aspx
Using an empty ProgressRoutine seems to make no difference compared with using no ProgressRoutine at all.
Is there an alternative, better-performing way of copying the files that still provides progress updates?
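One candidate for the last question, not benchmarked here, is a plain read/write loop with a larger buffer. In Python 2.7, shutil.copyfile goes through shutil.copyfileobj, whose default buffer is 16 KiB, which may be part of the reason it is slow over a network share. The sketch below is only an illustration: the helper name, the callback signature and the 1 MiB default are assumptions, not anything from pywin32 or shutil.

```python
import os


def copy_with_progress(src, dst, on_progress=None, buffer_size=1024 * 1024):
    """Copy src to dst in buffer_size chunks (hypothetical helper).

    on_progress, if given, is called as on_progress(bytes_copied, total_bytes)
    after every chunk that has been written.
    """
    total = os.path.getsize(src)
    copied = 0
    with open(src, "rb") as fsrc:
        with open(dst, "wb") as fdst:
            while True:
                chunk = fsrc.read(buffer_size)
                if not chunk:
                    break  # end of file reached
                fdst.write(chunk)
                copied += len(chunk)
                if on_progress is not None:
                    on_progress(copied, total)
    return copied
```

Unlike CopyFileEx, this copies file contents only (no attributes or alternate data streams), so it would need to be measured against the numbers below before drawing conclusions.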
Results for a 1 GiB and a 10 GiB file:
test_file_size            1082.1 MiB     10216.7 MiB
METHOD                    SPEED          SPEED
robocopy.exe              111.0 MiB/s    75.4 MiB/s
cmd.exe /c copy           95.5 MiB/s     60.5 MiB/s
shutil.copyfile           51.0 MiB/s     29.4 MiB/s
win32api.CopyFile         104.8 MiB/s    74.2 MiB/s
win32file.CopyFile        108.2 MiB/s    73.4 MiB/s
win32file.CopyFileEx A    14.0 MiB/s     13.8 MiB/s
win32file.CopyFileEx B    14.6 MiB/s     14.9 MiB/s
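The timing code itself is not shown in this post. As a rough sketch of how MiB/s figures like these could be computed for any copy function taking (source, destination), assuming a hypothetical helper name and a single timed run per method:

```python
from timeit import default_timer


def measure_mib_per_s(copy_func, src, dst, size_bytes):
    """Time one call of copy_func(src, dst) and return throughput in MiB/s.

    Hypothetical helper; size_bytes is the size of the file being copied.
    """
    start = default_timer()
    copy_func(src, dst)
    elapsed = default_timer() - start
    if elapsed <= 0:
        return float("inf")  # copy finished below the timer resolution
    return (size_bytes / (1024.0 * 1024.0)) / elapsed
```

For example, measure_mib_per_s(shutil.copyfile, src, dst, os.path.getsize(src)) would time shutil.copyfile. A single run can be distorted by file system caching, so repeated runs are advisable.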
Test Environment:
Python:
ActivePython 2.7.0.2 (ActiveState Software Inc.) based on
Python 2.7 (r27:82500, Aug 23 2010, 17:17:51) [MSC v.1500 64 bit (AMD64)] on win32
source = mounted network drive
source_os = Windows Server 2008 x64
destination = local drive
destination_os = Windows Server 2008 R2 x64
Notes:
'robocopy.exe' and 'cmd.exe /c copy' were run using subprocess.call()
win32file.CopyFileEx A (using no ProgressRoutine):
import win32file

def Win32_CopyFileEx_NoProgress(ExistingFileName, NewFileName):
    win32file.CopyFileEx(
        ExistingFileName,  # PyUNICODE | File to be copied
        NewFileName,       # PyUNICODE | Place to which it will be copied
        None,              # CopyProgressRoutine | A python function that receives progress updates, can be None
        Data=None,         # object | An arbitrary object to be passed to the callback function
        Cancel=False,      # boolean | Pass True to cancel a restartable copy that was previously interrupted
        CopyFlags=win32file.COPY_FILE_RESTARTABLE,  # int | Combination of COPY_FILE_* flags
        Transaction=None   # PyHANDLE | Handle to a transaction as returned by win32transaction::CreateTransaction
    )
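On the flag value: the MSDN page lists COPY_FILE_NO_BUFFERING as 0x00001000, which is hexadecimal (4096 in decimal), so the decimal int 1000 is a different bit pattern and does not include that flag at all. A minimal sketch of combining the flags follows. The constants are written out as literals because it is an assumption whether a given pywin32 build exposes COPY_FILE_NO_BUFFERING by name, and the helper is hypothetical.

```python
# Values as listed on the MSDN CopyFileEx page (hexadecimal).
COPY_FILE_RESTARTABLE = 0x00000002
COPY_FILE_NO_BUFFERING = 0x00001000  # 4096 in decimal, not 1000


def build_copy_flags(restartable=False, no_buffering=True):
    """Combine COPY_FILE_* bits with bitwise OR (hypothetical helper)."""
    flags = 0
    if restartable:
        flags |= COPY_FILE_RESTARTABLE
    if no_buffering:
        flags |= COPY_FILE_NO_BUFFERING
    return flags
```

The result would be passed as CopyFlags. Whether dropping COPY_FILE_RESTARTABLE or adding COPY_FILE_NO_BUFFERING changes the throughput is something only a new measurement can show.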
win32file.CopyFileEx B (using empty ProgressRoutine):
def Win32_CopyFileEx(ExistingFileName, NewFileName):
    win32file.CopyFileEx(
        ExistingFileName,  # PyUNICODE | File to be copied
        NewFileName,       # PyUNICODE | Place to which it will be copied
        Win32_CopyFileEx_ProgressRoutine,  # CopyProgressRoutine | A python function that receives progress updates, can be None
        Data=None,         # object | An arbitrary object to be passed to the callback function
        Cancel=False,      # boolean | Pass True to cancel a restartable copy that was previously interrupted
        CopyFlags=win32file.COPY_FILE_RESTARTABLE,  # int | Combination of COPY_FILE_* flags
        Transaction=None   # PyHANDLE | Handle to a transaction as returned by win32transaction::CreateTransaction
    )

def Win32_CopyFileEx_ProgressRoutine(
        TotalFileSize,
        TotalBytesTransferred,
        StreamSize,
        StreamBytesTransferred,
        StreamNumber,
        CallbackReason,  # CALLBACK_CHUNK_FINISHED or CALLBACK_STREAM_SWITCH
        SourceFile,
        DestinationFile,
        Data):           # the Data object passed to CopyFileEx
    return win32file.PROGRESS_CONTINUE  # any win32file.PROGRESS_* constant may be returned
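For comparison with the empty routine above, a non-empty ProgressRoutine might look like the following sketch. The function name and the output format are hypothetical, and the fallback value 0 for PROGRESS_CONTINUE (its documented Win32 value) is there only so the snippet can be read and exercised without pywin32 installed.

```python
import sys

try:
    import win32file
    PROGRESS_CONTINUE = win32file.PROGRESS_CONTINUE
except ImportError:
    PROGRESS_CONTINUE = 0  # assumption: fallback for machines without pywin32


def percent_progress_routine(TotalFileSize, TotalBytesTransferred, StreamSize,
                             StreamBytesTransferred, StreamNumber,
                             CallbackReason, SourceFile, DestinationFile, Data):
    """Print the percentage copied so far and tell CopyFileEx to continue."""
    if TotalFileSize:  # avoid dividing by zero for empty files
        percent = 100.0 * TotalBytesTransferred / TotalFileSize
        sys.stdout.write("\r%5.1f%%" % percent)
        sys.stdout.flush()
    return PROGRESS_CONTINUE
```

It would be passed in place of Win32_CopyFileEx_ProgressRoutine. Since the empty routine made no measurable difference, the callback itself is probably not what limits CopyFileEx here, though that is a guess rather than a measurement.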