I'm moving a whole bunch of files from one place to another, some of them pretty large .wav files, and changing around the directory structure at the destination, so I couldn't copy directories wholesale. I originally was using the copyFile function recommended here: http://blogs.blumetech.com/blumetechs-tech-blog/2011/05/faster-python-file-copy.html
def copyFile(src, dst, buffer_size=10485760, perserveFileDate=True):
'''
Copies a file to a new location. Much faster performance than Apache Commons due to use of larger buffer
@param src: Source File
@param dst: Destination File (not file path)
@param buffer_size: Buffer size to use during copy
@param perserveFileDate: Preserve the original file date
'''
# Check to make sure destination directory exists. If it doesn't create the directory
dstParent, dstFileName = os.path.split(dst)
if(not(os.path.exists(dstParent))):
os.makedirs(dstParent)
# Optimize the buffer for small files
buffer_size = min(buffer_size,os.path.getsize(src))
if(buffer_size == 0):
buffer_size = 1024
if shutil._samefile(src, dst):
raise shutil.Error("`%s` and `%s` are the same file" % (src, dst))
for fn in [src, dst]:
try:
st = os.stat(fn)
except OSError:
# File most likely does not exist
pass
else:
# XXX What about other special files? (sockets, devices...)
if shutil.stat.S_ISFIFO(st.st_mode):
raise shutil.SpecialFileError("`%s` is a named pipe" % fn)
with open(src, 'rb') as fsrc:
with open(dst, 'wb') as fdst:
shutil.copyfileobj(fsrc, fdst, buffer_size)
if(perserveFileDate):
shutil.copystat(src, dst)
This was still taking forever, so just experimenting, I replaced it with this:
def copyFile(src, dst):
os.system('cp -p %s %s' % (src, dst))
I wound up with about a 10x speedup! The first version was copying at about 22~25 KBps, and the second version is now at about 220 KBps. This is a just fine hack for myself, but I'd like to understand better why incase I need to develop & share code like this in the future.