I've seen questions asked here before about Python and copying files, but I have a different scenario to deal with.

I'm almost done with a Linux distro installer I've been working on, and now all it needs to do is copy the files over to the destination partition. As most distro installers have a progress bar, I was hoping to add one too.

Right now, I'm using PyQt4 and my code looks like this:

self.status('Counting files...')
self.count = int(check_output(['-c', 'find /opt/linux/work/root-image/ -type f | wc -l'], stderr = PIPE, shell = True))

self.status('Copying files...')

i = 0

for root, dirs, files in os.walk('/opt/linux/work/root-image/'):
  for file in files:
    i += 1
    f = os.path.join(root, file)

    try:
      os.system('mkdir -p /tmp/foo' + os.path.split(f)[0])
    except:
      pass

    os.system('cp ' + f + ' /tmp/foo' + f)

    if i % 100 == 0:
      self.emit(SIGNAL('progress(int)'), int(100.0 * float(i) / float(self.count)))

self.status('Done...')

It's quite inefficient, and I suspect the progress bar is the cause. The whole image is 2.1GB, and the script takes far longer to copy the files than a simple cp -r does.

Is there an efficient way to do this? For a single-file copy progress bar, all you do is read small chunks at a time, but I have no idea how to do that for a directory with 91,489 files.

Any help would be appreciated. Thanks!

Blender

1 Answer

You could try using shutil.copy to copy the files instead of calling out to the OS with os.system, which spawns a separate shell process for every call. You can also use os.makedirs to create the directories; like mkdir -p, it creates any missing parent directories, whereas os.mkdir creates only a single level. However, are you sure that it is slow because of the progress bar and not something else?
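
As a rough sketch of that suggestion (not tested against the installer itself), the whole loop could stay in-process with shutil.copy2 and os.makedirs. The names count_files, copy_tree_with_progress, on_progress, and every are made up for illustration; in the PyQt4 code, on_progress would be whatever emits the progress(int) signal. It assumes Python 3 (exist_ok needs 3.2 or later), places files under dst relative to src, and does not handle broken symlinks, which shutil.copy2 will raise on.

```python
import os
import shutil


def count_files(src):
    """Count non-directory entries under src without spawning find/wc."""
    return sum(len(files) for _, _, files in os.walk(src))


def copy_tree_with_progress(src, dst, on_progress, every=100):
    """Copy src into dst in-process, reporting percent done every `every` files.

    on_progress is a hypothetical callback taking an int from 0 to 100.
    """
    total = count_files(src)
    copied = 0
    for root, dirs, files in os.walk(src):
        # Recreate this directory under dst, relative to src.
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            # copy2 also preserves timestamps and permission bits.
            shutil.copy2(os.path.join(root, name),
                         os.path.join(target_dir, name))
            copied += 1
            if copied % every == 0 or copied == total:
                on_progress(int(100.0 * copied / total))
    return copied
```

Something like `copy_tree_with_progress('/opt/linux/work/root-image/', '/tmp/foo', print)` would print the percentage as it goes. This avoids starting two shells per file, which is the more likely cost here than the signal emission.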

Tamás
  • The signal emitting slows things down quite a lot. But I'll try those changes, maybe the processes are really slowing it down this time. – Blender May 23 '11 at 08:25
  • If you have 91489 files and you are firing a signal after every 100 files, then that's only 914 signal emissions or so -- it shouldn't be a problem. Is it much faster if you comment out the ``self.emit`` part in the for loop and copy files one by one using ``os.system`` as above? – Tamás May 23 '11 at 08:27
  • I'm testing it now. It seems to get stuck on broken symlinks (they resolve when the source tree itself is the root directory, but not when viewed from the host system). – Blender May 23 '11 at 08:32
  • Apparently you have to handle such corner cases yourself, i.e. by checking in advance whether the file is a broken symlink (``os.path.exists`` should return ``False``) and then making the symlink manually using ``os.symlink``. – Tamás May 23 '11 at 09:43
  • The symlinks work if the root folder `/` *is* the folder I'm copying from, as the folder I'm copying from is a Linux FS. Python doesn't like the broken symlinks, so I guess I'll have to manually `cp` them... – Blender May 23 '11 at 15:27
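
The manual symlink handling discussed in these comments might look roughly like the sketch below. The helper name copy_entry is hypothetical. It checks os.path.islink rather than os.path.exists, so it recreates every symlink (broken or not) with its stored target instead of following it, which is an assumption about what an installer copying a Linux filesystem would want.

```python
import os
import shutil


def copy_entry(src_path, dst_path):
    """Copy one file; recreate symlinks instead of following them."""
    if os.path.islink(src_path):
        # os.readlink returns the stored target, whether or not it resolves.
        target = os.readlink(src_path)
        # lexists is True even for a broken link already at dst_path.
        if os.path.lexists(dst_path):
            os.remove(dst_path)
        os.symlink(target, dst_path)
        return 'symlink'
    shutil.copy2(src_path, dst_path)
    return 'file'
```

On Python 3.3 and later, `shutil.copy2(src, dst, follow_symlinks=False)` gives similar behaviour for links without a separate branch.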