
I have a little script that moves files around in my photo collection, but it runs a bit slow.

I think it's because I'm doing one file move at a time. I'm guessing I can speed this up if I do all file moves from one dir to another at the same time. Is there a way to do that?

If that's not the reason for my slowness, how else can I speed this up?

Update:

I don't think my problem is being understood. Perhaps listing my source code will help explain:

# ORF is the file extension of the files I want to move;
# These files live in dirs shared by JPEG files,
# which I do not want to move.
import os
import re
from glob import glob
import shutil

DIGITAL_NEGATIVES_DIR = ...
DATE_PATTERN = re.compile(r'\d{4}-\d\d-\d\d')

# Move a single ORF into the matching directory under 'raw'.
def move_orf(src):
    src_dir = os.path.dirname(src)
    shutil.move(src, os.path.join('raw', src_dir))

# Move all ORFs in a single directory.
def move_orfs_from_dir(src):
    orfs = glob(os.path.join(src, '*.ORF'))
    if not orfs:
        return
    os.mkdir(os.path.join('raw', src))
    print('Moving %3d ORF files from %s to raw dir.' % (len(orfs), src))
    for orf in orfs:
        move_orf(orf)

# Scan for dirs that contain ORFs that need to be moved, and move them.
def main():
    os.chdir(DIGITAL_NEGATIVES_DIR)
    src_dirs = filter(DATE_PATTERN.match, os.listdir(os.curdir))
    for src_dir in src_dirs:
        move_orfs_from_dir(src_dir)

if __name__ == '__main__':
    main()
allyourcode

3 Answers

What platform are you on? And does it really have to be Python? If not, you can simply use system tools like mv (*nix) or move (Windows).

$ stat -c "%s" file
382849574

$ time python -c 'import shutil;shutil.move("file","/tmp")'

real    0m29.698s
user    0m0.349s 
sys     0m1.862s 

$ time mv file /tmp

real    0m29.149s
user    0m0.011s 
sys     0m1.607s 

$ time python -c 'import shutil;shutil.move("file","/tmp")'

real    0m30.349s
user    0m0.349s 
sys     0m2.015s 

$ time mv file /tmp

real    0m28.292s
user    0m0.015s 
sys     0m1.702s 

$ cat test.py
#!/usr/bin/env python
import shutil
shutil.move("file","/tmp")
shutil.move("/tmp/file",".")

$ cat test.sh
#!/bin/bash
mv file /tmp
mv /tmp/file .

$ time python test.py

real    1m1.175s
user    0m0.641s
sys     0m4.110s

$ time bash test.sh

real    1m1.040s
user    0m0.026s
sys     0m3.242s

$ time python test.py

real    1m3.348s
user    0m0.659s
sys     0m4.024s

$ time bash test.sh

real    1m1.740s
user    0m0.017s
sys     0m3.276s
ghostdog74
  • There's no particular reason that would be any faster than doing it in Python; it's generally going to be I/O-bound. – Glenn Maynard Oct 09 '10 at 08:48
  • In all the tests on my Linux box, using system `mv` is faster than Python `shutil.move`. – ghostdog74 Oct 09 '10 at 09:35
  • @user131527: It sounds like he has a script that's locating particular files and moving them. In that case (since he's already in Python), `shutil.move(stuff)` is cleaner and safer to write than `os.system('mv stuff')`. Once you're already running the Python interpreter, the difference is moot, since shutil.move just calls the system's move. – JoshD Oct 09 '10 at 20:57
  • That's why I asked whether using Python is a definite must, right? If not, use the shell's mv command instead of Python. – ghostdog74 Oct 10 '10 at 01:57
  • I'm sure this can be done in shell (I assume it's Turing complete), but I see no reason why this should be slow in Python. – allyourcode Jun 12 '11 at 23:04

Edit:

In my own state of confusion (which JoshD helpfully remedied), I forgot that shutil.move accepts directories, so you can (and should) just use that to move your directory as a batch.
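
To illustrate the batch move described above, here is a minimal sketch. It assumes Python 3 and that everything in the directory should move together (with the question's layout, the JPEGs would go along too). The helper name `move_directory` and the demo paths are made up for illustration and are not part of the original script.

```python
import os
import shutil
import tempfile


def move_directory(src_dir, dest_parent):
    """Move src_dir (and everything in it) under dest_parent in one call."""
    # Hypothetical helper: create the destination parent if it is missing.
    os.makedirs(dest_parent, exist_ok=True)
    # When dest_parent is an existing directory, shutil.move places
    # src_dir inside it and returns the new path.
    return shutil.move(src_dir, dest_parent)


if __name__ == "__main__":
    # Throwaway demo tree; the names below are made up for illustration.
    root = tempfile.mkdtemp()
    src = os.path.join(root, "2010-10-09")
    os.mkdir(src)
    for name in ("a.ORF", "b.ORF"):
        open(os.path.join(src, name), "w").close()

    new_path = move_directory(src, os.path.join(root, "raw"))
    print(new_path, sorted(os.listdir(new_path)))
    shutil.rmtree(root)
```

If only some files in a directory should move, a per-file loop like the one in the question is still needed.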

Jim Brissom
  • I think he wants to move rather than copy... maybe. In that case a simple move is **much** faster than a copy then delete. – JoshD Oct 09 '10 at 07:35
  • Technically, 'copy' should be faster than 'move', since move does 'copy+delete'. That's one way to speed up your program. – Srikar Appalaraju Oct 09 '10 at 07:41
  • @movieyoda: I take it you've not moved 20GB directories then copied the same 20GB directory, have you? Move (on the same disk) is simply a rename. – JoshD Oct 09 '10 at 07:44

If you just want to move the directory, you can use shutil.move. It'll be pretty freakin' quick (if it's on the same filesystem) because it's just a rename operation.
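
Whether the move really is a cheap rename depends on both paths living on the same filesystem. One rough way to check this is to compare the device IDs reported by `os.stat`. This is a sketch that assumes Python 3 and POSIX-style device IDs; `same_filesystem` is a hypothetical helper name, and the comparison is a heuristic rather than a guarantee.

```python
import os
import tempfile


def same_filesystem(path_a, path_b):
    """Return True if both existing paths report the same device ID."""
    # st_dev identifies the device a path resides on; equal values
    # suggest a move between the two paths can be a plain rename.
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Demo with a temporary directory and a subdirectory inside it.
    with tempfile.TemporaryDirectory() as root:
        sub = os.path.join(root, "raw")
        os.mkdir(sub)
        print(same_filesystem(root, sub))
```

Both paths must already exist, so for a destination that has not been created yet, check its parent directory instead.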

JoshD
  • On the same filesystem, to be exact. – AndiDog Oct 09 '10 at 19:33
  • Oh and by the way, `shutil.move` does `try: os.rename(...) except OSError: ...copy and delete...` automatically, so there's no reason for using `os.rename` in 99% of the cases. – AndiDog Oct 09 '10 at 19:37
  • @AndiDog: Thanks for clarifying those details. I'll update the answer with the more accurate information. – JoshD Oct 09 '10 at 20:51
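
The rename-then-fallback behaviour AndiDog describes can be sketched roughly as follows. This is a simplified illustration for single files, assuming Python 3. It is not the actual `shutil` source, which also deals with directories and other edge cases, and `move_file_with_fallback` is a hypothetical name.

```python
import os
import shutil
import tempfile


def move_file_with_fallback(src, dst):
    """Try a cheap rename first; copy and delete only if that fails."""
    try:
        # Fast path: a rename only touches directory entries.
        os.rename(src, dst)
    except OSError:
        # Typically reached when src and dst are on different filesystems:
        # copy the data and metadata, then remove the original.
        shutil.copy2(src, dst)
        os.unlink(src)
    return dst


if __name__ == "__main__":
    # Demo inside a temporary directory; file names are illustrative.
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "photo.ORF")
        with open(src, "w") as handle:
            handle.write("raw data")
        moved = move_file_with_fallback(src, os.path.join(root, "moved.ORF"))
        print(os.path.exists(src), os.path.exists(moved))
```

In practice, calling `shutil.move` directly gives the same pattern without hand-rolling it.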