I copied this script from a book to make tar.bz2 archives of some folders for backup.

#!/usr/bin/env python
import os
import tarfile

def make_tar(folder_to_backup, dest_folder, compression='bz2'):
    # Derive the file extension and the tarfile mode suffix from the compression name.
    if compression:
        dest_ext = '.' + compression
        dest_cmp = ':' + compression
    else:
        dest_ext = ''
        dest_cmp = ''
    arcname = os.path.basename(folder_to_backup)
    dest_name = '%s.tar%s' % (arcname, dest_ext)
    dest_path = os.path.join(dest_folder, dest_name)

    # 'w:bz2' or 'w:gz' writes a compressed archive; plain 'w' writes an uncompressed one.
    out = tarfile.open(dest_path, 'w' + dest_cmp)
    out.add(folder_to_backup, arcname)
    out.close()
    return dest_path

print("Doing Python")
make_tar('/home/bob/public_html', '/home/bob/testbck', compression='bz2')

Bash takes about 40 seconds to back up that folder, while Python takes around 8 minutes.

Am I doing something wrong, or is Python always slower for these tasks?

Mirage

1 Answer

I copied and pasted your code and tried it with both bz2 and gz, against tar cjpf and tar czpf respectively, and found them to perform the same. Which version of Python are you using? How many files are there in /home/bob/public_html? Did you run the tar command first and then your script, or the other way around? (I'm guessing file caches may skew the results a bit, but not by that much.)
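If you want to repeat that kind of check on the Python side, here is a minimal sketch that times the same archiving routine with gz and bz2 on one folder, so the two numbers are compared like for like. The helper name `time_compressions` and the sample data are made up for illustration, and the sketch assumes Python 3; the absolute timings will depend on your machine and files.

```python
import os
import tarfile
import tempfile
import time


def make_tar(folder_to_backup, dest_folder, compression="bz2"):
    """Archive a folder, following the same steps as the script in the question."""
    ext = "." + compression if compression else ""
    arcname = os.path.basename(folder_to_backup)
    dest_path = os.path.join(dest_folder, "%s.tar%s" % (arcname, ext))
    mode = "w:" + compression if compression else "w"
    with tarfile.open(dest_path, mode) as out:
        out.add(folder_to_backup, arcname)
    return dest_path


def time_compressions(folder, compressions=("gz", "bz2")):
    """Return {compression: (elapsed seconds, archive size in bytes)}."""
    results = {}
    with tempfile.TemporaryDirectory() as dest:
        for comp in compressions:
            start = time.perf_counter()
            path = make_tar(folder, dest, compression=comp)
            elapsed = time.perf_counter() - start
            results[comp] = (elapsed, os.path.getsize(path))
    return results


if __name__ == "__main__":
    # Hypothetical sample data: a throwaway folder with a few text files.
    with tempfile.TemporaryDirectory() as src:
        for i in range(20):
            with open(os.path.join(src, "file%d.txt" % i), "w") as f:
                f.write("some sample text\n" * 5000)
        for comp, (secs, size) in time_compressions(src).items():
            print("%-4s %.3f s  %d bytes" % (comp, secs, size))
```

To test a real folder, pass its path to `time_compressions` instead of the sample directory, and compare each result with the matching tar invocation (czpf for gz, cjpf for bz2).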

I just took a look at TarFile's implementation. It's easy with ipython, by the way:

import tarfile
%edit tarfile.TarFile.add
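If you don't have ipython at hand, the standard library can show the same source. A small sketch, assuming a regular CPython install where the .py source files are available:

```python
import inspect
import tarfile

# Fetch the source text of TarFile.add without opening an editor.
source = inspect.getsource(tarfile.TarFile.add)
print(source)
```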

And this is the case for directories:

    elif tarinfo.isdir():
        self.addfile(tarinfo)
        if recursive:
            for f in os.listdir(name):
                self.add(os.path.join(name, f), os.path.join(arcname, f), recursive, exclude)

I can see this getting slower as the total number of files increases. I'm guessing tar may be better optimized for this case. It's just a guess, though.
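To get a feel for how many times that recursion runs on a given folder, here is a rough sketch that walks a tree the same way (os.listdir plus a recursive call) and counts the entries. `count_entries` and `build_sample_tree` are hypothetical helpers written for this illustration, not part of tarfile, and the sketch only counts calls; it says nothing about how long each call takes.

```python
import os
import tarfile
import tempfile


def count_entries(path):
    """Count path itself plus everything below it, one per add() call."""
    total = 1  # the entry for path itself
    # Symlinks are archived as links, so they are not descended into.
    if os.path.isdir(path) and not os.path.islink(path):
        for name in os.listdir(path):
            total += count_entries(os.path.join(path, name))
    return total


def build_sample_tree(root, dirs=3, files_per_dir=4):
    """Create a small made-up tree: `dirs` folders with a few files each."""
    for d in range(dirs):
        sub = os.path.join(root, "dir%d" % d)
        os.mkdir(sub)
        for f in range(files_per_dir):
            with open(os.path.join(sub, "f%d.txt" % f), "w") as fh:
                fh.write("x")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root, \
            tempfile.TemporaryDirectory() as dest:
        build_sample_tree(root)
        archive = os.path.join(dest, "sample.tar")
        with tarfile.open(archive, "w") as out:
            out.add(root, "sample")
        with tarfile.open(archive) as tar:
            # Both numbers should match: one archive member per visited entry.
            print(count_entries(root), len(tar.getmembers()))
```

Running `count_entries` on the folder you are backing up gives the number of entries the archive will contain, which is one way to answer the "how many files" question above.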

Eduardo Ivanec
  • It's working now. Actually, I was comparing Linux tar.gz with Python tar.bz2, which took different times. Now I have changed to gz and it's almost the same. That was my first Python script. Very happy now. – Mirage May 16 '11 at 12:37
  • Ah, that explains it of course! I should have thought of that possibility. Have fun with Python and take a look at `ipython`; it lets you try your code on the go much more easily than the standard interpreter. You can also call it from your scripts to "debug" interactively. – Eduardo Ivanec May 16 '11 at 12:43
  • I am using the Linux terminal, writing .py files and then executing them. How can ipython help me? Does it have more commands, and how will my .py script change if I want to use ipython? – Mirage May 16 '11 at 16:53
  • No new commands or anything - it's just a way to execute Python interactively, allowing greater flexibility when developing. No need to use it for your script at all; it was more of a general recommendation. – Eduardo Ivanec May 16 '11 at 17:15