Unfortunately, you can't get progress on the compression of each individual file from the zipfile module, but you can get an idea of the total progress by keeping track of how many bytes you've processed so far.
As Mikko Ohtamaa suggested, the easiest way to do this is to walk through the file list twice: first to determine the file sizes, and then to do the compression. However, as Kevin mentioned, the contents of the directory could change between these two passes, so the numbers may be inaccurate.
The program below (written for Python 2.6) illustrates the process.
    #!/usr/bin/env python

    ''' zip all the files in dirname into archive zipname

        Use only the last path component in dirname as the
        archive directory name for all files

        Written by PM 2Ring 2015.02.15

        From http://stackoverflow.com/q/28522669/4014959
    '''

    import sys
    import os
    import zipfile


    def zipdir(zipname, dirname):
        #Get total data size in bytes so we can report on progress
        total = 0
        for root, dirs, files in os.walk(dirname):
            for fname in files:
                path = os.path.join(root, fname)
                total += os.path.getsize(path)

        #Get the archive directory name; normpath drops any trailing
        #separator, which would otherwise make basename() return ''
        basename = os.path.basename(os.path.normpath(dirname))

        z = zipfile.ZipFile(zipname, 'w', zipfile.ZIP_DEFLATED)

        #Current data byte count
        current = 0
        for root, dirs, files in os.walk(dirname):
            for fname in files:
                path = os.path.join(root, fname)
                #Subdirectories are flattened: every file is stored
                #directly under basename in the archive
                arcname = os.path.join(basename, fname)
                #Avoid dividing by zero if there's no data at all
                percent = 100 * current / total if total else 100
                print '%3d%% %s' % (percent, path)
                z.write(path, arcname)
                current += os.path.getsize(path)
        z.close()


    def main():
        if len(sys.argv) < 3:
            print 'Usage: %s zipname dirname' % sys.argv[0]
            sys.exit(1)

        zipname = sys.argv[1]
        dirname = sys.argv[2]
        zipdir(zipname, dirname)


    if __name__ == '__main__':
        main()
Note that I open the zip file with the zipfile.ZIP_DEFLATED compression argument; the default is zipfile.ZIP_STORED, i.e., no compression is performed. Also, zip files can cope with both DOS-style and Unix-style path separators, so you don't need to use backslashes in your archive pathnames, and as my code shows you can just use os.path.join() to construct the archive pathname.
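To see both points in action, here is a small Python 3 sketch (not part of the program above). The helper name archive_size, the sample data and the file names are made up for illustration. It compares the size of an in-memory archive written with each compression constant, then checks how an arcname built with os.path.join() is recorded.

```python
import io
import os
import tempfile
import zipfile


def archive_size(compression):
    """Return the size in bytes of an in-memory archive with one member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', compression) as z:
        # Highly repetitive data, so deflate has something to squeeze
        z.writestr('demo/data.txt', b'spam and eggs\n' * 1000)
    return len(buf.getvalue())


stored = archive_size(zipfile.ZIP_STORED)
deflated = archive_size(zipfile.ZIP_DEFLATED)
print('stored: %d bytes, deflated: %d bytes' % (stored, deflated))

# Build an arcname with os.path.join() and look at what ends up in the archive
with tempfile.TemporaryDirectory() as tmpdir:
    src = os.path.join(tmpdir, 'hello.txt')
    with open(src, 'w') as f:
        f.write('hello\n')

    zpath = os.path.join(tmpdir, 'demo.zip')
    with zipfile.ZipFile(zpath, 'w', zipfile.ZIP_DEFLATED) as z:
        z.write(src, os.path.join('photos', 'hello.txt'))

    with zipfile.ZipFile(zpath) as z:
        names = z.namelist()

# The member name uses forward slashes regardless of the platform separator
print(names)
```

The stored archive should come out slightly larger than the raw data, and the deflated one much smaller. The exact byte counts depend on the zlib build, so none are quoted here.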
BTW, in your code you have str(pic) inside your inner for loop. In general, it's a bit wasteful re-evaluating a function with a constant argument inside a loop. But in this case, it's totally superfluous, since from your first statement it appears that pic is already a string.
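For anyone on Python 3, the same two-pass idea might look something like the sketch below. It is an adaptation, not the original program: the names list_files, zipdir_progress and the report parameter are made up for illustration. It differs from the program above in two ways. It records each file's path and size in a single list during the first pass and reuses that list in the second pass, so the total and the running count always agree with each other (a file deleted in between will still make write() raise). It also keeps the subdirectory structure in the archive names instead of flattening it.

```python
import os
import zipfile


def list_files(dirname):
    """Return a list of (path, size) pairs for every file under dirname."""
    entries = []
    for root, dirs, files in os.walk(dirname):
        for fname in files:
            path = os.path.join(root, fname)
            entries.append((path, os.path.getsize(path)))
    return entries


def zipdir_progress(zipname, dirname, report=print):
    """Zip dirname into zipname, reporting percent of bytes processed.

    report is called with one progress line per file (hypothetical hook;
    it defaults to print). Returns the total number of bytes found.
    """
    # First pass: snapshot of paths and sizes
    entries = list_files(dirname)
    total = sum(size for _, size in entries)

    # Archive names are taken relative to the parent of dirname, so they
    # start with the last path component of dirname
    parent = os.path.dirname(os.path.abspath(dirname))

    done = 0
    with zipfile.ZipFile(zipname, 'w', zipfile.ZIP_DEFLATED) as z:
        # Second pass: compress, using the sizes recorded earlier
        for path, size in entries:
            percent = 100 * done // total if total else 100
            report('%3d%% %s' % (percent, path))
            z.write(path, os.path.relpath(path, parent))
            done += size
    return total
```

A call such as zipdir_progress('photos.zip', 'photos') would print one line per file, with the percentage of bytes already handled before that file. As with the original, this is progress across the whole job, not within an individual file.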