5

Purpose

Split a zip archive into smaller zip archives, with the files distributed evenly across the new archives.

Example

source zip (100 files):

  • src/100-Test.zip

destination zips (25 files each):

  • destination/1.zip
  • destination/2.zip
  • destination/3.zip
  • destination/4.zip

Description

I have been able to open the zip file and iterate through its contents to split them up, but I have not been able to write them to a new zip file. Since I am not modifying the zip contents, I didn't think I needed StringIO or anything similar.

Code

zipFileNameSrc = '100-Test.zip'
zipFile = open(zipFileNameSrc)
unzippedFile = zipfile.ZipFile(zipFile)
imgList = [(s, unzippedFile.read(s)) for s in unzippedFile.namelist() if (".jpg" or ".JPG") in s]
#image names: imgList[i][0]  and  images: imgList[i][1]

#...
#...additional logic to split into sets of 25 images
#...fileTuplesList = imgList[:25]
zipNo = 1
#zipFileDest = destination + "/" + zipSrcNm + "/" + zipNo.__str__() + ".zip"
zipFileName = zipNo.__str__() + ".zip"
zipOut = zipfile.ZipFile(zipFileName, 'w')
for i in xrange(len(fileTuplesList)):
    fileNameAndPath = fileTuplesList[i][0]
    actualFile = fileTuplesList[i][1]
    zipOut.write(fileNameAndPath, actualFile)
zipOut.close()
#move_files(zipFileName, zipFileDest)

Error

I get this on the line zipOut.write(fileNameAndPath, actualFile):

OSError: [Errno 2] No such file or directory: '100-Test/17.jpg'

Bonus

How to save the zip file to a different folder than where my script is?

alfredox

2 Answers

3

ZipFile.write() expects a filename as its first argument, and that file must exist on the filesystem. If it does, that file is copied into the zip archive.

What you actually want is ZipFile.writestr(): it expects the archive name as its first argument and the data as its second.
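To make the difference concrete, here is a minimal sketch that builds an archive in memory. The entry name and the payload bytes are placeholders chosen for illustration, and it assumes no file called 100-Test/17.jpg exists in the working directory.

```python
import io
import zipfile

# Hypothetical entry name and payload, used only for illustration.
arcname = '100-Test/17.jpg'
data = b'not really a jpeg'

buf = io.BytesIO()  # in-memory archive, so nothing is written to disk
with zipfile.ZipFile(buf, 'w') as zip_out:
    # writestr() takes the name inside the archive plus the bytes to store.
    zip_out.writestr(arcname, data)

with zipfile.ZipFile(buf) as zip_in:
    names = zip_in.namelist()
    stored = zip_in.read(arcname)

# write() treats the same name as a path on disk and fails if it is missing.
try:
    with zipfile.ZipFile(io.BytesIO(), 'w') as zip_out:
        zip_out.write(arcname)
    write_failed = False
except OSError:
    write_failed = True
```

The second half reproduces the kind of "No such file or directory" failure from the question, since write() looks for the name on the filesystem.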

Also, you can create your zip archives anywhere: use os.path.join() to join the target directory to the zip file name when building zipFileName. Example code that does what you want:

import os.path
import zipfile

zipFileNameSrc = '100-Test.zip'
# Open in binary mode; a zip archive is not text.
zipFile = open(zipFileNameSrc, 'rb')
unzippedFile = zipfile.ZipFile(zipFile)
# (".jpg" or ".JPG") evaluates to ".jpg", so compare the lowercased name instead.
imgList = [(s, unzippedFile.read(s)) for s in unzippedFile.namelist() if s.lower().endswith(".jpg")]
#image names: imgList[i][0]  and  images: imgList[i][1]

#...
#...additional logic to split into sets of 25 images
#...fileTuplesList = imgList[:25]
zipNo = 1
#zipFileDest = destination + "/" + zipSrcNm + "/" + zipNo.__str__() + ".zip"
zipFileName = os.path.join('<directory for zip>', str(zipNo) + ".zip")
zipOut = zipfile.ZipFile(zipFileName, 'w')
for fileNameAndPath, actualFile in fileTuplesList:
    # writestr() takes the archive name and the data to store under it.
    zipOut.writestr(fileNameAndPath, actualFile)
zipOut.close()

Demo code that worked on my system:

import zipfile
import os.path

zipFileNameSrc = 'ziptest.zip'
zipFile = open(zipFileNameSrc, 'rb')
unzippedFile = zipfile.ZipFile(zipFile)
# Keep only PNG entries; the extension check is case-insensitive.
imgList = [(s, unzippedFile.read(s)) for s in unzippedFile.namelist() if s.lower().endswith(".png")]
# Assumes the archive holds at least 100 PNG files: 4 output zips of 25 each.
for i in range(1,5):
    zipFileName = os.path.join('<some location>','ziptest_' + str(i) + '.zip')
    print('Creating ', zipFileName)
    zipOut = zipfile.ZipFile(zipFileName, 'w')
    for j in range(25):
        ind = (i-1)*25 + j
        fileNameAndPath = imgList[ind][0]
        actualFile = imgList[ind][1]
        zipOut.writestr(fileNameAndPath, actualFile)
    zipOut.close()
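The demo above hard-codes four archives of 25 files. As a rough generalization, the sketch below wraps the same writestr() approach in a function; split_zip, dest_dir and files_per_zip are names invented here for illustration, not part of the original code. It numbers the output archives 1.zip, 2.zip, ... as in the question, creates the destination folder if needed, and lets the last archive hold fewer files when the total does not divide evenly.

```python
import os
import zipfile


def split_zip(src_path, dest_dir, files_per_zip=25):
    """Split src_path into numbered archives in dest_dir; return their paths."""
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    created = []
    with zipfile.ZipFile(src_path) as src:
        # Skip directory entries, whose names end with '/'.
        names = [n for n in src.namelist() if not n.endswith('/')]
        for start in range(0, len(names), files_per_zip):
            zip_no = start // files_per_zip + 1
            out_path = os.path.join(dest_dir, '%d.zip' % zip_no)
            with zipfile.ZipFile(out_path, 'w') as out:
                # Read one member at a time instead of loading the whole archive.
                for name in names[start:start + files_per_zip]:
                    out.writestr(name, src.read(name))
            created.append(out_path)
    return created
```

A call such as split_zip('100-Test.zip', 'destination') would then cover both the split and the "different folder" bonus question, assuming the source archive sits next to the script.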
Anand S Kumar
0

You have the zipOut.write() parameters backwards. The first argument is the file you wish to write, and the second is the name you'd like to give it inside the archive (you can also omit it, in which case the filename itself is used).

fileNameAndPath = fileTuplesList[i][0]
actualFile = fileTuplesList[i][1]
zipOut.write(fileNameAndPath, actualFile)

https://docs.python.org/3.4/library/zipfile.html#zipfile.ZipFile.write

ZipFile.write(filename, arcname=None, compress_type=None)

Write the file named filename to the archive, giving it the archive name arcname (by default, this will be the same as filename, but without a drive letter and with leading path separators removed). If given, compress_type overrides the value given for the compression parameter to the constructor for the new entry. The archive must be open with mode 'w' or 'a' – calling write() on a ZipFile created with mode 'r' will raise a RuntimeError. Calling write() on a closed ZipFile will raise a RuntimeError.
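As a small illustration of that signature, the sketch below first creates a real file on disk and then adds it under a different archive name. The temporary directory, file names and placeholder bytes are assumptions made for the example only.

```python
import os
import tempfile
import zipfile

tmp_dir = tempfile.mkdtemp()

# Hypothetical file on disk; write() needs a real path to read from.
disk_path = os.path.join(tmp_dir, '17.jpg')
with open(disk_path, 'wb') as f:
    f.write(b'placeholder bytes')

zip_path = os.path.join(tmp_dir, 'out.zip')
with zipfile.ZipFile(zip_path, 'w') as zip_out:
    # First argument: path on disk. Second: name stored inside the archive.
    zip_out.write(disk_path, arcname='100-Test/17.jpg')

with zipfile.ZipFile(zip_path) as zip_in:
    names = zip_in.namelist()
```

This only helps when the data already lives in a file; bytes read from another archive would have to be saved to disk first before write() could pick them up.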

ApolloFortyNine
  • I should have mentioned that I already tried that, and got this: TypeError: must be encoded string without NULL bytes, not str – alfredox Aug 04 '15 at 15:34
  • Your path is malformed. See here: http://stackoverflow.com/questions/12591575/python-typeerror-must-be-encoded-string-without-null-bytes-not-str You have to escape the backslashes. Or don't put it inside a folder and just give the filename. – ApolloFortyNine Aug 04 '15 at 15:39