I have a 2 GB archive (preferably .zip or .rar) split into parts (say 100 parts of 20 MB each), and I am trying to find a way to unpack it properly. I started with a .zip archive, with files named test.zip, test.z01, test.z02, ..., test.z99. When I merge them in Python like this:
for zipName in zips:
    with open(os.path.join(path_to_zip_file, "test.zip"), "ab") as f:
        with open(os.path.join(path_to_zip_file, zipName), "rb") as z:
            f.write(z.read())
and then, after merging, unpack it like this:
with zipfile.ZipFile(os.path.join(path_to_zip_file, "test.zip"), "r") as zipObj:
    zipObj.extractall(path_to_zip_file)
I get errors like:
test.zip file isn't zip file.
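Two things in the merge loop above are worth ruling out: test.zip is used as the append target while also being one of the parts, and the parts need to be concatenated in volume order (.z01, .z02, ..., with .zip as the last segment). The following is only a minimal sketch under assumptions: the parts follow the test.zNN naming, the output goes to a separate file (merged.zip is a made-up name), and the helper names ordered_parts and merge_parts are hypothetical. Whether the concatenated file is a valid archive still depends on how the splitting tool wrote the parts.

```python
import os
import re
import shutil


def ordered_parts(folder, stem="test"):
    """Return part paths in volume order: stem.z01, stem.z02, ..., then stem.zip last."""
    pattern = re.compile(re.escape(stem) + r"\.z(\d+)$", re.IGNORECASE)
    numbered = []
    for name in os.listdir(folder):
        match = pattern.match(name)
        if match:
            # Sort numerically so that z10 comes after z9, not after z1.
            numbered.append((int(match.group(1)), name))
    parts = [os.path.join(folder, name) for _, name in sorted(numbered)]
    last = os.path.join(folder, stem + ".zip")
    if os.path.exists(last):
        parts.append(last)
    return parts


def merge_parts(parts, out_path):
    """Concatenate parts into out_path, streaming instead of reading each part whole."""
    # out_path must not be one of the parts, otherwise a part gets overwritten.
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return out_path
```

After merge_parts(ordered_parts(path_to_zip_file), os.path.join(path_to_zip_file, "merged.zip")), calling zipfile.is_zipfile on the merged file gives a quick check before attempting extractall.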
So then I tried a .rar archive. I first tried to unpack only the first part, to see whether my code would find and pick up the remaining archive parts on its own, but it did not. So I merged the .rar parts again (just like in the .zip case) and then tried to unpack the result using patoolib:
patoolib.extract_archive("test.rar", outdir="path here")
When I do that, I get errors like:
patoolib.util.PatoolError: could not find an executable program to extract format rar; candidates are (rar,unrar,7z)
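Note that this message is about a missing extraction program rather than a damaged archive: patool delegates the actual work to external tools. A minimal sketch for checking which of the candidates named in the error are available on PATH (find_rar_tools is a hypothetical helper name, and the candidate list is copied from the message above):

```python
import shutil


def find_rar_tools(candidates=("rar", "unrar", "7z")):
    """Map each candidate program name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in candidates}


if __name__ == "__main__":
    for name, location in find_rar_tools().items():
        print(f"{name}: {location or 'not found on PATH'}")
```

If every entry comes back as None, the error would appear for any .rar file, merged or not.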
After some work I figured out that these merged files are corrupted (I copied one and tried to unpack it normally on Windows using WinRAR, and ran into problems there too). So I tried other ways of merging, for example using cat:
cat test.part.* > test.rar
but that didn't help either.
How can I merge and then unpack these archive files properly in Python?