I have the following directory layout: the parent directory contains several folders, say A, B, C and D, and each folder contains many zips. Each zip name includes the letter of its parent folder along with other info, as shown:

-parent--A-xxxAxxxx_timestamp.zip
          -xxxAxxxx_timestamp.zip
          -xxxAxxxx_timestamp.zip
       --B-xxxBxxxx_timestamp.zip
          -xxxBxxxx_timestamp.zip
          -xxxBxxxx_timestamp.zip
       --C-xxxCxxxx_timestamp.zip
          -xxxCxxxx_timestamp.zip
          -xxxCxxxx_timestamp.zip
       --D-xxxDxxxx_timestamp.zip
          -xxxDxxxx_timestamp.zip
          -xxxDxxxx_timestamp.zip

I need to unzip only selected zips in this tree and place the contents in the same directory, in a folder with the same name minus the .zip extension.

Output:

-parent--A-xxxAxxxx_timestamp
          -xxxAxxxx_timestamp
          -xxxAxxxx_timestamp
       --B-xxxBxxxx_timestamp
          -xxxBxxxx_timestamp
          -xxxBxxxx_timestamp
       --C-xxxCxxxx_timestamp
          -xxxCxxxx_timestamp
          -xxxCxxxx_timestamp
       --D-xxxDxxxx_timestamp
          -xxxDxxxx_timestamp
          -xxxDxxxx_timestamp

My effort:

for path in glob.glob('./*/xxx*xxxx*'): ##walk the dir tree and find the files of interest

    zipfile=os.path.basename(path) #save the zipfile path
    zip_ref=zipfile.ZipFile(path, 'r') 
    zip_ref=extractall(zipfile.replace(r'.zip', '')) #unzip to a folder without the .zip extension

The problem is that I don't know how to capture the A, B, C, D, etc. so that I can include them in the path where the files are unzipped. As a result, the unzipped folders are created in the parent directory. Any ideas?

Anand S Kumar
balalaika
  • Instead of trying to do it in one go, first get the list of all folders inside `.`, then get the list of all files inside each folder and check whether the folder name occurs in each file name. – Anand S Kumar Aug 17 '17 at 10:53

2 Answers


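Your code is nearly right; you just need to make sure that you are not overwriting names and that you use the correct ones. In your version, the variable `zipfile` shadows the `zipfile` module, `extractall` is assigned to `zip_ref` instead of being called as a method of it, and the target is built from the basename, which drops the folder. The following code works for me: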

import glob
import zipfile

# Find the zips of interest one level below the current directory
for path in glob.glob('./*/xxx*xxxx*'):
    # Build the target from the full path, so the A/B/C/D folder is kept,
    # and close the archive automatically when done
    with zipfile.ZipFile(path, 'r') as zip_ref:
        zip_ref.extractall(path.replace('.zip', ''))  # unzip to a folder without the .zip extension
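
To illustrate why the folders ended up in the parent directory, here is a minimal sketch comparing the two ways of building the target. The helper names and the example file name are hypothetical, chosen only to mirror the layout in the question.

```python
import os

def target_from_basename(path):
    # Roughly what the question's code did: the folder part is discarded,
    # so the result is relative to the current (parent) directory
    return os.path.basename(path).replace('.zip', '')

def target_from_full_path(path):
    # Keep the folder and strip only the trailing extension
    return os.path.splitext(path)[0]

# Hypothetical file name following the question's naming scheme
example = os.path.join('.', 'A', 'xxxAxxxx_timestamp.zip')
print(target_from_basename(example))
print(target_from_full_path(example))
```

The first helper returns a bare folder name, while the second keeps the `A` component, which is what puts the extracted folder next to its archive.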
BlueEagle

Instead of trying to do it in a single statement, it would be much easier and more readable to first get the list of all folders and then get the list of files inside each folder. Example:

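import glob
import os.path

for folder in glob.glob("./*"):
    # glob returns "./A", so take the last component to get the bare folder name
    folder_name = os.path.basename(folder)
    # Using *.zip to only get zip files
    for path in glob.glob(os.path.join(folder, "*.zip")):
        filename = os.path.basename(path)
        if folder_name in filename:
            # Do your logic
            pass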
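
As a rough sketch of how the placeholder could be filled in, the function below walks each folder, keeps only zips whose name contains the folder name, and extracts each one into a sibling folder without the .zip extension. The function name `extract_matching_zips` and its `parent` argument are made up for this example, and matching on "folder name occurs in file name" is an assumption taken from the question's description.

```python
import os
import zipfile

def extract_matching_zips(parent):
    """Extract each zip whose name contains its folder's name, next to the archive."""
    extracted = []
    for folder in sorted(os.listdir(parent)):
        folder_path = os.path.join(parent, folder)
        if not os.path.isdir(folder_path):
            continue  # skip plain files in the parent directory
        for name in sorted(os.listdir(folder_path)):
            # Only zips whose name includes the folder name (A, B, C, ...)
            if not name.endswith('.zip') or folder not in name:
                continue
            path = os.path.join(folder_path, name)
            target = os.path.splitext(path)[0]  # same path minus ".zip"
            with zipfile.ZipFile(path) as zip_ref:
                zip_ref.extractall(target)
            extracted.append(target)
    return extracted
```

Calling `extract_matching_zips('.')` from the parent directory would process every subfolder; the returned list of target paths is only there to make the result easy to inspect.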
Anand S Kumar