I need to os.walk from my parent path (tutu) through all of its subfolders. In each branch, the deepest subfolders contain the files that I need to process with my code. Every deepest folder that has files uses the same file 'layout': one *.adf.txt file, one *.idf.txt file, one *.sdrf.txt file and one or more *.dat files, as the pictures show.

My problem is that I don't know how to use the os module to iterate, from my parent folder, over all subfolders sequentially. I need a function that, for the current subfolder in os.walk, if that subfolder is empty, continue to the sub-subfolder inside that subfolder, if it exists. If it exists, verify whether the file layout is present (this is no problem...) and, if it is, apply my code (no problem either). If not, and if that folder has no more subfolders, return to the parent folder and os.walk to the next subfolder, and so on for all subfolders of my parent folder (tutu). To sum up, I need a function like the one below (written in a Python/imaginary-code hybrid):

for all folders in tutu:
    if os.havefiles in os.walk(current_path):  # 'havefiles' doesn't exist, I think...
        for filename in os.walk(current_path):
            if 'adf' in filename:
                etc...
                #my code
    elif:
        while true:
            go deep
    else:
        os.chdir(parent_folder)

Do you think it would be best to write a function (def) that I call from my code to do this job?

This is the code that I've tried to use, without success, of course:

import csv
import os
import fnmatch

abs_path=os.path.abspath('.')
for dirname, subdirs, filenames in os.walk('.'):
    # print path to all subdirectories first.
    for subdirname in subdirs:
        print os.path.join(dirname, subdirname), 'os.path.join(dirname, subdirname)'
        current_path= os.path.join(dirname, subdirname)
        os.chdir(current_path)
        for filename in os.walk(current_path):
            print filename, 'f in os.walk'
            if os.path.isdir(filename)==True:
                break
            elif os.path.isfile(filename)==True:
                print filename, 'file'
        #code here

Thanks in advance...

BioInfoPT

2 Answers

I need a function that, for the current subfolder in os.walk, if that subfolder is empty, continue to the sub-subfolder inside that subfolder, if it exists.

This doesn't make any sense. If a folder is empty, it doesn't have any subfolders.

Maybe you mean that if it has no regular files, then recurse into its subfolders, but if it has any, don't recurse, and instead check the layout?

To do that, all you need is something like this:

import collections
import os

for dirname, subdirs, filenames in os.walk('.'):
    if filenames:
        # can't use os.path.splitext, because that will give us .txt instead of .adf.txt
        # partition('.') drops the first dot, so 'x.adf.txt' is counted as 'adf.txt'
        extensions = collections.Counter(filename.partition('.')[-1]
                                         for filename in filenames)
        if (extensions['adf.txt'] == 1 and extensions['idf.txt'] == 1 and
            extensions['sdrf.txt'] == 1 and extensions['dat'] >= 1 and
            len(extensions) == 4):
            # got a match, do what you want
            pass

        # Whether this is a match or not, prune the walk.
        del subdirs[:]

I'm assuming here that you only want to find directories that have exactly the specified files, and no others. To remove that last restriction, just remove the len(extensions) == 4 part.
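
Since the question asks whether this should be a function, here is a minimal sketch of the same check packaged as one. The names `has_layout`, `find_layout_dirs` and the `strict` flag are hypothetical, not from the question. It matches on suffixes with `str.endswith`, so it doesn't depend on where the first dot falls in a filename, and it assumes the data files end in `.dat`:

```python
import os

REQUIRED_ONCE = ('.adf.txt', '.idf.txt', '.sdrf.txt')  # exactly one of each


def has_layout(filenames, strict=True):
    """Return True if filenames match the layout described in the question."""
    names = list(filenames)
    for suffix in REQUIRED_ONCE:
        if sum(1 for name in names if name.endswith(suffix)) != 1:
            return False
    dat_count = sum(1 for name in names if name.endswith('.dat'))
    if dat_count < 1:
        return False
    if strict:
        # No files other than the three .txt files and the .dat files.
        return len(names) == len(REQUIRED_ONCE) + dat_count
    return True


def find_layout_dirs(root, strict=True):
    """Yield each directory under root whose files match the layout."""
    for dirname, subdirs, filenames in os.walk(root):
        if filenames:
            if has_layout(filenames, strict):
                yield dirname
            # Same pruning rule as above: stop descending once files are found.
            del subdirs[:]
```

You would then loop over `find_layout_dirs('tutu')` and run your processing code on each directory it yields. Passing `strict=False` corresponds to dropping the `len(extensions) == 4` restriction.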

There's no need to explicitly iterate over subdirs or anything, or recursively call os.walk from inside os.walk. The whole point of walk is that it's already recursively visiting every subdirectory it finds, except when you explicitly tell it not to (by pruning the list it gives you).
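
To see this behaviour concretely, here is a small sketch that records which directories `os.walk` visits with and without pruning. The helper name `visited_dirs` and its `prune` flag are made up for illustration. It relies on the default `topdown=True`, which is what lets in-place edits to `subdirs` affect the walk:

```python
import os


def visited_dirs(root, prune=False):
    """Return the directories os.walk visits under root, relative to root."""
    seen = []
    for dirname, subdirs, filenames in os.walk(root):
        seen.append(os.path.relpath(dirname, root))
        if prune and filenames:
            # Emptying the list in place tells os.walk not to descend further.
            del subdirs[:]
    return seen
```

Note that the list has to be modified in place. Rebinding the name with `subdirs = []` has no effect on the walk, because `os.walk` still holds the original list.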

abarnert

os.walk will automatically "dig down" recursively, so you don't need to recurse the tree yourself.

I think this should be the basic form of your code:

import csv
import os
import fnmatch

directoriesToMatch = [list here...]
filenamesToMatch = [list here...]

abs_path=os.path.abspath('.')
for dirname, subdirs, filenames in os.walk('.'):
    if len(set(directoriesToMatch).difference(subdirs))==0:     # all dirs are there
        if len(set(filenamesToMatch).difference(filenames))==0: # all files are there
            if <any other filename/directory checking code>:
                # processing code here ...

And according to the Python documentation, if for whatever reason you don't want to continue recursing, just delete entries from subdirs in place (this works with the default topdown=True): http://docs.python.org/2/library/os.html

If you instead want to check that there are NO sub-directories where you find your files to process, you could also change the dirs check to:

    if len(subdirs)==0: # check that this directory has no sub-directories
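
For instance, if the goal is only to visit the deepest folders, a sketch along these lines yields every directory that has no sub-directories, together with its files. The function name `leaf_dirs` is hypothetical:

```python
import os


def leaf_dirs(root):
    """Yield (dirname, filenames) for each directory without sub-directories."""
    for dirname, subdirs, filenames in os.walk(root):
        if not subdirs:  # equivalent to len(subdirs) == 0
            yield dirname, filenames
```

A leaf directory found this way may still contain no files, so the layout check is still needed on `filenames`.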

I'm not sure I quite understand the question, so I hope this helps!

Edit:

Ok, so if you need to check there are no files instead, just use:

    if len(filenames)==0:

But as I stated above, it would probably be better to just look FOR specific files instead of checking for empty directories.

David