I have a root folder with the following structure:

root
    A 
    |-30
       |-a.txt
       |-b.txt 
    |-90
       |-a.txt
       |-b.txt  
    B
    |-60
       |-a.txt
       |-b.txt  
    C
    |-200
       |-a.txt
       |-b.txt  
    |-300 
       |-a.txt
       |-b.txt 

I want to find all first-level subfolders (A, B, C, ...) that contain a numbered subfolder (such as 30, 90, ...) whose number is no greater than 60, and then copy every file named a.txt from those numbered subfolders to another directory. The output should look like this:

root_filter
    A 
    |-30
       |-a.txt
    B
    |-60
       |-a.txt

I am using Python, but I cannot get this result. Here is what I have so far:

import os

dataroot = './root'
for dir, dirs, files in os.walk(dataroot):
    print(dir, os.path.isdir(dir))
KimHee
  • Why check whether `dir` is a directory? It always is. You do not check any file names, convert anything to a number, or copy anything, so the code does none of what the task requires. What is your specific problem? As it stands, we would need to write all of it for you, and we generally do not do that. Please see the How to Ask help page and the blog post [The perfect question](http://codeblog.jonskeet.uk/2010/08/29/writing-the-perfect-question/) by Jon Skeet, and edit your question to focus on a specific problem we _can_ answer. – Patrick Artner Mar 29 '19 at 22:24
  • The problem is that I cannot find subfolders such as A and 30. How do I find them? – KimHee Mar 29 '19 at 22:28
  • Read the documentation of `os.walk`: `dirs` is a list of the subdirectories of the directory currently being visited, and the walk steps into all of them recursively. `dir` holds the directory you are currently looking at, which is why it is normally called `root`. `dirs` and `files` are lists; print them. – Patrick Artner Mar 29 '19 at 22:30

1 Answer

Essentially, you just need to run a few checks on where you are, before copying the file. This can be accomplished with a number of clauses in a simple if statement inside the for loop:

import shutil
import os

dataroot = './root'
target_dir = './some_other_dir'
for dir, dirs, files in os.walk(dataroot):
    # First, we want to check if we're at the right level of directory
    #   we can do this by counting the number of '/' in the name of the directory
    #   (the distance from where we started). For ./root/A/30 there should be 
    #   3 instances of the character '/'
    # Then, we want to ensure that the folder is numbered 60 or less.
    #   to do this, we isolate the name of *this* folder, cast it to an int,
    #   and then check its value
    # Finally, we check if 'a.txt' is in the directory
    if dir.count('/') == 3 and int(dir[dir.rindex('/')+1:]) <= 60 and 'a.txt' in files:
        shutil.copy(os.path.join(dir, 'a.txt'), target_dir)

You'll need to work something up to name the files when you copy them to target_dir, so they don't overwrite each other. That depends on your use case.
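One possible way to handle the naming problem, sketched here under assumptions rather than as part of the original answer, is to recreate each file's `A/30`-style relative path under the target directory, which also matches the `root_filter` layout shown in the question. The function name `copy_filtered` and its `limit` and `filename` parameters are hypothetical. The depth check uses `os.path.relpath` instead of counting `/`, so it should not depend on how `dataroot` is written or on the OS separator.

```python
import os
import shutil


def copy_filtered(dataroot, target_dir, limit=60, filename='a.txt'):
    """Copy <dataroot>/<X>/<N>/<filename> to <target_dir>/<X>/<N>/ when N <= limit.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns the list of destination paths that were written.
    """
    copied = []
    for current, dirs, files in os.walk(dataroot):
        # Path of the current directory relative to the starting point,
        # so the check does not depend on how dataroot itself is written.
        parts = os.path.relpath(current, dataroot).split(os.sep)
        # Only directories exactly two levels down (e.g. A/30) qualify.
        if len(parts) != 2:
            continue
        number = parts[1]
        # Skip non-numeric folder names instead of letting int() raise.
        if not number.isdecimal() or int(number) > limit:
            continue
        if filename in files:
            # Rebuild the same A/30 structure under the target directory,
            # so files with the same name cannot overwrite each other.
            dest_dir = os.path.join(target_dir, *parts)
            os.makedirs(dest_dir, exist_ok=True)
            copied.append(shutil.copy(os.path.join(current, filename), dest_dir))
    return copied
```

Calling `copy_filtered('./root', './root_filter')` on the tree from the question would be expected to create `root_filter/A/30/a.txt` and `root_filter/B/60/a.txt` and leave everything else alone.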

Green Cloak Guy
  • Great idea, thanks. Can the number of `/` be counted automatically? For example, if the root is `/home/hee/data/root`, the number has to change. – KimHee Mar 29 '19 at 22:36
  • You could maybe replace `3` with `(dataroot.count('/') + 2)`, since you want 2 levels down from your root directory. But in general `os.walk()` works with relative directories: if you put in `'./root'` then it'll spit out directories that start with `./root/...`. – Green Cloak Guy Mar 29 '19 at 22:37
  • As another side note, this code may not work on Windows, since Windows uses backslashes instead. In that case you could change `'/'` to `os.sep`, which stores the "separator character" used by the OS you're currently running on. – Green Cloak Guy Mar 29 '19 at 22:39
  • Thanks. I got the error `ValueError: invalid literal for int() with base 10: ` – KimHee Mar 29 '19 at 22:43
  • I think the problem is the line `int(dir[dir.index('/')+1:])`. The full error is `ValueError: invalid literal for int() with base 10: root/A/30` – KimHee Mar 29 '19 at 22:48
  • oh, whoops. Should be using `rindex()` to get the *last* index of the character instead of `index()` to get the first index. I'll change my solution. – Green Cloak Guy Mar 29 '19 at 22:49
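
To make the points from the comments above concrete, here is a small sketch of why `index()` produced the `ValueError` while `rindex()` does not, plus one way to measure depth relative to the root rather than counting separators in the full path. The helper name `depth_below` is hypothetical, and the expected values in the comments assume a POSIX-style `/` separator.

```python
import os

path = './root/A/30'

# index() finds the FIRST '/', so the slice still contains the whole tail
# ('root/A/30'), which int() cannot parse.
first = path[path.index('/') + 1:]

# rindex() finds the LAST '/', leaving only the final folder name ('30').
last = path[path.rindex('/') + 1:]

# os.path.basename returns the same final component without manual slicing.
name = os.path.basename(path)


def depth_below(root, path):
    """Number of directory levels between root and path (0 if they are the same).

    Hypothetical helper, shown only to illustrate a root-independent depth check.
    """
    rel = os.path.relpath(path, root)
    return 0 if rel == os.curdir else len(rel.split(os.sep))


print(first, last, name)
print(depth_below('/home/hee/data/root', '/home/hee/data/root/A/30'))
```

With a check such as `depth_below(dataroot, dir) == 2`, the hard-coded `3` would no longer need adjusting when the root is given as an absolute path.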