
Let's say /tmp has subdirectories test1, test2, test3 and so on, and each has multiple files inside.

I have to run a while or for loop that finds the names of the subdirectories of /tmp (in this case test1, test2, ...) and, for each one, runs a command that processes all the files inside it.

How can I do this?


Clarification:

This is the command that I want to run:

find /PROD/140725_D0/ -name "*.json" -exec /tmp/test.py {} \;

where 140725_D0 is an example of one subdirectory to process - there are multiples, with different names.

So, by using a for or while loop, I want to find all subdirectories and run a command on the files in each.

The for or while loop should iteratively replace the hard-coded name 140725_D0 in the find command above.

Young
  • I've cleaned up your question; if you look at the question's source now, you should be able to infer how code is formatted. – mklement0 Mar 04 '15 at 17:24

5 Answers


You should be able to do this with a single find command with an embedded shell command:

find /PROD -type d -execdir sh -c 'for f in *.json; do /tmp/test.py "$f"; done' \;

Note: -execdir is not POSIX-compliant, but the BSD (OSX) and GNU (Linux) versions of find support it; see below for a POSIX alternative.

  • The approach is to let find match directories, and then, in each matched directory, execute a shell with a file-processing loop (sh -c '<shellCmd>').
  • If not all subdirectories are guaranteed to have *.json files, change the shell command to for f in *.json; do [ -f "$f" ] && /tmp/test.py "$f"; done
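The `[ -f "$f" ]` guard works because, when a glob matches nothing, the shell leaves the pattern in place as a literal string, which is not an existing file. A minimal sketch showing the guard in isolation, using a throwaway tree (the /tmp/guard_demo path is made up for the demo):

```shell
# Build a throwaway tree: one subdirectory with a *.json file, one without
mkdir -p /tmp/guard_demo/empty /tmp/guard_demo/full
touch /tmp/guard_demo/full/a.json

for d in /tmp/guard_demo/*/; do
  ( cd "$d" || exit
    for f in *.json; do
      # In the 'empty' dir, $f stays the literal '*.json', so -f fails
      [ -f "$f" ] && echo "processing in $d: $f"
    done )
done
```

Only the `full` directory produces a "processing" line; the `empty` directory is skipped silently.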

Update: Two more considerations; tip of the hat to kenorb's answer:

  • By default, find processes the entire subtree of the input directory. To limit matching to immediate subdirectories, use -maxdepth 1[1]:

    find /PROD -maxdepth 1 -type d ...
    
  • As stated, -execdir - which runs the command passed to it in the directory currently being processed - is not POSIX compliant; you can work around this by using -exec instead and by including a cd command with the directory path at hand ({}) in the shell command:

    find /PROD -type d -exec sh -c 'cd "{}" && for f in *.json; do /tmp/test.py "$f"; done' \;
    

[1] Strictly speaking, you can place the -maxdepth option anywhere after the input file paths on the find command line - as an option, it is not positional. However, GNU find will issue a warning unless you place it before tests (such as -type) and actions (such as -exec).
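A quick way to see the effect of -maxdepth 1 is a throwaway tree (all paths here are made up for the demo):

```shell
# Create a directory with a nested subdirectory two levels down
demo=$(mktemp -d)
mkdir -p "$demo/top/nested"

# Prints $demo and $demo/top, but NOT $demo/top/nested
find "$demo" -maxdepth 1 -type d
```

Dropping -maxdepth 1 would make find descend into the entire subtree and also report `nested`.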

mklement0

Try the following usage of find:

find . -type d -exec sh -c 'cd "{}" && echo Do some stuff for {}, files are: $(ls *.*)' ';'

Use -maxdepth if you'd like to limit your directory levels.

kenorb
    Kudos for `-maxdepth` and the POSIX-compliant `-execdir` alternative (`-exec` + `cd "{}"` inside shell command). – mklement0 Mar 04 '15 at 17:49

You can do this using bash's subshell feature like so

for i in /tmp/test*; do
  # don't do anything if there's no /tmp/test* directory at all
  [ "$i" != "/tmp/test*" ] || continue

  for j in "$i"/*.json; do
    # don't do anything if there's nothing to run
    [ "$j" != "$i/*.json" ] || continue

    (cd "$i" && ./file_to_run)
  done
done

When you wrap a command in ( and ), it runs in a subshell. A subshell is a separate copy of the current shell, so changes made inside it, such as cd, do not affect the parent shell.
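A quick demonstration that a cd inside a subshell leaves the parent shell's working directory untouched:

```shell
# Record the working directory, cd elsewhere inside a subshell,
# then confirm the parent shell's directory is unchanged
before=$(pwd)
( cd /tmp )
after=$(pwd)
[ "$before" = "$after" ] && echo "parent directory unchanged"
```

This is why the (cd "$i" && ...) pattern above needs no compensating `cd` back afterwards.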

randomusername
  • I meant to say: let's say you don't know whether /tmp contains any /test* directories; you need to check the directory to see what is in there and process each subdirectory. Also, you don't know what files are in there; say each subdirectory contains a few JSON files; you need to process all of them. – Young Mar 04 '15 at 16:35
  • So I think it might be better to use a while loop, so that I can check whether there are any subdirectories and, if there are, find their names and process them. – Young Mar 04 '15 at 16:38
  • Sorry for the confusion. This is the command that I want to run: `find /PROD/140725_D0/ -name "*.json" -exec /tmp/test.py {} \;`, where 140725_D0 is one of the subdirectories, and there are multiple of them (each with a different name). So, using a for or while loop, I want to find all those directories and run that command; the for or while loop should replace that directory name in the find command. – Young Mar 04 '15 at 16:54
  • Two suggestions: double-quote `$i` in `cd $i`; instead of repeating the glob to test whether anything matched, use `[ -e "$i" ] || continue`, for instance (or, more bash-like, `[[ -e $i ]] || continue`). Better yet, use `[[ -d $i ]]` and `[[ -f $j ]]` to limit matches to the type of interest. – mklement0 Mar 04 '15 at 17:32

You can also simply ask the shell to expand the directories/files you need, e.g. using the xargs command:

echo /PROD/*/*.json | xargs -n 1 /tmp/test.py

or even using your original find command:

find /PROD/* -name "*.json" -exec /tmp/test.py {} \;

Both commands will process all JSON files contained in any subdirectory of /PROD.
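One caveat with the echo | xargs form: xargs splits its input on whitespace, so file names containing spaces break. If your find and xargs support -print0/-0 (GNU and BSD versions both do), a more robust sketch looks like this; the demo tree and the stand-in command are made up, since /PROD and /tmp/test.py are specific to the question:

```shell
# Throwaway tree with a file name that contains a space
demo=$(mktemp -d)
mkdir -p "$demo/140725_D0"
printf '{}' > "$demo/140725_D0/a b.json"

# NUL-delimited handoff; 'sh -c echo' stands in for /tmp/test.py
find "$demo" -name '*.json' -print0 | xargs -0 -n 1 sh -c 'echo "got: $1"' _
```

The file with the embedded space arrives as a single, intact argument.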

davidedb

Another solution is to slightly change the Python code inside your script so that it accepts and processes multiple files. For example, if your script contains something like:

def process(fname):
    print 'Processing file', fname

if __name__ == '__main__':
    import sys
    process(sys.argv[1])

you could replace the last line with:

    for fname in sys.argv[1:]:
        process(fname)

After this simple modification, you can call your script this way:

/tmp/test.py /PROD/*/*.json

and have it process all the desired JSON files.
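As a sanity check that the glob really delivers one argument per file to the script, here is a small shell sketch (throwaway tree; the directory and file names are made up):

```shell
# Two subdirectories, one JSON file each
demo=$(mktemp -d)
mkdir -p "$demo/d1" "$demo/d2"
touch "$demo/d1/a.json" "$demo/d2/b.json"

# 'set --' expands the glob exactly as the shell would for /tmp/test.py
set -- "$demo"/*/*.json
echo "argument count: $#"
```

This prints an argument count of 2, matching what the modified script would see in sys.argv[1:].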

davidedb