9

Using Bash, how can you traverse the folders within a specified folder, find all files of a specified file type, and, each time a file is found, get the full file path with the file name and the full path without the file name as variables, pass them to another Bash script, execute it, and then continue searching for the next file?

  • 3
    In Unix, there are no "filetypes". If you wish, you can put mnemonic contraptions in the name, like ".txt", ".xml", ".unicode"; but that's just part of the name, and only for the benefit of the user. – Javier Jul 15 '09 at 15:18

5 Answers

18

Assuming a GNU find (which is not unreasonable) you can do this using just find:

find /path -type f -name '*.ext' -exec my_cool_script \{\} \;
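
The command above hands only the full path to the script. Below is a minimal sketch of one way to pass the directory part as well, using an inline sh -c wrapper and the ${file%/*} parameter expansion. The names demo_root and handler.sh are hypothetical stand-ins, created here only so the example runs on its own; replace handler.sh with your own script.

```shell
#!/usr/bin/env bash
# Demo setup: a throwaway tree and a stand-in for the script to be called.
# demo_root, handler.sh and the *.ext files are hypothetical names.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/sub dir"
touch "$demo_root/a.ext" "$demo_root/sub dir/b c.ext" "$demo_root/skip.txt"

cat > "$demo_root/handler.sh" <<'EOF'
#!/usr/bin/env bash
# $1 = full path including the file name, $2 = directory part only
printf 'file=%s dir=%s\n' "$1" "$2"
EOF
chmod +x "$demo_root/handler.sh"

# find passes batches of matches to the inline sh script; the loop then
# calls the handler once per file with the path and its directory.
result=$(find "$demo_root" -type f -name '*.ext' -exec sh -c '
  handler=$1; shift
  for file do
    "$handler" "$file" "${file%/*}"
  done
' sh "$demo_root/handler.sh" {} + | sort)

printf '%s\n' "$result"
rm -rf "$demo_root"
```

Because the file names travel as arguments rather than through a text pipe, spaces and other special characters in names should not need extra handling.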
unwind
  • 1
    What is the purpose of \{\} there? Any advantage over {} or '{}'? – ustun Jul 16 '09 at 12:13
  • That is a bit of extra safety against the shell. The manual page for find states: "Both of these constructions might need to be escaped (with a `\') or quoted to protect them from expansion by the shell." – unwind Jul 16 '09 at 12:40
  • The problem with that is that find with -exec does not correctly handle long lists of files. I've run into something like "line too long". See my detailed answer ... – neuro Jul 16 '09 at 14:51
  • @neuro: Really? `-exec ;` executes one-at-a-time, and if that command line is too long, you're pretty screwed with either `find` or `xargs`. Now, GNU findutils has had `-exec +` bugs in the past, but they should be resolved now, and the difference between `find -exec +` versus `find -print0 | xargs -0` is pretty minimal. – ephemient Jul 16 '09 at 20:56
  • @ephemient: Well, I've had this sort of error in the past, so I've used xargs since then. I probably hit some bug at the time. I used to use -exec, but I find xargs more readable than -exec and its escape sequences. The -print0 | xargs -0 combination is a magical way to handle spaces in names, particularly when you use your script in a windows/mingw/cygwin environment ... – neuro Oct 21 '09 at 16:58
6

find is the way. Using xargs handles long lists of files/dirs. Moreover, to correctly handle names with spaces and similar problems, the best find command line I've found is:

find "${directory}" -name "${pattern}" -print0 | xargs -0 ${my_command}

The trick is that find -print0 is compatible with xargs -0: it terminates each name with '\0' instead of a newline, so spaces and escape characters are handled correctly. Using xargs also spares you "line too long" messages when your file list is too long.

You can use xargs with --no-run-if-empty to handle empty lists and --replace to manage complex commands.
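
As a rough sketch of how those options fit together, assuming GNU xargs: -r is the short form of --no-run-if-empty, and -I {} plays the role of --replace. The demo tree under demo_root is hypothetical, and the inline sh -c command stands in for your own script.

```shell
#!/usr/bin/env bash
# Throwaway demo tree; every name below is made up for illustration.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/with space"
touch "$demo_root/one.ext" "$demo_root/with space/two.ext"

# -print0 / -0 keep names intact even when they contain spaces.
# -r skips the command when find matches nothing.
# With -I {}, xargs runs the command once per file name.
result=$(
  find "$demo_root" -type f -name '*.ext' -print0 |
    xargs -0 -r -I {} sh -c 'printf "file=%s dir=%s\n" "$1" "$(dirname -- "$1")"' sh {} |
    sort
)

printf '%s\n' "$result"
rm -rf "$demo_root"
```

Passing {} as a positional argument to sh -c, rather than pasting it into the command string, keeps unusual file names from being interpreted as shell code.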

neuro
2

If you have GNU Parallel http://www.gnu.org/software/parallel/ installed you can do this:

find . -name '*.ext' | parallel echo {} '`dirname {}`'

Substitute echo with your favorite bash command and ext with the file extension you are looking for.

Watch the intro video for GNU Parallel to learn more: http://www.youtube.com/watch?v=OpaiGYxkSuQ

Ole Tange
0

looks very much like homework.

find /path -type f -name "*.ext" -printf "%p:%h\n" | while IFS=: read -r a b
do
   # a = full path with file name, b = path without file name
   my_script "$a" "$b"   # replace my_script with your Bash script
done

Read the man page of find for more -printf options.
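
A colon or newline inside a file name would confuse the loop above. One possible variant, sketched here under the assumption that bash and a find with -print0 are available, reads NUL-terminated names and derives the directory with parameter expansion. The demo_root tree and the lines array are hypothetical and exist only to make the example self-contained.

```shell
#!/usr/bin/env bash
# Throwaway demo tree with awkward names (colon, spaces); all hypothetical.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/odd:dir"
touch "$demo_root/odd:dir/name with space.ext" "$demo_root/plain.ext"

count=0
lines=()
# read -d '' splits on NUL, the one byte that cannot occur in a path.
# Process substitution keeps the loop in the current shell, so count
# and lines are still set after the loop ends.
while IFS= read -r -d '' file; do
  dir=${file%/*}                      # path without the file name
  lines+=("file=$file dir=$dir")      # call your script here instead
  count=$((count + 1))
done < <(find "$demo_root" -type f -name '*.ext' -print0)

printf '%s\n' "${lines[@]}"
rm -rf "$demo_root"
```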

ghostdog74
  • That doesn't seem quite right... if `%p:%h` has no spaces, `read a b` will put it all in `a` and nothing in `b`, unless you set `IFS=:`. – ephemient Jul 15 '09 at 15:04
  • Decided to go back to using : since there might be files with spaces – ghostdog74 Jul 15 '09 at 15:07
  • There can easily be files with colons too. I would use `'%p\n%h\n'` and `read a && read b`, but that still fails if there are filenames with embedded newlines. – ephemient Jul 15 '09 at 15:20
  • wow, then there can be many possibilities of file names with special characters , from the way you say it. :) – ghostdog74 Jul 15 '09 at 15:40
  • 1
    In UNIX, any byte sequence (possibly with a length restriction) not containing NUL ("\0") or '/' is acceptable for a file name, even if they are not printable characters. – ephemient Jul 15 '09 at 17:21
  • 1
    Beware of escape characters. Use find -print0 with xargs -0. see my more precise answer ... – neuro Jul 16 '09 at 14:47
-1

As some have already mentioned, Linux/UNIX has no mandatory file extensions. However, on UNIX-based systems the type of many commonly encountered files can be determined by the file command. The following example employs the file command to pass the name and type of each file to my_script:

    find /path -type f | xargs file | while read -r line ; do my_script "$line"; done

The xargs command has been used to minimise the number of exec's of the file command.

The type info will be quite extensive, as can be seen in the following sample output from the xargs file step:

   /usr/bin/easy_install-3.8:                   Python script, ASCII text executable
   /usr/bin/splain:                             Perl script text executable
   /usr/bin/nvzoom:                             ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=e2295b7d737246db5a2986c790452139793bdbfc, for GNU/Linux 3.2.0, stripped

The other requirement, extracting the path from the file name, could be handled inside my_script or, if it has to happen before my_script runs, by using bash string extraction or sed/awk/perl to reformat the output.
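
As an illustration of the bash string extraction route, here is a small sketch that splits one line of the sample output shown earlier into path, directory, file name, and type. The variable names are hypothetical, and the split assumes the path itself contains no colon.

```shell
#!/usr/bin/env bash
# Sample line taken from the `xargs file` output shown earlier.
line='/usr/bin/splain:                             Perl script text executable'

path=${line%%:*}                       # everything before the first colon
type=${line#*:}                        # everything after the first colon
type=${type#"${type%%[![:space:]]*}"}  # strip the leading padding
dir=${path%/*}                         # path without the file name
name=${path##*/}                       # file name alone

printf 'path=%s\ndir=%s\nname=%s\ntype=%s\n' "$path" "$dir" "$name" "$type"
```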

Should you expect any filenames with spaces or other special characters in them, then as suggested in previous answers, you can use -print0:

    find /path -type f -print0 | xargs -0 file | while read -r line ; do my_script "$line"; done