-2

I have the following script and it has an error. I am trying to merge all the files into one large file. From the command line the cat commant works fine and the content is printed to the redirected file. From script it is working sometime but not the other time. I dont know why its behaving abnormally. Please help.

#!/bin/bash

### For loop starts ###

for D in `find . -type d`
do

        combo=`find $D -maxdepth 1 -type f -name "combo.txt"`
        cat $combo >> bigcombo.tsv

done

Here is the output of bash -x app.sh

++ find . -type d
+ for D in '`find . -type d`'
++ find . -maxdepth 1 -type f -name combo.txt
+ combo=
+ cat
^C

UPDATE: The following worked for me. There was issue with the path. I still dont know what was the issue so answer is welcome.

#!/bin/bash

### For loop starts ###
rm -rf bigcombo.tsv

for D in `find . -type d`
do

                psi=`find $D -maxdepth 1 -type f -name "*.psi_filtered"`
                # This will give us only the directory path from find result i.e. removing filename.
                directory=$(dirname "${psi}")
                cat $directory"/selectedcombo.txt" >> bigcombo.tsv


done
Dr. Mian
  • 3,334
  • 10
  • 45
  • 69
  • 4
    Also see [How to use Shellcheck](https://github.com/koalaman/shellcheck), [How to debug a bash script?](https://unix.stackexchange.com/q/155551/56041) (U&L.SE), [How to debug a bash script?](https://stackoverflow.com/q/951336/608639) (SO), [How to debug bash script?](https://askubuntu.com/q/21136) (AskU), [Debugging Bash scripts](http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_02_03.html), etc. – jww Jul 09 '18 at 16:36
  • 2
    `cat` is working fine, given the unquoted expansion of `$combo`. The main problem is that your `find` command isn't always finding at least one file. You probably just want something like `find . -maxdepth 2 -type f -name "combo.txt" -exec cat {} + > bigcombo.tsv`. – chepner Jul 09 '18 at 17:58
  • 1
    Also, why use back-ticks and the "new" form of command substitution, i.e. `directory=$(dirname "${psi}")`? The `directory=$(...)` form is the prefered form, so join the 90's and stop using back-ticks for command substitution ;-) i.e. `psi=\`find $D -maxdepth 1 -type f -name "*.psi_filtered"\`` should be `psi=$(find $D -maxdepth 1 -type f -name "*.psi_filtered")` . Good luck. – shellter Jul 09 '18 at 18:44

1 Answers1

3

The obvious problem is that you are attempting to cat a file which doesn't exist.

Secondary problems are related to efficiency and correctness. Running two nested loops is best avoided, though splitting the action into two steps is merely inelegant here; the inner loop will only execute once, at most. Capturing command results into variables is a common beginner antipattern; a variable which is only used once can often be avoided, and avoids littering the shell's memory with cruft (and coincidentally solves the multiple problems with missing quoting - a variable which contains a file or directory name should basically always be interpolated in double quotes). Redirection is better performed outside any containing loop;

rm file
while something; do
    another thing >>file
done

will open, seek to the end of the file, write, and close the file as many times as the loop runs, whereas

while something; do
    another thing
done >file

only performs the open, seek, and close actions once, and avoids having to clear the file before starting the loop. Though your script can be refactored to not have any loops at all;

find ./*/ -type f -name "*.psi_filtered" -execdir cat selectedcombo.txt \;> bigcombo.tsv

Depending on your problem, it might be an error for there to be directories which contain combo.txt but which do not contain any *.psi_filtered files. Perhaps you want to locate and examine these directories.

tripleee
  • 175,061
  • 34
  • 275
  • 318
  • I will try your answer later. But first will this part `find ./*/` not make script static. I want to keep it dynamic. Secondly, how it works because traditional programmer will assign the output of find to an array and then loop through the array and cat each file. However here it seems that find will get all files and then run cat although it is working correctly. – Dr. Mian Jul 10 '18 at 07:57
  • 1
    I don't understand this comment. I imagine you wanted to search subdrectories of the current directory, but if you want to include the current directory, too, obviously revert back to `find .` etc etc. This doesn't make this more or less "dynamic", it just runs with slightly different parameters. – tripleee Jul 10 '18 at 11:03
  • 1
    A "traditional" programmer would generally not put static contents in a variable which is only used once, just like I explain in the answer. Generally a pipeline is more efficient and more idiomatic than needlessly capturing results into variables. Properly "traditional" shell script doesn't have array variables at all (they are a `ksh`/Bash extension). – tripleee Jul 10 '18 at 11:04
  • 1
    `find` processes its predicates in order until one of them is false or it runs out of predicates. The `-execdir` predicate switches to the directory which `find` is processing and executes a command (and examines its exit status to determine whether the predicate was successful). The command `cat selectedcombo.txt` copies the contents of the named file to standard output, which the shell has redirected to a file before `find` started. – tripleee Jul 10 '18 at 11:10
  • 1
    If you are coming from a lower-level language like asm or C, try to avoid dividing your problem into smaller parts than necessary. A good shell programming guideline is to try to minimize the number of external processes. If you have experience with mathematical tools with support for vector operations, the shift in thinking is vaguely similar to how you have to learn to take advantage of vectorized operations whenever you can. – tripleee Jul 10 '18 at 11:20