I have multiple files in multiple folders:

[tiagocastro@cascudo clean_reads]$ ls
11  13  14  16  17  18  3  4  5  6  8  9 

and I want to write a small bash script that concatenates the files inside each folder:

11]$ ls
FCC4UE9ACXX-HUMcqqTAAFRAAPEI-206_L6_1.fq  FCC4UE9ACXX-HUMcqqTAAFRAAPEI-206_L7_1.fq
FCC4UE9ACXX-HUMcqqTAAFRAAPEI-206_L6_2.fq  FCC4UE9ACXX-HUMcqqTAAFRAAPEI-206_L7_2.fq

But only L6 files with L6 files, and L7 files with L7 files.

Right now I am at a basic level. I want to learn how to do this more smartly, instead of just copying into a script the commands I would type in the terminal.

Thank you everybody, for helping me.

Tiago Bruno
  • Need clarification: do you mean that files will be named, `xxxxxx_1.fq` and `xxxxxx_2.fq` for example, and you want these combined into `xxxxxx.fq`? Do all the names strictly follow this pattern? – lurker Sep 05 '14 at 20:06
  • try `cat *_L6_* >newL6` and `cat *_L7_* >newL7` - or here is something more complicated? – clt60 Sep 05 '14 at 20:08
  • Yes, I could do this: `cat *_L6_* >newL6` and `cat *_L7_* >newL7`. But how do I do it once for all the files inside the folders? I can't get rid of the folders. – Tiago Bruno Sep 05 '14 at 20:11
  • I could run `cat *_L6_* >newL6` and `cat *_L7_* >newL7` in every folder, but I actually want to learn to write cleverer bash scripts. – Tiago Bruno Sep 05 '14 at 20:21
  • You want to, for each folder under `clean_reads`, concat all the `*L6*` files into one file and do the same thing for all the `*L7*` files? Using `cat` is the right idea (assuming you only have files named from 0 to 9, or use padded numbering, because glob is going to sort 10 and 11 after 1 but before 2 otherwise). Then you just have to add a loop over the entries in `clean_reads` around that `cat` command. – Etan Reisner Sep 05 '14 at 20:28

1 Answer

This isn't a free programming service, but you can learn something from the following:

#!/bin/bash
echo2() { echo "$@" >&2; }

get_Lnums() {
        # -maxdepth is a global option, so it goes before the tests (GNU find warns otherwise)
        find . -maxdepth 1 -type f -regextype posix-extended -iregex '.*_L[0-9]+_[0-9]+\.fq' -print | grep -oP '_\KL\d+' | sort -u
}

docat() {
        echo2 "doing $PWD"
        for lnum in $(get_Lnums)
        do
                echo cat *_${lnum}_*.fq "> new_${lnum}.all"   #remove (comment out) this line when satisfied
                #cat *_${lnum}_*.fq > new_${lnum}.all #and uncomment this
        done
}

while read -r -d $'\0' dir
do
        (cd "$dir" && docat)   #subshell - don't need cd back
done < <(find . -maxdepth 1 -mindepth 1 -type d -print0)   # -type d selects directories
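
The same idea can also be sketched without `find` and `grep`, using only globs and parameter expansion. This is a minimal sketch, not a drop-in tool: the function name `concat_lanes` and the output name `new_<lane>.all` are chosen for illustration, and it assumes every input file name ends in `_L<number>_<number>.fq`, as in the listing in the question.

```shell
#!/bin/bash
# Sketch: for each folder under a root directory, concatenate the .fq files
# of each lane (L6, L7, ...) into one file per lane.
# Assumptions: file names end in _L<number>_<number>.fq; the names
# concat_lanes and new_<lane>.all are illustrative only.
concat_lanes() (                                 # ( ) body: variables stay local
    root=${1:-.}
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue                # glob matched no folder
        (
            cd "$dir" || exit                    # subshell: no need to cd back
            # Print the lane tag of every matching file, then deduplicate.
            for file in *_L[0-9]*_[0-9]*.fq; do
                [ -e "$file" ] || continue       # glob matched no file
                lane=${file%_*}                  # drop the trailing "_1.fq"
                lane=${lane##*_}                 # keep only "L6"
                printf '%s\n' "$lane"
            done | sort -u |
            while read -r lane; do
                # Output does not end in .fq, so a rerun will not re-read it.
                cat -- *_"${lane}"_*.fq > "new_${lane}.all"
            done
        )
    done
)
```

Calling `concat_lanes /path/to/clean_reads` (or `concat_lanes .` from inside that directory) would leave a `new_L6.all` and a `new_L7.all` in each folder that contains such files. Within a lane the files are joined in glob order, so `_1.fq` comes before `_2.fq`.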
clt60