I am trying to get the number of reads in each of my fastq files, and I want the output to include the file names as well. I found a solution online that almost works, but I'm still not getting the right output. Example:

My file names:

12S_C-T1-045_F_filt.fastq.gz
12S_C-T1-PL_F_filt.fastq.gz
...

The code I have found:

for file in ./*.fastq.gz
do 
    file_name=$(basename -s .fastq $file)
    printf "$file_name\t$(cat ${file} | wc -l)/4|bc\n" >> no_reads_12S.txt
done

The output:

12S_C-T1-045_F_filt.fastq.gz 114/4|bc
12S_C-T1-PL_F_filt.fastq.gz 26455/4|bc
...

So it is clearly not doing the calculation, and the numbers are not even correct. How should I fix this? I've also tried this:

for file in ./*.fastq.gz
do 
    file_name=$(basename -s .fastq.gz $file)
    echo "$file_name"
    echo $(zcat $file | wc -l)/4|bc
done

This works, but it prints the file names and read counts on separate lines.

Thanks!

Rachel

1 Answer

Based on your second script, please try:

#!/bin/bash

for file in ./*.fastq.gz; do
    file_name=$(basename -s .fastq.gz "$file")
    printf "%s\t%d\n" "$file_name" "$(echo $(zcat "$file" | wc -l) / 4 | bc)"
done

Or as a one-liner:

for file in ./*.fastq.gz; do file_name=$(basename -s .fastq.gz "$file"); printf "%s\t%d\n" "$file_name" "$(echo $(zcat "$file" | wc -l) / 4 | bc)"; done

As the synopsis of printf is:

printf FORMAT [ARGUMENT]...

we need to pass strings as the arguments. The first argument, "$file_name", is straightforward. The second argument, "$(echo $(zcat "$file" | wc -l) / 4 | bc)", may require some explanation. First, the inner command substitution $(zcat "$file" | wc -l) is replaced with the line count produced by that pipeline. The outer command then looks like $(echo <number> / 4 | bc), which in turn is replaced with the result of bc and passed to printf.
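As a variation on the same idea, here is a minimal sketch that uses bash's built-in integer arithmetic instead of bc. The function name count_reads is a hypothetical helper introduced only for this illustration, and the sketch assumes each record occupies exactly four lines (no wrapped sequence lines). It uses gzip -dc in place of zcat and the two-argument form of basename, which should behave the same here. The format string '%s\t%d\n' consumes the remaining arguments in order: %s takes the file name and %d takes the read count.

```shell
#!/bin/bash

# Hypothetical helper: print the number of reads in one gzipped FASTQ file.
# Assumes four lines per record (no wrapped sequence lines).
count_reads() {
    local lines
    # Decompress to stdout and count the lines of the uncompressed text.
    lines=$(gzip -dc "$1" | wc -l)
    # Integer division by 4 with shell arithmetic; no bc needed.
    echo $(( lines / 4 ))
}

for file in ./*.fastq.gz; do
    # Skip the unexpanded pattern when no file matches.
    [ -e "$file" ] || continue
    file_name=$(basename "$file" .fastq.gz)
    # Name and count are passed as separate arguments to printf.
    printf '%s\t%d\n' "$file_name" "$(count_reads "$file")"
done
```

To collect the results in a file as in the question, the loop's output can be redirected once after done, for example with > no_reads_12S.txt.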

tshiono
  • thanks for pointing out the error in the file extension. that was a mistake on my part and i edited in the question above! – Rachel Apr 05 '22 at 01:57
  • also, it works! n_n thank you so much! so the first part, after printf is how it's displaying the 2 structures following up? i'm not familiar w/ printf. – Rachel Apr 05 '22 at 02:01
  • Thank you for the feedback. I've added small explanation about the usage of `printf` with command output. Hope it will help. – tshiono Apr 05 '22 at 02:23