-1

I need to extract specific information from my data and then summarize it.

I have 246 files that all need the same processing.

So I did:

 for f in *.vcf; do
     awk -F"\t" 'NR>1 {split($10,a,":");
               count10[a[7]]++}
         END  {for (i in count10)
                 if (i>0.25)
                   sum += count10[i];
               print sum }' "$f" > "${f}.txt"
 done

This gives me one new file per input file, containing the information extracted from it (some integers).

I then concatenate the new files with cat to produce one final big file.

Is there a simpler way to get the combined output without producing the individual per-file outputs first?

Jan Shamsani
  • Possible duplicate of [awk output first two columns then the minimum value out of the third and fourth columns](http://stackoverflow.com/questions/34780828/awk-output-first-two-columns-then-the-minimum-value-out-of-the-third-and-fourth) – peak Jan 14 '16 at 07:18
  • Sample input data and expected output would improve the quality of this question. – ghoti Jan 14 '16 at 15:14
  • Without sample input and expected output we'd just be guessing at the best way to do whatever it is you want. – Ed Morton Jan 14 '16 at 16:12

2 Answers

1

You could change the last line of your code to use `>>` instead of `>`, so that every iteration appends to a single FINAL.txt output file, as shown below:

for f in *.vcf; do
    awk -F"\t" 'NR>1 {split($10,a,":");
              count10[a[7]]++}
        END  {for (i in count10)
                if (i>0.25)
                  sum += count10[i];
              print sum }' "$f" >> FINAL.txt
done

Hope this helps..
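A variant of the same idea, as a minimal sketch: instead of appending inside the loop, the output of the whole loop can be redirected once, so a second run does not pile new results on top of old ones. The function name `count_above_threshold` is hypothetical, and the `i + 0` and `sum + 0` tweaks are assumptions about the intent (a numeric rather than string comparison, and 0 instead of a blank line when nothing passes the threshold); they are not part of the original code.

```shell
# Hypothetical helper: prints one total per .vcf file in the current directory.
count_above_threshold() {
    for f in *.vcf; do
        [ -e "$f" ] || continue    # skip the literal pattern if nothing matches
        awk -F"\t" 'NR > 1 { split($10, a, ":"); count10[a[7]]++ }
            END {
                for (i in count10)
                    if (i + 0 > 0.25)    # assumption: i + 0 forces a numeric comparison
                        sum += count10[i]
                print sum + 0            # assumption: print 0 rather than an empty line
            }' "$f"
    done
}
```

It would be called as `count_above_threshold > FINAL.txt`, which truncates FINAL.txt once and then collects every per-file total in it.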

vmachan
0

Quick and dirty:

rm -f Summary.txt
for f in *.vcf; do
    awk -F"\t" 'NR>1 {split($10,a,":");
              count10[a[7]]++}
        END  {for (i in count10)
                if (i>0.25)
                  sum += count10[i];
              print sum >> "Summary.txt" }' "$f"
done

If you explain a bit more, the shell-level for loop could probably be skipped entirely by handing all the files to a single awk invocation.
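For illustration, a rough sketch of what a single awk call over all the files might look like, assuming the goal is one number per file: the count of data lines whose 7th colon-separated subfield of column 10 is above 0.25. The names `summarize_vcfs` and `report` are hypothetical, and counting matching lines directly (instead of building `count10` first) is my simplification; it should give the same total, since summing the counts of all values above the threshold is the same as counting the lines that carry such a value.

```shell
# Hypothetical helper: one awk process reads every file given as an argument.
summarize_vcfs() {
    awk -F"\t" '
        # Print the total for the file just finished, then reset it.
        function report() { print sum + 0; sum = 0 }

        # First line of each file: flush the previous file (if any) and skip
        # the line, mirroring the NR>1 test in the per-file version.
        FNR == 1 { if (NR > 1) report(); next }

        # Data line: a[7] + 0 forces a numeric comparison.
        { split($10, a, ":"); if (a[7] + 0 > 0.25) sum++ }

        # Flush the last file, unless there was no input at all.
        END { if (NR > 0) report() }
    ' "$@"
}
```

It would be called as `summarize_vcfs *.vcf > Summary.txt`. One caveat with this sketch: a completely empty input file never triggers the `FNR == 1` rule, so it contributes no line to the output.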

NeronLeVelu
  • You need to quote the file to which you're redirecting (`"Summary.txt"`). – ghoti Jan 14 '16 at 15:13
  • And don't unnecessarily put output redirection inside an awk script; put it outside of it: `awk 'script' input >> output`. – Ed Morton Jan 14 '16 at 16:14