AWK Threshold Greater Than

Question

I have text files in a folder that look something like this:

[13]pkt_size=140
[31]pkt_size=139
[49]pkt_size=139
[67]pkt_size=140
[85]pkt_size=139
[103]pkt_size=139
[121]pkt_size=140
[139]pkt_size=139
[157]pkt_size=139
[175]pkt_size=140
[193]pkt_size=139
[211]pkt_size=139
[229]pkt_size=3660
[253]pkt_size=140
[271]pkt_size=139
[289]pkt_size=139
[307]pkt_size=5164
[331]pkt_size=140
[349]pkt_size=139
[367]pkt_size=139
[385]pkt_size=7512

I want to set a threshold of 1000, have the script sum every 10 lines in the file, and print the sum if it is greater than the threshold.

I also want to run the script over the whole folder, and it must create a separate output file for each input file.

Which number is to be added? Which number is to be thresholded? The one in `[]` or the one after the `=`? — Mark Setchell, May 07 '15 at 17:07
Is it 'sum of lines 1..10; sum of lines 11..20; ...' or 'sum of lines 1..10; sum of lines 2..11; ...' (rolling sum or simple sum). Either way, it doesn't look very hard. What's supposed to happen if you have just 9 entries? What is the output supposed to look like when the threshold is exceeded? What have you tried? The 'script for folder' part sounds like a shell script running the awk script -- also very straight-forward. — Jonathan Leffler, May 07 '15 at 17:52
I have tried this so far: `for file in /path/to/files/*.mp4; do ffprobe -show_frames "$file" | grep "pkt_size" > "${file}.txt"; done` — Owl, May 07 '15 at 19:52

score 1 · Accepted Answer · answered May 07 '15 at 20:52

This script sums every 10 lines and prints the sum if it is over 1000:

$ cat sum.awk 
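BEGIN {
    FS = "="    # split each line at "=", so $2 is the packet size
}
{ acc += $2 }   # accumulate the size from every line
# On every 10th line: print the sum if it exceeds 1000, then reset it.
(NR % 10) == 0 { if (acc > 1000) { print acc } acc = 0 }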
$ awk -f sum.awk yourfile.txt 
1394
9938
$
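
Note that the script only checks the sum on every 10th line, so a final group of fewer than 10 lines is never reported (in the sample data, line 21 with 7512 is dropped). The question does not say what should happen in that case. A minimal sketch, assuming a partial last group should be treated like a full one, adds an END block; the sample data and the variable names here are only for illustration:

```shell
# Sample input: 10 small packets followed by one large leftover line.
sample=$(
    for i in 1 2 3 4 5 6 7 8 9 10; do echo "[$i]pkt_size=139"; done
    echo "[11]pkt_size=7512"
)

# Same logic as sum.awk, plus an END block for the last partial group.
result=$(printf '%s\n' "$sample" | awk -F= '
    { acc += $2 }                 # add the value after the "=" sign
    NR % 10 == 0 {                # a full group of 10 lines is complete
        if (acc > 1000) print acc
        acc = 0
    }
    # Assumption: leftover lines (fewer than 10) are checked the same way.
    END { if (NR % 10 != 0 && acc > 1000) print acc }
')

echo "$result"
```

With this input the first group sums to 1390 and the leftover line contributes 7512, so both values are printed.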

If you want the 1000 threshold to be a parameter, there are several ways to pass parameters to awk. For instance, you can use -v var=val on the command line, as described here: https://www.gnu.org/software/gawk/manual/gawk.html#Options
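
As a rough sketch of the -v approach, the threshold (and, as an extra assumption, the group size) can be set from the command line. The names `threshold`, `group` and `sum_groups` are hypothetical and not part of the original script; the sample file is generated only so the example can run on its own:

```shell
# Build a small sample file: 10 lines of 139, then 10 lines of 50.
tmpdir=$(mktemp -d)
for i in 1 2 3 4 5 6 7 8 9 10; do echo "[$i]pkt_size=139"; done >  "$tmpdir/sample.txt"
for i in 1 2 3 4 5 6 7 8 9 10; do echo "[$i]pkt_size=50";  done >> "$tmpdir/sample.txt"

# sum_groups THRESHOLD FILE: print each 10-line sum that exceeds THRESHOLD.
sum_groups() {
    awk -F= -v threshold="$1" -v group=10 '
        { acc += $2 }
        NR % group == 0 {
            if (acc > threshold) print acc
            acc = 0
        }
    ' "$2"
}

high=$(sum_groups 1000 "$tmpdir/sample.txt")   # only the first group exceeds 1000
low=$(sum_groups 400 "$tmpdir/sample.txt")     # both groups exceed 400
echo "$high"
echo "$low"
rm -r "$tmpdir"
```

The two groups sum to 1390 and 500, so a threshold of 1000 reports only the first, while 400 reports both.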

As for running the command on every file and producing an output file for each, xargs comes to the rescue. See this sample:

$ ls
sum.awk  yourfile.txt  zzzzzzz.txt
$ ls *.txt
yourfile.txt  zzzzzzz.txt
$ ls *.txt | xargs -L 1 -I {} /bin/bash -c 'awk -f sum.awk {} > {}.output'
$ ls
sum.awk  yourfile.txt  yourfile.txt.output  zzzzzzz.txt  zzzzzzz.txt.output
$

xargs runs the command once for each line of input. By default it tries to pack several lines into each execution; the -L 1 setting prevents that. (In GNU xargs, -I already implies -L 1, so the explicit setting is redundant.)

Next, the -I {} argument declares a placeholder string, {}, that is replaced by each input line (the filename).

Finally, /bin/bash -c '<what to execute>' runs the awk script on the file and redirects the output. A shell is needed here because the > redirection is a shell feature, not something xargs does itself.
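
An alternative that is not part of the original answer: a plain shell loop can do the same job without xargs, and it copes with spaces in file names as long as the variable is quoted. In this sketch the awk program is inlined rather than read from sum.awk, and the scratch directory and file names are made up, so that the example is self-contained:

```shell
# Work in a scratch directory with two sample files (names are illustrative).
workdir=$(mktemp -d)
for i in 1 2 3 4 5 6 7 8 9 10; do echo "[$i]pkt_size=139"; done > "$workdir/first file.txt"
for i in 1 2 3 4 5 6 7 8 9 10; do echo "[$i]pkt_size=50";  done > "$workdir/second.txt"

# One output file per input file; quoting "$f" keeps names with spaces intact.
for f in "$workdir"/*.txt; do
    awk -F= '
        { acc += $2 }
        NR % 10 == 0 { if (acc > 1000) print acc; acc = 0 }
    ' "$f" > "$f.output"
done

ls "$workdir"
# rm -r "$workdir"   # clean up the scratch directory when done
```

Each input file gets a matching .output file. A file whose sums never exceed the threshold still gets an output file, but it is empty.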

Hope it helps.

Hi Ramon, thanks for your script. It might be silly of me, but when I start typing what you wrote, beginning with "cat sum.awk", and press enter, I get an error of 'no such file or directory'. — Owl, May 07 '15 at 22:21
`cat` simply displays the contents of a file. I used it just to show that I put the script in a file and what the script looked like. You can create the file with your favorite text editor. — Ramón Gil Moreno, May 08 '15 at 08:04
