I need to count the lines in a single text file under two different conditions. The first condition is that the value in the third column does not exceed 10, which I can do with the following script:

awk '$3<=10' DATA_File | wc -l

The second condition is simply the total number of lines in the same file, which I can get with:

awk 'END { print FNR}' DATA_File

or

awk '$3' DATA_File | wc -l

However, I don't know how to merge these two commands into a single one so that the result is saved in a separate file as one line, separated by either a tab or a space, consisting of: the number of lines matching the first condition, the total number of lines, and their ratio or percentage.

For instance, the file is:

wer fre 11
grt o34 5
45f 123 45

The output I need is:

2 3 0.66/ or 66%

I could write a small Python script to do this, but for a number of reasons bash would be much more convenient.

fedorqui
  • 275,237
  • 103
  • 548
  • 598
Kredo
  • 23
  • 5
  • See, I [edit]ed your question to format it a little. Try to use the edit buttons to improve the way your question looks. It is then easier to understand what you want and how you want it to look! – fedorqui Jun 22 '15 at 08:40
  • Note your desired output says `2 3`, whereas it should be `1 3` (just 1 row has $3<=10) – fedorqui Jun 22 '15 at 08:44
  • Yep, you are exactly right. It should be 1 3. The version with printf is a bit better because it shows 0 in case there are no values below 10. Thanks a lot for helping out with this stuff!!! – Kredo Jun 23 '15 at 03:46

1 Answer

You can for example say:

$ awk '$3<=10 {min10++} END {print min10, FNR, (FNR?min10/FNR:0)}' file
1 3 0.333333

Or redirect the output to a file from within awk, as in print ... > "new_file".
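As a minimal sketch of the file-output variant: the block below recreates the three-line sample from the question and writes the summary line to a file, once by redirecting inside awk and once by letting the shell redirect. The names `file`, `new_file` and `new_file_shell` are placeholders chosen for this illustration. The `min10+0` is an addition to the one-liner above, so that a count of zero prints as `0` rather than an empty string.

```shell
# Recreate the sample data from the question (file name is a placeholder).
printf '%s\n' 'wer fre 11' 'grt o34 5' '45f 123 45' > file

# Option 1: redirect inside awk; the target file name must be a quoted string.
awk '$3<=10 {min10++} END {print min10+0, FNR, (FNR ? min10/FNR : 0) > "new_file"}' file

# Option 2: let the shell handle the redirection instead.
awk '$3<=10 {min10++} END {print min10+0, FNR, (FNR ? min10/FNR : 0)}' file > new_file_shell

cat new_file
# prints: 1 3 0.333333
```

Both options produce the same line; the shell redirection is usually easier to combine with other commands, while the in-awk form is handy when one program writes to several files.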

You can also use printf to provide a better format:

$ awk '$3<=10 {min10++} END {printf "%d %d %.2f%\n", min10, FNR, (FNR?min10/FNR:0)}' file
1 3 0.33%

The (FNR?min10/FNR:0) trick is courtesy of Ed Morton and prevents a division by zero when the input file is empty.
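To see what the guard buys you, this sketch runs the same program on an empty file (`empty_file` is a placeholder name created just for the illustration). Without the ternary, awk implementations typically abort with a division-by-zero error; with it, the ratio simply falls back to 0.

```shell
# Create an empty input file (placeholder name).
: > empty_file

# FNR is 0 here, so the ternary yields 0 instead of evaluating min10/FNR.
result=$(awk '$3<=10 {min10++} END {printf "%d %d %.2f\n", min10, FNR, (FNR ? min10/FNR : 0)}' empty_file)
echo "$result"
# prints: 0 0 0.00
```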

Community
  • 1
  • 1
fedorqui
  • 1
    Could also do awk `{x+=$3<=10}` – 123 Jun 22 '15 at 08:45
  • 2
    Good one! Looks a bit too hacky to my old eyes, I prefer something more readable :D – fedorqui Jun 22 '15 at 08:46
  • 1
    Fair enough :) I was going to post pretty much the exact same answer but you beat me to it. That was the only difference, so I thought I would just add it as a comment :) – 123 Jun 22 '15 at 08:48
  • You might want to make that last part `(NR?min10/NR:0)` or similar to avoid a core dump on an empty file. I'm a little surprised your printf formatting string works as desired since the way to get a literal `%` sign is to double it (`%.2f%%`), otherwise a single `%` is a printf formatting character. Must be a special case...? – Ed Morton Jun 22 '15 at 16:04
  • @EdMorton ah, regarding the `%`, this works fine to me, for example: `awk 'BEGIN {printf "he%llo"}'`, but not with `awk 'BEGIN {printf "he%slo"}'`. So I guess it works if, and only if, the following character is not a [Control letter](http://www.gnu.org/software/gawk/manual/gawk.html#Control-Letters) – fedorqui Jun 22 '15 at 16:09
  • I just tried it on a couple of different awks and it seems to work just fine when followed by `\n` in all of them but with other chars I get varying results. For example `nawk 'BEGIN{printf "%s%k\n", 3}'` prints `3%kk` - no idea why the k (or any other non-formatting letter but not punctuation marks) is getting doubled! – Ed Morton Jun 22 '15 at 16:13
  • @EdMorton in GNU awk I get `3%k` ! – fedorqui Jun 22 '15 at 16:16
  • 1
    Yeah it seems like the behavior is awk-version-dependent. With your example and nawk I get `nawk: not enough args in printf(he%llo) or sprintf(he%llo)` and /usr/xpg4/bin/awk `he/usr/xpg4/bin/awk: line 0 (): insufficient arguments to printf or sprintf` and gawk --posix `gawk: cmd. line:1: fatal: 'l' is not permitted in POSIX awk formats` but gawk without `--posix` functions fine. – Ed Morton Jun 22 '15 at 16:18
  • 1
    @EdMorton so it looks like a POSIX-compliant `printf` has to use `%%` to print `%`. That is, anything that wants to be counted as POSIX must disallow it. Interesting! – fedorqui Jun 23 '15 at 08:29
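Pulling the comment thread together, here is a sketch of a more portable variant. It writes the literal percent sign as `%%`, which behaves the same across the awk versions discussed above, and it multiplies the ratio by 100 so that the number in front of the sign really is a percentage. It also uses `NR` as suggested in the comments and the `n += ($3 <= 10)` idiom from the first comment. The file name `file` and the variable names `n` and `summary` are placeholders for this illustration.

```shell
# Recreate the sample data from the question (file name is a placeholder).
printf '%s\n' 'wer fre 11' 'grt o34 5' '45f 123 45' > file

# ($3 <= 10) evaluates to 1 or 0, so n accumulates the number of matching lines.
# "%%" prints a single literal percent sign.
summary=$(awk '{n += ($3 <= 10)}
  END {printf "%d %d %.2f %.0f%%\n", n, NR, (NR ? n/NR : 0), (NR ? 100*n/NR : 0)}' file)
echo "$summary"
# prints: 1 3 0.33 33%
```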