I would use something like this:
awk -F"[ :]" '
{tot[$5]++; if ($(NF-2)==200) succ[$5]++}
END {for (i in tot) printf "%d %d/%d %.2f%\n", i, succ[i], tot[i], succ[i]/tot[i]*100}' file
This defines :
or space as field separators. Hence, the hour is stored in the 5th field and the return code in the one before the penultimate (NF-2
) - we could also use the 12th as you do, but this allows support for longest input in the log.
Then, it keeps track of the amount of lines in the tot[]
array and the successful ones in succ[]
array.
Finally it prints the result
Here we have some more data with different hours and return codes:
$ cat a
10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:14:17 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 10
10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:14:26 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 9
10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:24:55 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 100 3014 8
10.1.20.123 "1.1.1.1" - [15/Oct/2014:17:24:55 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 8
10.1.20.123 "1.1.1.1" - [15/Oct/2014:17:24:55 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 404 3014 8
Let's run the script:
$ awk -F"[ :]" '{tot[$5]++; if ($(NF-2)==200) succ[$5]++} END {for (i in tot) printf "%d %d/%d %.2f%\n", i, succ[i], tot[i], succ[i]/tot[i]*100}' a
13 2/3 66.67%
17 1/2 50.00%