How to count requests per hour?

Question

I have an access log file that contains data for a single day only, like:

10.2.21.120 "-" - [26/Jan/2013:19:15:11 +0000] "GET /server/ad?uid=abc&type=PST HTTP/1.1" 202 14 10
10.2.21.120 "-" - [26/Jan/2013:19:17:22 +0000] "GET /server/ad?uid=abc&type=PST HTTP/1.1" 204 14 9
10.2.22.130 "-" - [26/Jan/2013:19:27:53 +0000] "GET /server/ad?uid=abc&type=PST HTTP/1.1" 200 14 8

I am using the following command:

awk '$9 == 200 { s++ } END { print s / NR * 100; }' access.log

In your example date fields the time zone is always `+00`, is it possible to have different time zones or the logging used just UTC? — gboffi, Nov 21 '14 at 11:52

score 3 · Accepted Answer · edited Jun 20 '20 at 09:12

This awk may help you

$ awk -F'[: ]' '{count[$5]++}; $12 == 200 { hour[$5]++} END { for (i in hour) print i, hour[i]/count[i]*100 }' input

Test

$ cat input
10.1.20.123 "1.1.1.1" - [15/Oct/2014:12:14:17 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 10
10.1.20.123 "1.1.1.1" - [15/Oct/2014:12:14:17 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 100 3014 10
10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:14:26 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 9
10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:24:55 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 8
$ awk -F'[: ]' '{count[$5]++}; $12 == 200 { hour[$5]++} END { for (i in hour) print i, hour[i]/count[i]*100 }' input
12 50
13 100

What it does

{count[$5]++} The count array stores the number of occurrences of each hour in the log file.
$12 == 200 { hour[$5]++} If the request succeeded, that is $12 == 200, the corresponding entry in the hour array is incremented as well.

So count[13] holds the total number of entries for hour 13, whereas hour[13] holds the number of successful ones.
END { for (i in hour) print i, hour[i]/count[i]*100 } Prints each hour followed by its success percentage.
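
Because the END loop runs over hour, an hour without a single 200 response never shows up in the output. A minimal sketch of a variant that loops over count instead, so that such hours are reported as 0; the helper name hourly_success_rate and the sample lines are made up for illustration and are not part of the original answer:

```shell
# hourly_success_rate is a hypothetical helper name. It reads log lines
# from the files given as arguments, or from stdin when there are none.
hourly_success_rate() {
    awk -F'[: ]' '
        { count[$5]++ }               # every request, keyed by hour
        $12 == 200 { ok[$5]++ }       # only the 200 responses
        END {
            # Loop over count, not ok, so hours without any 200 are kept.
            # ok[h] is empty for such hours and evaluates to 0 here.
            for (h in count) printf "%s %.2f\n", h, 100 * ok[h] / count[h]
        }
    ' "$@"
}

# Made-up sample: hour 12 has one 200 out of two requests, hour 14 has none.
printf '%s\n' \
    '10.1.20.123 "1.1.1.1" - [15/Oct/2014:12:14:17 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 10' \
    '10.1.20.123 "1.1.1.1" - [15/Oct/2014:12:14:18 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 100 3014 10' \
    '10.1.20.123 "1.1.1.1" - [15/Oct/2014:14:01:02 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 404 3014 9' |
    hourly_success_rate
```

This prints one line per hour (12 with 50.00 and 14 with 0.00); the order of the lines is not guaranteed.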

answered Nov 21 '14 at 11:48 · nu11p01n73R

Almost the same answer within a few seconds, this time I was a bit faster :D – fedorqui Nov 21 '14 at 11:59
@fedorqui :D +1 for you. Should I remove the answer? – nu11p01n73R Nov 21 '14 at 12:01
They are pretty much alike, but yours has some different insights. So what *I* would do is to write a description of the logic you applied to it, because maybe it is more useful to the OP than mine. The more answers, the better to give knowledge to the site. +1 for you too! – fedorqui Nov 21 '14 at 12:03
@fedorqui will add a description to the answer :) – nu11p01n73R Nov 21 '14 at 12:06
Please, don't remove your answer) – Nov 21 '14 at 12:22
@SergeyI. Hehe I was about to remove the answer though :P Glad to hear that it helped you !!!! – nu11p01n73R Nov 21 '14 at 12:30
Any way to sort the output hours? @nu11p01n73R my output is https://i.imgur.com/FdLJKpQ.png unsorted. – akikara Jun 26 '20 at 09:22
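
On the sorting question: awk does not guarantee any particular order for a for (i in array) loop, so the hours can come out shuffled. One simple option, sketched here on the assumption that each output line starts with the hour as in the answer above, is to pipe the result through sort -n. The helper name sorted_hourly_rate and the sample lines are made up for illustration:

```shell
# Same awk program as in the answer, with the output piped through sort -n
# so that the lines come out ordered by hour.
sorted_hourly_rate() {    # hypothetical helper name
    awk -F'[: ]' '
        { count[$5]++ }
        $12 == 200 { hour[$5]++ }
        END { for (i in hour) print i, hour[i] / count[i] * 100 }
    ' "$@" | sort -n
}

# Made-up sample lines, deliberately not in hour order.
printf '%s\n' \
    '10.1.20.123 "1.1.1.1" - [15/Oct/2014:17:24:55 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 8' \
    '10.1.20.123 "1.1.1.1" - [15/Oct/2014:09:14:17 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 10' \
    '10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:14:26 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 9' \
    '10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:24:55 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 404 3014 8' |
    sorted_hourly_rate
# prints:
# 09 100
# 13 50
# 17 100
```

With GNU awk, setting PROCINFO["sorted_in"] = "@ind_num_asc" in a BEGIN block is another option, but that is a gawk extension and will not work in other awk implementations.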

fedorqui · Answer 2 · 2014-11-21T11:54:54.997

I would use something like this:

awk -F"[ :]" '
    {tot[$5]++; if ($(NF-2)==200) succ[$5]++}
    END {for (i in tot) printf "%d %d/%d %.2f%%\n", i, succ[i], tot[i], succ[i]/tot[i]*100}' file

This defines : or space as field separators. Hence, the hour is the 5th field and the return code is the third field from the end, $(NF-2). We could also use the 12th, as you do, but counting from the end keeps working when the earlier part of the line splits into more fields, for example with a longer request.

Then, it keeps track of the number of lines per hour in the tot[] array and of the successful ones in the succ[] array.

Finally, it prints the hour, the successful/total counts and the success percentage.
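
To see the field positions for yourself, here is a small sketch that prints NF, the 5th field and $(NF-2) for a single line. The helper name show_fields and the second line (whose URL contains a colon) are made up for illustration:

```shell
# One line in the log format used here, plus a hypothetical variant
# whose URL contains a colon and therefore splits into one more field.
line='10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:14:17 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 10'
line_with_colon='10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:14:17 +0000] "POST /server/ad?uid=abc&t=12:30 HTTP/1.1" 200 3014 10'

show_fields() {    # hypothetical helper name
    # Split on single spaces and colons, then report the interesting fields.
    awk -F'[ :]' '{ print "NF=" NF, "hour=" $5, "status=" $(NF-2) }'
}

printf '%s\n' "$line" | show_fields
# prints: NF=14 hour=13 status=200

printf '%s\n' "$line_with_colon" | show_fields
# prints: NF=15 hour=13 status=200
```

In the second line the status code has moved to the 13th field, so a fixed $12 would no longer find it, while $(NF-2) still does.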

Here we have some more data with different hours and return codes:

$ cat a
10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:14:17 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 10
10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:14:26 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 9
10.1.20.123 "1.1.1.1" - [15/Oct/2014:13:24:55 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 100 3014 8
10.1.20.123 "1.1.1.1" - [15/Oct/2014:17:24:55 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 200 3014 8
10.1.20.123 "1.1.1.1" - [15/Oct/2014:17:24:55 +0000] "POST /server/ad?uid=abc&type=PST HTTP/1.1" 404 3014 8

Let's run the script:

$ awk -F"[ :]" '{tot[$5]++; if ($(NF-2)==200) succ[$5]++} END {for (i in tot) printf "%d %d/%d %.2f%%\n", i, succ[i], tot[i], succ[i]/tot[i]*100}' a
13 2/3 66.67%
17 1/2 50.00%

score 0 · Answer 3 · answered Nov 21 '14 at 12:16

% cat success.awk
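# Use only lines that split into exactly 11 whitespace-separated fields.
NF==11{
    # $4 looks like "[15/Oct/2014:13:14:17", so a[2] is the hour.
    split($4,a,":") ; hour = a[2]
    total[hour] += 1
    # With whitespace splitting, $9 is the status code.
    if($9==200) success[hour] += 1
    }
END{
    # 0+success[h] forces a numeric 0 for hours without a successful request.
    for(h in total) print h, 100*(0+success[h])/total[h]"% ("0+success[h]"/"total[h]")"
    }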
% awk -f success.awk mylog
13 66.6667% (2/3)
17 50% (1/2)
%
