
I am currently using the following line to get the top 20 IPs sorted by number of requests:

grep 'GET /' /var/log/nginx/access.log | awk '{ print $1 }' | sort -n | uniq -c | sort -rn | head -20

Output:

575 66.249.*.*
570 66.249.*.*
534 207.46.*.*
511 157.55.*.*
493 66.249.*.*
435 207.46.*.*
383 66.249.*.*
378 157.55.*.*
368 66.249.*.*
336 66.249.*.*
334 188.165.*.*
332 174.24.*.*
292 54.209.*.*
251 66.249.*.*
241 66.249.*.*
234 66.249.*.*
226 66.249.*.*
225 89.145.*.*
221 89.145.*.*
209 66.249.*.*

I would like to look up each IP using "host".

Is it possible to accomplish this in one line?

Thanks

Bastien

2 Answers


Note that in general you should avoid superfluous DNS lookups: they can cause capacity problems on DNS servers, and they may often bypass caching on the client (so they can be much slower too). Use getent hosts to do the lookup.

I have an AWK example on my blog page http://distracted-it.blogspot.co.nz/2015/04/please-dont-use-dig-etc-in-reporting.html

Sample:

Nice and script-friendly. Let's see how to use that in AWK (or GAWK), with a simple example first. Let's start off with some input: perhaps lines with an IP address and some count. I've also included a threshold, just as a reminder that it's good to minimise the number of lookups.

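$ echo -e '1.1.1.1 2\n8.8.8.8 12\n8.8.4.4 25' \
  | awk '
    BEGIN {threshold = 5}
    # Only resolve addresses whose count exceeds the threshold.
    $2 > threshold {
      cmd = "getent hosts " $1;
      getent_hosts_str = "";          # reset in case the lookup returns nothing
      cmd | getline getent_hosts_str;
      close(cmd);                     # release the pipe before the next lookup
      split(getent_hosts_str, getent_hosts_arr, " ");
      print $1, getent_hosts_arr[2]
    }'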
8.8.8.8 google-public-dns-a.google.com
8.8.4.4 google-public-dns-b.google.com
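
As a rough sketch of how the getent approach might be applied to the access log from the question, the shell functions below print the count, the address, and the first name that getent hosts returns. The function names resolve_ip and top_clients are made up for illustration, and the sketch assumes the client address is the first field of each log line.

```shell
#!/bin/sh
# Hypothetical helper: print the first name getent reports for an address,
# or "-" when nothing resolves.
resolve_ip() {
    name=$(getent hosts "$1" | awk '{ print $2; exit }')
    printf '%s\n' "${name:--}"
}

# Hypothetical helper: read log lines on stdin and print
# "count ip hostname" for the N busiest clients (default 20).
top_clients() {
    limit=${1:-20}
    awk '/GET \// { print $1 }' \
        | sort | uniq -c | sort -rn | head -n "$limit" \
        | while read -r count ip; do
            # Only the top N addresses are resolved, which keeps lookups low.
            printf '%s %s %s\n' "$count" "$ip" "$(resolve_ip "$ip")"
        done
}
```

Assuming the log path from the question, it could be called as: top_clients 20 < /var/log/nginx/access.log
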
Cameron Kerr

You could do something like this:

awk '/GET / {print $1}' /var/log/nginx/access.log | sort -n | uniq -c | \
sort -rn | head -20 | awk '{print $2}' | while read -r row; do host "$row"; done

I added a line break to make it more readable here.

I removed grep because you can filter with awk directly.

This piece of code: awk '{print $2}' | while read -r row; do host "$row"; done will execute the host command on every row (IP address).

EDIT

This will keep the initial count and order:

awk '/GET / {print $1}' /var/log/nginx/access.log | sort -n | uniq -c | \
sort -rn | head -20 | while read -r row; do z=$( echo "$row" | awk '{print $2}' ); \
echo "$row $(host "$z")"; done

It's not exactly a pretty solution but it does work.
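
A possibly tidier variant, offered only as a sketch: read can split each line into count and address directly, and a small awk filter can keep just the name from the "domain name pointer" line that host prints. The helper names ptr_name and annotate_counts are hypothetical, and the sketch assumes host reports reverse lookups in that "domain name pointer" form.

```shell
#!/bin/sh
# Hypothetical helper: pull the PTR name out of `host` output on stdin.
# Input without a "domain name pointer" line (e.g. NXDOMAIN) yields "-".
ptr_name() {
    awk '/domain name pointer/ { sub(/\.$/, "", $NF); print $NF; found = 1; exit }
         END { if (!found) print "-" }'
}

# Hypothetical helper: read "count ip" lines (the output of
# uniq -c | sort -rn | head -20) and append the reverse-DNS name to each,
# keeping the original count and order.
annotate_counts() {
    while read -r count ip; do
        printf '%s %s %s\n' "$count" "$ip" "$(host "$ip" 2>/dev/null | ptr_name)"
    done
}
```

It would replace the while loop at the end of the pipeline: ... | sort -rn | head -20 | annotate_counts
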

krt
  • Thanks! It's working. However, is it possible to add the hostname at the end of the IPs and keep the initial count and order? – Bastien May 31 '15 at 17:04
  • @Bastien I'm glad to hear it works. I have tweaked the code a bit so it will keep the initial count and order. Try it out. – krt May 31 '15 at 17:36