I'm using awk to analyze some access log files. I'm currently using the following:

awk '($9 ~ /404/)' access_log | awk '{print $9,$7}' | sort | uniq -c | sort > 404.txt

This lists every 404 in my access log along with the number of times each one appears. However, it includes every requested path, and I'm only interested in HTML pages.

How can I modify this to only return values for requests that end in .html?

Vecta

1 Answer

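You can add a second regex match on the request path ($7) and combine it with the status check in a single awk call. Anchor the pattern with $ so that it only matches paths that end in .html:

awk '$7 ~ /\.html$/ && $9 ~ /404/ {print $9,$7}' access_log | sort | uniq -c | sort > 404.txt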
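
For reference, here is a self-contained sketch of this kind of filter that can be run without a real log. It assumes the common/combined log format, where $7 is the request path and $9 is the status code. The function name count_404_html and the sample log lines are made up for illustration. The pattern is anchored at the end of the path and, as an extra assumption, also accepts a trailing query string such as ?ref=1.

```shell
# Hypothetical helper: count 404 responses per path, keeping only paths
# that end in .html (optionally followed by a query string).
# Assumes common/combined log format: $7 = request path, $9 = status code.
count_404_html() {
  awk '$9 == 404 && $7 ~ /\.html([?].*)?$/ { print $9, $7 }' "$1" |
    sort | uniq -c | sort -n
}

# Made-up sample lines, for illustration only.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /missing.html HTTP/1.1" 404 512
192.0.2.2 - - [10/Oct/2023:13:55:37 +0000] "GET /missing.html HTTP/1.1" 404 512
192.0.2.3 - - [10/Oct/2023:13:55:38 +0000] "GET /old.html?ref=1 HTTP/1.1" 404 512
192.0.2.4 - - [10/Oct/2023:13:55:39 +0000] "GET /html-guide.png HTTP/1.1" 404 512
192.0.2.5 - - [10/Oct/2023:13:55:40 +0000] "GET /index.html HTTP/1.1" 200 1024
EOF

count_404_html "$sample_log"
rm -f "$sample_log"
```

On the sample data this should report /old.html?ref=1 once and /missing.html twice, while skipping /html-guide.png (the path contains "html" but does not end in .html) and /index.html (status 200). Drop the ([?].*)? part if requests with query strings should be excluded.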
Sylvain Firmery