
Hi all, I am a bit new to grep and similar tools. I did read the man page and searched the internet before asking here. I have a server running behind a chain of reverse proxy servers. An application is giving me problems and I have to analyze the response headers. All of those servers are accessible over SSH, so I decided to read the Apache logs. My problem is that I want a script that can read the required part of the Apache logs and show me exactly the relevant entries. I have to do this in many places, so the job becomes difficult. Here is a sample of the log I have:

192.168.1.1 - - [18/Nov/2010:15:24:53 +0530] "GET /appl/pix/i/restore.gif HTTP/1.1" 304 187 "http://somedomain/appl/course/view.php?id=2" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; InfoPath.2)"
192.168.1.1 - - [18/Nov/2010:15:24:53 +0530] "GET /appl/pix/i/group.gif HTTP/1.1" 304 187 "http://somedomain/appl/course/view.php?id=2" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; InfoPath.2)"
192.168.1.1 - - [18/Nov/2010:15:24:53 +0530] "GET /appl/pix/i/return.gif HTTP/1.1" 304 188 "http://somedomain/appl/course/view.php?id=2" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; InfoPath.2)"
192.168.1.1 - - [18/Nov/2010:15:24:53 +0530] "GET /appl/pix/i/stats.gif HTTP/1.1" 304 187 "http://somedomain/appl/course/view.php?id=2" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; InfoPath.2)"
192.168.1.1 - - [18/Nov/2010:15:24:53 +0530] "GET /appl/pix/i/questions.gif HTTP/1.1" 304 187 "http://somedomain/appl/course/view.php?id=2" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; InfoPath.2)"
192.168.1.1 - - [18/Nov/2010:15:24:53 +0530] "GET /appl/pix/i/files.gif HTTP/1.1" 304 187 "http://somedomain/appl/course/view.php?id=2" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; InfoPath.2)"
192.168.1.1 - - [18/Nov/2010:15:24:53 +0530] "GET /appl/pix/i/user.gif HTTP/1.1" 304 187 "http://somedomain/appl/course/view.php?id=2" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; InfoPath.2)"
192.168.1.1 - - [18/Nov/2010:15:24:53 +0530] "GET /appl/pix/i/course.gif HTTP/1.1" 304 187 "http://somedomain/appl/course/view.php?id=2" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; InfoPath.2)"
192.168.1.1 - - [18/Nov/2010:15:24:53 +0530] "GET /appl/pix/t/groupn.gif HTTP/1.1" 304 187 "http://somedomain/appl/course/view.php?id=2" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; InfoPath.2)"

I want something that can pull the relevant lines from the log on the basis of date and time. How do I do a grep on that? Does anyone have a simple suggestion for this situation? The log files are pretty big, and I need only those responses that gave errors, selected by timestamp (which can be a range supplied by users).

Bond

3 Answers

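grep -F '[18/Nov/2010:15:24' apachelog

That should do it; just change the date/time as you please. The -F option makes grep treat the pattern as a fixed string, so the leading [ is not read as the start of an unterminated bracket expression.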

Arenstar

I'd recommend learning to punctuate your sentences in addition to learning how to use software tools.

I need to get only those responses based on time stamps (which can be ranges from users) which gave errors

But the log sample you've provided does not contain the HTTP status code - is this from the error log? If not then you need to change your log format to something more informative.

'grep' varies between implementations (you don't say what OS you are running this on) but it almost universally does not provide for implementation of a finite state machine - which you'll need if you can't express the time range as a single regular expression (which is how Arenstar's solution works).
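
To illustrate the single-regular-expression case: a range that stays within one day and falls on whole hours can still be written as one pattern. Below is a minimal sketch in Python. The names `PATTERN`, `in_hour_range` and `filter_lines`, and the chosen range (15:00:00 to 17:59:59 on 18 Nov 2010), are assumptions made for this example and do not come from the question.

```python
import re

# Matches timestamps from 15:00:00 to 17:59:59 on 18/Nov/2010.
# The character class in the hour field, 1[5-7], is what encodes the range.
PATTERN = re.compile(r"\[18/Nov/2010:1[5-7]:\d{2}:\d{2} ")


def in_hour_range(line):
    """Return True if the line carries a timestamp inside the range."""
    return PATTERN.search(line) is not None


def filter_lines(lines):
    """Yield only the lines whose timestamp falls inside the range."""
    for line in lines:
        if in_hour_range(line):
            yield line
```

The same pattern should also work with grep itself, for example `grep -E '\[18/Nov/2010:1[5-7]:' access.log` (the file name is a placeholder). A range that crosses midnight or starts and ends mid-hour quickly becomes awkward to express this way, which is where keeping state between lines comes in.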

If it were me, I'd use awk - something like:

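awk '
# Start printing at the first entry logged in the 15:00 hour on 18 Nov.
/\[18\/Nov\/2010:15/ { output = 1 }
# Stop at the first entry logged in the 02:00 hour on 19 Nov.
/\[19\/Nov\/2010:02/ {
    output = 0
    exit 0  # only if no further blocks are to be extracted
}
output { print }
' logfiles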

This would return all the log entries between 3pm on the 18th and 2am on the 19th, provided the log contains at least one entry in the 3pm hour to trigger the start.

behind chain of reverse proxy servers

erm, really?

symcbean
  • I liked this script, and the other replies you gave were informative. I do not know Python that well, but if this can work then this is great. – Bond Nov 20 '10 at 07:31

See the Python script I wrote. It prints log entries based on a range of times. It would need to be adapted to work with Apache logs, since it was written for syslog log files.
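
As a rough idea of what such an adaptation might involve (this is not the script mentioned above, only a minimal sketch), the following parses the timestamp on each line of Apache's common/combined log format and keeps the entries inside a time range whose status code is 400 or higher. The status code is taken to be the three-digit field after the quoted request (304 in the sample lines from the question). The names `LINE_RE`, `TIME_FORMAT`, `parse_line` and `errors_between`, and the choice of 400 as the error threshold, are assumptions made for this example.

```python
import re
from datetime import datetime

# Captures the bracketed timestamp and the status code that follows the
# quoted request. Requests containing escaped double quotes will not match
# this simple pattern and are skipped.
LINE_RE = re.compile(r'\[([^\]]+)\] "[^"]*" (\d{3}) ')

# %b is locale-dependent; month names such as "Nov" assume an English/C locale.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_line(line):
    """Return (timestamp, status) for a log line, or None if it cannot be parsed."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    try:
        stamp = datetime.strptime(match.group(1), TIME_FORMAT)
    except ValueError:
        return None
    return stamp, int(match.group(2))


def errors_between(lines, start, end, min_status=400):
    """Yield lines with start <= timestamp < end and status >= min_status.

    start and end must be timezone-aware datetimes, because the parsed
    timestamps carry the UTC offset from the log.
    """
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        stamp, status = parsed
        if start <= stamp < end and status >= min_status:
            yield line
```

A caller could build the bounds with `datetime.strptime("18/Nov/2010:15:00:00 +0530", TIME_FORMAT)` and pass an open log file as `lines`. Because timestamps are compared as dates rather than matched as text, the range does not depend on any particular entry being present in the log.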

Dennis Williamson