
How can I grep only lines from a huge (120GB) httpd error_log based on a time range, say:

from 2011-11-15 11:30 pm
to   2011-11-16 01:30 am

Thanks!

ohho
  • http://serverfault.com/questions/296555/shell-script-find-entries-in-access-log-with-500-response-within-a-specified-da – quanta Nov 16 '11 at 04:23
  • Why didn't you use logrotate? – quanta Nov 16 '11 at 04:24
  • It's already rotated. Too bad one of the apps generates tons of errors within a day and they needed to be extracted for examination ... – ohho Nov 16 '11 at 08:02

2 Answers


You'll probably have to do some drilling down; I'd start by narrowing to the date range:

grep -e "2011\-11\-[15-16] " error_log > filtered
grep -v -e "2011\-11\-15 [0-10]:" | grep -v -e "2011\-11\-15 11:[0-29]" > filtered
grep -v -e "2011\-11\-16 [2-23]:" | grep -v -e "2011\-11\-16 01:[31-59]" > filtered

cat filtered

The most efficient way I can think of (but haven't tried) is to find the start and end byte offsets of your date range and extract just that slice; apparently the offsets can be found with grep, but I don't know how to pull a range of bytes out of a file; it probably takes some awk skills.
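
For what it's worth, a byte range can be pulled out of a file with plain coreutils rather than awk. A minimal sketch, where start (a 0-based byte offset) and length stand in for the numbers found in the edit below:

 # print length bytes of error_log, beginning at byte offset start (0-based)
tail -c +$((start + 1)) error_log | head -c "$length" > filtered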

Edit: since this was an interesting question, I did some more digging:

You can get the first byte offset by doing:

 # Get first byte offset, leftmost number is the offset...
grep -m 1 -b "2011-11-15 11:3" error_log
 # Get last byte offset
grep -m 1 -b "2011-11-16 01:3" error_log

Subtract the first number from the last to get the byte length, then do:

dd if=error_log of=filtered bs=1 skip=<first number> count=<last_byte#-first_byte#>
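
Putting the edit together, a worked sketch might look like the following. This assumes a GNU dd recent enough to understand the skip_bytes/count_bytes flags, so the offsets can be given in bytes without dropping to bs=1; on an older dd, bs=1 works too but is slow on a 120GB file:

 # capture the two byte offsets from the grep -b output above
first=$(grep -m 1 -b "2011-11-15 11:3" error_log | cut -d: -f1)
last=$(grep -m 1 -b "2011-11-16 01:3" error_log | cut -d: -f1)
 # copy the bytes between the two offsets into filtered
dd if=error_log of=filtered bs=1M iflag=skip_bytes,count_bytes skip="$first" count=$((last - first))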
thinice

awk '$3>"11:30:00" && $3<"13:30:00"' log_file | less

where $3 is the third column of your logfile, which here is the timestamp; use whichever column number matches your own log format.
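
Note that the range in the question crosses midnight, so a single pair of time comparisons like the one above cannot express it. A hedged sketch that also checks a date column, assuming the date is in $2 and the time in $3 (adjust the column numbers and formats to your own log; 11:30 pm is 23:30 in a 24-hour log):

 # keep lines from 2011-11-15 23:30 through 2011-11-16 01:30 (string comparison works for zero-padded times)
awk '($2 == "2011-11-15" && $3 >= "23:30:00") || ($2 == "2011-11-16" && $3 <= "01:30:00")' error_log | less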

  • You should really read the older answers, including the one that was accepted, as it is much more thorough than the one you have provided. – Brent Pabst Oct 25 '12 at 16:25