1

I'm looking for a regular expression that I can use to scan for HTTP errors in my standard apache log files.

I'm interested in matching all lines that don't have an HTTP 200 or HTTP 204 return status.

I can match the lines that do contain either an HTTP 204 or an HTTP 200 return code:

grep 'HTTP[^"]*" 204 \| HTTP[^"]*" 200'

But I would like to have the inverse. I'm also sure the expression above can be optimized.

I need to feed the regular expression to an external program, so using grep -v to invert the match is not an option.

ddewaele

2 Answers

1

The -v switch gives you all the lines that don't match, so:

egrep -v 'HTTP[^"]*" (200|204)'
Andrew Schulman
1

Ordinary regular expressions don't include a way to negate anything except a single character, so I think you'll have to provide the whole list of codes you do want:

HTTP[^"]*" (1|20[12356]|3|4|5)
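
As a rough check of the enumerated pattern, here is a minimal sketch that runs it through Python's `re` module. The sample log lines and the `error_lines` helper are made up for illustration, and Python stands in for whatever external program the expression is finally fed to. Note that the pattern only lists the 20x codes, so a less common 2xx status such as 226 would not be reported.

```python
import re

# The enumerated pattern from above; it uses no lookahead, so it should
# also be usable in engines that only support extended regular expressions.
NOT_200_204 = re.compile(r'HTTP[^"]*" (1|20[12356]|3|4|5)')

# Hypothetical sample lines in Apache common log format.
SAMPLE_LINES = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '127.0.0.1 - - [10/Oct/2000:13:55:37 -0700] "DELETE /item/7 HTTP/1.1" 204 0',
    '127.0.0.1 - - [10/Oct/2000:13:55:38 -0700] "GET /missing HTTP/1.1" 404 209',
    '127.0.0.1 - - [10/Oct/2000:13:55:39 -0700] "GET /old HTTP/1.1" 301 0',
    '127.0.0.1 - - [10/Oct/2000:13:55:40 -0700] "POST /form HTTP/1.1" 500 612',
    '127.0.0.1 - - [10/Oct/2000:13:55:41 -0700] "GET /video HTTP/1.1" 206 1024',
]


def error_lines(lines):
    """Return the lines whose status is matched by the enumerated pattern."""
    return [line for line in lines if NOT_200_204.search(line)]


if __name__ == "__main__":
    for line in error_lines(SAMPLE_LINES):
        print(line)
```

With these sample lines, the 200 and 204 entries are skipped and the 404, 301, 500 and 206 entries are reported.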

Perl-compatible REs do allow you to negate strings of text with a negative lookahead, so if you were using those you could use:

HTTP[^"]*" (?!(200|204))
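
To see the lookahead version in action, here is a minimal sketch using Python's `re` module, which supports the same `(?!...)` syntax as Perl-compatible engines. The sample log lines and the `non_success_lines` helper are hypothetical; whether the target program accepts lookaheads is an assumption that needs checking against its documentation.

```python
import re

# The lookahead pattern from above: after the closing quote and a space,
# the status must not start with 200 or 204.
NEGATED = re.compile(r'HTTP[^"]*" (?!(200|204))')

# Hypothetical sample lines in Apache common log format.
LOG_LINES = [
    '10.0.0.1 - - [01/Jan/2020:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Jan/2020:00:00:02 +0000] "PUT /doc HTTP/1.1" 204 0',
    '10.0.0.1 - - [01/Jan/2020:00:00:03 +0000] "GET /nope HTTP/1.1" 404 209',
    '10.0.0.1 - - [01/Jan/2020:00:00:04 +0000] "GET /delta HTTP/1.1" 226 88',
    '10.0.0.1 - - [01/Jan/2020:00:00:05 +0000] "GET /boom HTTP/1.1" 503 0',
]


def non_success_lines(lines):
    """Return the lines whose status is neither 200 nor 204."""
    return [line for line in lines if NEGATED.search(line)]


if __name__ == "__main__":
    for line in non_success_lines(LOG_LINES):
        print(line)
```

Unlike an enumerated list of codes, the lookahead reports any status other than 200 and 204, including the 226 line in this sample.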
Andrew Schulman