
I have a regex pattern that extracts some specific values from a log line.

I tried running the pattern in PHP, but `$matches` always comes back as NULL.

$re1='^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*?"(.*?/p/.*?,\d+,(\d+).*?)" "(\d+)" "(\d+)".*$';


preg_match($re1, $current_line, $matches);


var_dump($matches);

Sample value of the `$current_line` variable:

122.99.152.202 - naveen [22/Nov/2013:13:24:40 +1300] "GET /p/bhYg_TohdFLAxXoNBgIEbg,1385079896,119118112/12.txt HTTP/1.1" "302" "160" "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36" "-" 

What am I doing wrong?

Naveen Gamage
  • PHP requires delimiters for regexes: http://php.net/manual/en/regexp.reference.delimiters.php – Jonathan Kuhn Nov 22 '13 at 00:36
  • Nginx uses a compatible format with Apache combined format – mcuadros Nov 22 '13 at 00:50
  • @mcuadros I have a custom log format. – Naveen Gamage Nov 22 '13 at 00:57
  • @JonathanKuhn : I tried replacing `/p/` with `\/p\/` but still returns NULL – Naveen Gamage Nov 22 '13 at 00:58
  • A delimiter is the first and last character of the pattern; what you are talking about is escaping. The delimiter tells PHP where the regular expression starts and ends. Typically the forward slash (`/`) is used, but you can use any non-word character (other popular choices being `@` and `~`). The point of using a delimiter other than `/` is that you then don't need to escape the `/` characters inside the regex. **TLDR**, you can just put something like an `@` as the first and last character and it *should* work (I haven't tested the rest to see if it works). – Jonathan Kuhn Nov 22 '13 at 01:03
  • @JonathanKuhn Great explanation. I just had to add `{` and `}` symbols. thanks a lot Jonathan Kuhn. – Naveen Gamage Nov 22 '13 at 01:12

1 Answer

You have no delimiters set in place for your regular expression.

A delimiter can be any non-alphanumeric, non-backslash, non-whitespace character.
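
As a rough illustration of that rule, the sketch below runs one small pattern with three different delimiters. The shortened `$path` string and the variable names are made up for this example and are not from the question.

```php
<?php
// Hypothetical, shortened request path used only for this illustration.
$path = 'GET /p/abc,123,456/12.txt';

// "/" as the delimiter: every literal "/" in the pattern must be escaped.
preg_match('/\/p\/.*?,\d+,(\d+)/', $path, $m1);

// "~" as the delimiter: literal slashes need no escaping.
preg_match('~/p/.*?,\d+,(\d+)~', $path, $m2);

// Bracket-style delimiters: the opening "{" is closed by "}".
preg_match('{/p/.*?,\d+,(\d+)}', $path, $m3);

// All three variants capture the same value.
var_dump($m1[1], $m2[1], $m3[1]); // each is string(3) "456"
```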

Using a delimiter other than `/` means you don't have to escape every `/` in your pattern:

$re1 = '~^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*?"(.*?/p/.*?,\d+,(\d+).*?)" "(\d+)" "(\d+)".*$~';

See working demo.
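
For reference, here is a minimal sketch of running the delimited pattern against the sample line from the question. The return-value check and the `printf` output are additions for illustration; `$current_line` is simply assigned the sample value here.

```php
<?php
// Pattern from the answer, wrapped in "~" delimiters.
$re1 = '~^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*?"(.*?/p/.*?,\d+,(\d+).*?)" "(\d+)" "(\d+)".*$~';

// Sample log line copied from the question.
$current_line = '122.99.152.202 - naveen [22/Nov/2013:13:24:40 +1300] "GET /p/bhYg_TohdFLAxXoNBgIEbg,1385079896,119118112/12.txt HTTP/1.1" "302" "160" "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36" "-" ';

// preg_match() returns 1 on a match, 0 on no match, and false on error,
// so check the result before reading $matches.
$matched = preg_match($re1, $current_line, $matches);

if ($matched === 1) {
    printf("ip:      %s\n", $matches[1]); // 122.99.152.202
    printf("request: %s\n", $matches[2]); // GET /p/...119118112/12.txt HTTP/1.1
    printf("id:      %s\n", $matches[3]); // 119118112
    printf("status:  %s\n", $matches[4]); // 302
    printf("bytes:   %s\n", $matches[5]); // 160
} else {
    echo "No match or regex error\n";
}
```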

You can compact this a little if you want. Note that the shortened IP group makes each dot optional, so it is slightly looser than the original:

$re1 = '~^((?:\d{1,3}\.?){4}).*?"(.*?/p/.*?,\d+,(\d+).*?)" "(\d+)" "(\d+)".*$~i';
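
To show how the two IP sub-patterns differ, the sketch below tests each one in isolation. The anchored patterns and the `1234` test input are made up for this comparison and are not part of the answer.

```php
<?php
// IP portion of each pattern, anchored so it must match the whole input.
$strict  = '~^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$~';
$compact = '~^(?:\d{1,3}\.?){4}$~';

// Both accept a normal dotted address.
$strictIp  = preg_match($strict,  '122.99.152.202');
$compactIp = preg_match($compact, '122.99.152.202');

// Only the compact form accepts four digits with no dots,
// because "\.?" makes every dot optional.
$strictDigits  = preg_match($strict,  '1234');
$compactDigits = preg_match($compact, '1234');

var_dump($strictIp, $compactIp, $strictDigits, $compactDigits);
// int(1) int(1) int(0) int(1)
```

For log lines that always begin with a well-formed address this difference probably does not matter, but it is worth knowing if the pattern is reused for validation.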
Amal Murali
hwnd