0

I have a regexp pattern:

<^(([a-z]+)\:([0-9]+)\/?.*)$>

How do I avoid capturing the primary group?

<^(?:([a-z]+)\:([0-9]+)\/?.*)$>

The above pattern still puts the whole string 'localhost:8080' into the first (0) group. I need only the two matched groups, so that the first (0) group holds 'localhost' and the second (1) holds '8080'.

Where did I make a mistake?

TRiG
Aleksandr Makov
  • Why are you using the outermost brackets? – Matt Gibson Feb 10 '12 at 15:39
  • But, why not? They are well supported and my actual pattern isn't expecting any of those chars. – Aleksandr Makov Feb 10 '12 at 15:44
  • Anything inside brackets can end up in the matches array, which could have been the cause if you weren't using PHP. Here, however, it's just an artefact of how PHP handles matching: the first item of the matches array is always the whole match, as the other posters have pointed out. – Matt Gibson Feb 10 '12 at 15:59
  • You might consider accepting one of the answers as the correct answer if we were able to help you. – Kaii Feb 10 '12 at 16:06

5 Answers

3

The first group, 0, will always be the entire match.
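As a minimal sketch of what that means in PHP (assuming the pattern and sample string from the question, with the redundant outer group dropped; the variable names are only illustrative), the captures you want are at indexes 1 and 2:

```php
<?php
// Pattern from the question without the outer group; <...> are the delimiters.
$pattern = '<^([a-z]+)\:([0-9]+)\/?.*$>';
$subject = 'localhost:8080';

if (preg_match($pattern, $subject, $matches)) {
    // $matches[0] is always the full match; capture groups start at index 1.
    $host = $matches[1]; // 'localhost'
    $port = $matches[2]; // '8080'
}
```

Removing or un-capturing the outer parentheses changes how many groups there are, but index 0 stays reserved for the whole match either way.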

Bart Kiers
1

That's just the way the regex functions work. The first group is always the entire match. You can use array_shift to get rid of it.

http://www.php.net/manual/en/function.array-shift.php

kitti
1

In a regex, $0 always refers to the entire matched string, not to one of the groups. Capture groups always start at $1, so look at $1 and $2 instead of $0 and $1.

Hersha
1

If you are dealing with URLs, you can try PEAR's Net_URL or, probably better in this case, parse_url():

print_r(parse_url($url));
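For illustration, a short sketch of reading the host and port from the result. The URL below is an assumed example that includes a scheme; the bare 'localhost:8080' string from the question has none, and parse_url() may split such input differently, so check it against your real data.

```php
<?php
// Assumed example URL (not from the question), with an explicit scheme.
$url = 'http://localhost:8080/index.php';

$parts = parse_url($url);

$host = $parts['host']; // 'localhost'
$port = $parts['port']; // 8080, returned as an integer
```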

earthmeLon
1

From the docs:

matches

If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.

If you don't care about the full match, you can use array_shift() to remove the unwanted element.

array_shift($matches);
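
Put together, a sketch using the question's sample string (the pattern here omits the outer group, and $host and $port are only illustrative names) could look like this:

```php
<?php
preg_match('<^([a-z]+)\:([0-9]+)\/?.*$>', 'localhost:8080', $matches);

// Drop the full match at index 0; the remaining elements are reindexed from 0.
array_shift($matches);

// $matches is now ['localhost', '8080'].
list($host, $port) = $matches;
```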
Kaii